The undisclosed reasons why IT downtime is still an issue

The insurance market is pretty savvy when it comes to risk mitigation. Most organisations have an executive-level decision maker in charge of strategic risk to ensure that the business is compliant and secure. Compliance and risk mitigation have been further driven by industry bodies since the collapse of Lehman Brothers in 2008 and the subsequent recession, so you’d think we’d be pretty good at it as an industry.

But what we’re mostly addressing here is financial risk – in simple terms, the balancing of lending and borrowing. There are other risks that can cause just as much damage to businesses and leave them incapable of operating. A big risk to the insurance market is the loss of its IT systems, and this is a risk that is rarely explored in full.

Of course, insurance companies have disaster recovery plans, and larger organisations have sophisticated technologies offering immediate failover and zero data loss. This is all great, BUT it can lure companies into a false sense of security, and it’s only when a disaster strikes that the damaging impact of IT downtime is truly felt. The common thread is that it always takes companies by surprise just how difficult the recovery is. IT availability is therefore still a big risk even to the larger insurance companies, and here are some of the reasons:

  1. Malware has become more sophisticated, and because replication copies a virus along with your data, both your production and DR systems are prone to failure. DR systems should contain a ‘last known good’ copy which has been fully tested to application level and can be relied upon should up-to-date replications be affected.
  2. If business-critical systems involve using a SaaS provider, it will be the provider, not you, that has control of the availability of your systems and your data. You are at the mercy of their DR solution and, at worst, you could lose everything should they be unable to perform business as usual due to IT failure or financial troubles. This article is really thought-provoking when assessing SaaS risk.
  3. We can’t emphasise enough the value of testing your DR solution. The thing is, it’s ONLY when you have tested your full DR solution to application level (i.e. booted your recovery systems, checked that the applications work and that your data is running on them) that you can be sure you have a working DR solution. For most companies this happens on 1 day of the year, and it is rarely driven or examined by the executive board. After your DR test, every change to your systems and every fresh bit of data is something that may not work in your recovery system when the day comes that you need it. Therefore, although Plan B sticks religiously to the best practice of daily testing to application level, this isn’t possible for everyone, so we recommend at least quarterly testing to application level. When was the last time you addressed this as a board? I bet for many it’s not within the past 3 months.
  4. DR providers often make big marketing claims, but can hide behind SLAs. What is the comeback if they don’t meet these SLAs? There is often none, and SLAs are regularly missed due to the complexity of the recovery process. Maybe it’s time to put your providers to the test…
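
For teams that want to keep an eye on the quarterly testing cadence from point 3, a check along these lines could flag systems that are overdue. It is only a minimal sketch: the 90-day window standing in for "quarterly", the function name `overdue_dr_tests`, and the system names and dates are all assumptions made up for illustration, not anything from a real environment.

```python
from datetime import date, timedelta

# Assumption: "quarterly" is approximated as a 90-day window.
MAX_TEST_AGE = timedelta(days=90)


def overdue_dr_tests(last_tested, today):
    """Return the names of systems whose last application-level DR test
    is older than MAX_TEST_AGE, or that have never been tested (None)."""
    overdue = []
    for system, tested_on in last_tested.items():
        if tested_on is None or today - tested_on > MAX_TEST_AGE:
            overdue.append(system)
    return sorted(overdue)


# Hypothetical inventory: system name -> date of last application-level test.
example_inventory = {
    "policy-admin": date(2024, 1, 15),
    "claims": date(2024, 5, 2),
    "saas-crm": None,  # never tested
}

print(overdue_dr_tests(example_inventory, today=date(2024, 6, 1)))
```

A report like this only tells you when a test last happened, not whether it went to application level or passed, so it complements a real DR test rather than replacing one.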