IT disaster recovery solution isn't up to scratch
South Oxfordshire District Council has sadly become the victim of the latest IT disaster, following a fire which broke out at its offices on Thursday 15 January. Reports that a car carrying gas canisters crashed into the foyer of one of the buildings on site mean that the fire is being treated as suspicious.
We all think it won’t happen to us, and an IT disaster of this severity is admittedly extreme. However, IT downtime, even on a smaller scale, will almost certainly hit you at some point, so it’s best to be prepared. We have seen this first-hand, having worked with a private sector customer through a severe fire that led to business disruption. In that instance the client’s IT systems were back at 100% productivity within 40 minutes thanks to good planning and preparation. The council, seemingly without such a robust and effective IT disaster recovery plan, is still without some systems 4 days later, and expects the disruption to last another week.
Currently, as of 19 January, 4 days after the incident, the website has just gone live again, and the council has begun communicating when the public can expect to see services restored. Parts of the website, including online applications and reporting, are still not working and, more worryingly, neither are some of the telephone numbers. No planning applications are being considered, and only emergency enquiries are being responded to. This loss of IT systems has left the council severely hampered in its ability not only to operate, but also to communicate effectively. If this were a private business, the loss of sales and customers, together with the reputational damage, would have hit revenue hard by now. So, from our experience, what’s likely to be happening inside the council right now?
Staff must be re-sited to work from home, from a recovery site, or from alternative council offices. This takes time to arrange, with hardware and connectivity required for each member of staff, so although staff may have a place to work by now, they won’t yet have access to working systems, and productivity will be significantly hampered. The IT department will be busy building a working set of systems before restoring the most recent data from backup, starting with the most critical services (communications, website and email are likely to be prioritised). Next, the applications that enable operations to be carried out will be rebuilt: financial, workflow, CRM and ERP systems all need restoring and reconfiguring. The resilience of this new system then needs to be considered, and finally individuals need their devices configured to work on the new systems. All of this is highly stressful for everyone at the council, whether it’s the IT department (who will carry much of the responsibility for how quickly the IT systems are recovered), the HR department (who will see knock-on effects on the working environment), employees who are just trying to do their jobs, or the leadership team, who will be fending off the press and pressure from above whilst trying to lead their team through the recovery process.
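The recovery sequence described above can be written down as a simple prioritised runbook. What follows is only a minimal sketch in Python: the tier numbers and the names `RecoveryStep`, `RUNBOOK` and `recovery_order` are our own illustrative assumptions, not anything the council or a particular DR product uses.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class RecoveryStep:
    tier: int   # lower tier = restored earlier (illustrative numbering)
    name: str


# Steps listed in no particular order; the tier carries the priority.
RUNBOOK = [
    RecoveryStep(4, "Configure staff devices"),
    RecoveryStep(2, "Financial systems"),
    RecoveryStep(1, "Communications"),
    RecoveryStep(3, "Review resilience of rebuilt systems"),
    RecoveryStep(1, "Website"),
    RecoveryStep(2, "Workflow systems"),
    RecoveryStep(1, "Email"),
    RecoveryStep(2, "CRM"),
    RecoveryStep(2, "ERP"),
]


def recovery_order(steps):
    """Return step names sorted by tier; ties keep their listed order."""
    return [step.name for step in sorted(steps, key=lambda s: s.tier)]


if __name__ == "__main__":
    for position, name in enumerate(recovery_order(RUNBOOK), start=1):
        print(f"{position}. {name}")
```

Writing the order down in advance, in whatever form, means the IT team is not deciding priorities in the middle of an incident.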
For now, the council, responsible for rubbish collection, recycling, council tax collections, housing and planning applications, stands to encounter difficulties reconciling payments and fulfilling basic services, which will ultimately lead to a backlog of public complaints and enquiries – increasing the workload for the team for months after the incident. The longer the IT systems are down, and productivity is limited, the more severe the aftermath.
The consequences are severe. The council expects to take 11 days to resume services fully and, given the backlog that will build up in that time, the impact is likely to be felt for around 3 months, if not longer. This is far too long. Even with an in-house disaster recovery solution, adequate planning and testing should see critical services resumed within 24-48 hours, which suggests this type of incident was not properly planned for. We see the need for an assessment of the council’s disaster recovery plan, with more regular testing of the disaster recovery solution as the minimum outcome. By carrying out full exercises, the IT team will be much more confident when handling an IT disaster, and recovery will be much faster.
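To put the figures above side by side, here is a rough sketch in Python that measures a recovery time against a target. It takes the upper end of the 24-48 hour window as the target; the names `RTO_CRITICAL` and `rto_overrun` are hypothetical. Note that the 11-day figure is the council's estimate for full restoration, so comparing it with a critical-services target is indicative only.

```python
from datetime import timedelta

# Assumed target: upper end of the 24-48 hour window for critical services.
RTO_CRITICAL = timedelta(hours=48)


def rto_overrun(actual, target=RTO_CRITICAL):
    """Return how far a recovery exceeded its target, or zero if it met it."""
    return max(actual - target, timedelta(0))


council_estimate = timedelta(days=11)     # council's expected time to full service
client_recovery = timedelta(minutes=40)   # the private sector client mentioned earlier

if __name__ == "__main__":
    print("Council overrun:", rto_overrun(council_estimate))
    print("Client overrun:", rto_overrun(client_recovery))
```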
What are the opportunities?
With modern technology, the council could be back at 100% productivity for its critical services within an hour. Although many think this is budget-prohibitive, cloud solutions like Pre-recovery make it feasible. If budgets don’t allow, a simpler Veeam online backup solution, with independent DR experts working on the recovery, would still have made the return to service much faster. It may be that this incident prompts more than just South Oxfordshire District Council to rethink their disaster recovery solutions.
By Beth Baxter