Building fire takes down servers and restricts access to them for 48 hours
Building fires are fairly uncommon, but when one rips through your building the consequences for your business are so severe that the scenario needs to be well planned for.
Our customer, a global eCommerce company with offices worldwide, experienced such an event when their head office caught fire. At around 19:30 on April 15th the fire started in the electrical riser and soon took hold, destroying all electrical infrastructure, causing fire and smoke damage to four floors and destroying part of the building’s roof. It took the fire brigade 12 hours to get the fire fully under control, and they didn’t allow anybody inside the building for over 24 hours.
Our client knew their main offices would be damaged and without power, and therefore unusable for business purposes. They didn’t, however, know the full extent of the damage. Critically, from an IT point of view, they didn’t know whether their servers had survived and, if they had, what state they would be in. Without the Plan B service in place they would not have been able to make any progress towards a recovery until the fire brigade allowed access to see what the damage was, over 24 hours later.
With over 400 staff in their UK head office, our client’s initial challenges were about getting their IT services working again, organising alternative offices and organising staff to move to the new systems and facilities.
Disaster Recovery Solution
As part of their IT Disaster Recovery provisions, Plan B protects this customer’s core communications and business information systems with our certified recovery service. The systems protected include email, core file servers, CRM systems, customer help desk systems, ticketing information systems, HR and finance systems. They have a separate approach for protecting their internal software development systems based on rebuilding them from backups if necessary. The IT Disaster Recovery Plan forms part of their overall business continuity plan.
Our service guarantees that the customer’s IT systems will be available within twice the time it takes to boot their live system. This guarantee is possible because we recover their IT systems every day and test them to application level, certifying they will work within this timeframe. During a recovery, we simply need to turn on their recovery system.
Our client phoned Plan B’s emergency contact line at 21:45 that evening, correctly passed security checks, explained the situation and requested a full service invocation.
Within ten minutes of the first call from our client, Plan B engineers started the recovery process. Once started, the standby systems only had to boot up on one of Plan B’s remote rescue platforms. All systems were up and running within 35 minutes.
The Plan B system also automatically brings the rescue servers up in a correctly configured network with secure user access already set up and ready to go. For this client we also configured their Internet ‘DNS’ addresses to automatically repoint so both web systems and email addresses would quickly switch to the recovered servers.
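As an illustration only, the DNS repointing described above could be verified with a check along the following lines. This is a minimal sketch, not Plan B’s tooling: the hostnames, the recovery addresses (taken from the reserved documentation range 192.0.2.0/24) and the function names are all hypothetical.

```python
import socket
from typing import Callable, Dict, List


def resolve_ipv4(hostname: str) -> List[str]:
    """Return the IPv4 addresses the local resolver currently gives for hostname."""
    infos = socket.getaddrinfo(
        hostname, None, family=socket.AF_INET, type=socket.SOCK_STREAM
    )
    return sorted({info[4][0] for info in infos})


def repoint_status(
    expected: Dict[str, str],
    resolver: Callable[[str], List[str]] = resolve_ipv4,
) -> Dict[str, bool]:
    """Report, per hostname, whether it now resolves to its recovery address.

    `expected` maps each public hostname to the address of its recovered
    server (hypothetical values; the case study does not list them).
    """
    status = {}
    for hostname, recovery_ip in expected.items():
        try:
            status[hostname] = recovery_ip in resolver(hostname)
        except OSError:
            # Resolution failed, so the name has certainly not repointed yet.
            status[hostname] = False
    return status


# Example with a stand-in resolver, so the sketch runs without network access.
def example_resolver(hostname: str) -> List[str]:
    answers = {
        "www.example.com": ["192.0.2.10"],   # already repointed
        "mail.example.com": ["192.0.2.99"],  # still pointing elsewhere
    }
    if hostname not in answers:
        raise socket.gaierror("name not found")
    return answers[hostname]


example_status = repoint_status(
    {"www.example.com": "192.0.2.10", "mail.example.com": "192.0.2.20"},
    resolver=example_resolver,
)
```

Passing the resolver in as a parameter keeps the check testable offline; in real use the default `resolve_ipv4` would query whatever DNS resolver the machine is configured with, and cached records could delay the visible switch.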
So within about 45 minutes of our client deciding to invoke the Plan B service, their IT staff were able to log in to their recovered IT systems again. This provided instant access to core company information for staff and customers, and to the communications systems needed to start contacting both. Outbound emails were working immediately and inbound emails started arriving within the hour.
This process all worked exactly as expected and was uneventful. Our client only had to make one phone call to get the system recovery going and wait 35 minutes before being able to log in remotely from any internet connection. The simplicity of this approach brought clarity and order to what might otherwise have been the start of chaos, providing our client’s staff with the information and tools they needed to start handling the business recovery.
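The 45 minute figure follows from the times given above: a call at 21:45, up to ten minutes for engineers to start the recovery, and 35 minutes for the standby systems to boot. A small sketch of that arithmetic, assuming the year is 2013 (the case study gives only the day and month alongside a reference to 2013 financial targets) and using hypothetical variable names:

```python
from datetime import datetime, timedelta

# Times taken from the case study; the year 2013 is an assumption.
call_received = datetime(2013, 4, 15, 21, 45)
engineer_start_delay = timedelta(minutes=10)  # "within ten minutes of the first call"
boot_time = timedelta(minutes=35)             # "up and running within 35 minutes"

# Latest time the recovery would have started, and when systems were usable.
recovery_started_by = call_received + engineer_start_delay
systems_up_by = recovery_started_by + boot_time

# Total elapsed time from the invocation call to usable systems.
elapsed = systems_up_by - call_received

print(f"Recovery started by {recovery_started_by:%H:%M}")
print(f"Systems up by {systems_up_by:%H:%M}")
print(f"Elapsed: {int(elapsed.total_seconds() // 60)} minutes")
```

Both durations are upper bounds in the text, so the result is a worst-case estimate rather than a measured recovery time.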
Because the fire happened in the evening, our client’s IT staff were able to use the night to organise themselves and warn other staff that they would need to implement their elements of the business continuity plan. Staff contact details were all available on the recovered systems, so initial phone calls could be made to brief staff for the morning and emails could be sent with specific instructions.
An initial email was proactively sent out overnight to all customers telling them of the fire, reassuring them that the situation was being handled and providing them with contact details for more information.
Plan B staff were available throughout the night to support our client but very little was required while they handled the organisation of their business for the first day after the fire.
By 10am on Tuesday 16th April, around 12 hours after first calling Plan B, the client had 120 of their core staff logged into the recovery systems from home. This meant they could communicate with customers and carry on all activities other than software development. Externally, the business appeared to be operating normally in most respects.
Having got their core IT systems working again, our client’s next challenge was alternative office space. As they are part of a larger group, their business continuity plan dictated looking for space within group offices and, in parallel, looking for space via local serviced office providers. By the end of Tuesday they had managed to find room for about half their teams in other regional offices and had agreed with Plan B’s partner, Regus, a short-term rental of space in a local office for the remaining 200 staff. Plan B were able to provide secure network connectivity to both the Regus and the regional offices to get local access working again.
Resumption of remaining software development systems
The fire brigade eventually allowed our client access to their original office on the evening of Tuesday 16th April. It turned out that their IT servers were all contaminated with smoke but appeared to work and were not otherwise damaged. As the Plan B service had immediately provided a replacement for the core systems, solving the immediate IT problem, their IT team were now able to calmly evaluate their options and take strategic decisions about how best to get their other systems working and how to start planning a return to their local live service.
Level3 already provided the client with their core networking, so the obvious location for the servers was a Level3 facility. That was arranged by Thursday 18th, and Plan B provided a software copy of an enabling server to make the service work in the new location. For speed this was done by hand, and the image was exchanged on a USB hard drive at a junction on the M4 in a scene reminiscent of a spy movie.
As the core systems were being provided successfully from Plan B’s rescue platform, the client prioritised their software development servers and started lifting and shifting them to Level3 in Slough, with a plan to then run a rolling programme of having the servers professionally cleaned in batches.
The last key challenge was to securely network the Plan B recovery systems into the new servers in Level3 and to provide onward links to our client’s new Regus-provided office, the regional offices and their international offices. This was a reasonably tricky piece of networking, but because the core connection had been created immediately by the Plan B system on invocation, our client and Plan B staff were able to take time to think the requirements through and implement the correct solution. By Friday 19th our client started to get their software development servers back in action and, with the help of the weekend, had absolutely everything (85 servers) working by Monday 22nd April.
Longer term support
After the initial recovery, Plan B moved into a longer-term ongoing support role. To guard the client’s data we implemented a data backup service for the systems Plan B were providing, giving an onsite and an offsite copy. Next we re-implemented the DR service on another Plan B recovery platform, effectively giving the client back DR protection for the live systems being run by Plan B. So within a week our client had all services working fully and all protection back in place.
The recovery of the client’s systems and therefore the recovery of their business went so well that they were able to get their staff working again within a few hours, and had absolutely everything back and working fully within 7 days of a total facilities and systems loss. According to their Chief Operating Officer “the reaction from customers was amazingly positive” and several congratulated them on how professionally they handled the situation.
Extended migration off the recovery platform
One of the great advantages of the Plan B service is that it solves the initial IT problem immediately, enabling the customer’s staff to get back to work and allowing the IT staff to think strategically about rebuilding the services for the long term. They can therefore properly plan and implement the migration rather than having to make tactical decisions based on whatever is fastest. In our client’s case they decided to take the opportunity to upgrade and rebuild some of their systems as part of the planned migration of services back from the Plan B recovery platform. As a result the migration took nearly nine months, and they were back in their completely refurbished offices two months before the last email servers were moved back to their own systems.
How well did the recovery go?
We at Plan B consider the recovery of our client’s core servers to have gone exactly as we would hope. We believe the nature of the Plan B service enabled them to get their business back to work quickly, to make better decisions and to take faster actions to guard the long-term interests of the business. However, the real test is how their business fared through the incident.
The client stated that, despite the fire and all the ramifications that followed, they hit all their financial targets for 2013 and that, from a commercial point of view, the fire had no effect on their long-term performance.