Plan B now carries out over 300,000 Disaster Recovery (DR) tests every year for our customers. Sounds like a lot? Well, that’s because it is. We carry out more DR tests than any other company because it’s not until you’ve tested your Disaster Recovery solution that you can be certain it is going to work. That’s why we run a full DR test for every customer every single day (and because we have a money-back guarantee on it working for our customers, so we can’t afford to be caught short!).
But here’s the clever bit. If you’ve ever run a DR test at your company, you’ll know that it takes a lot of effort and a lot of time. Running just one of these a year, which is what most companies do, can add up to quite a costly exercise. So how can we afford to run 365 a year and still keep our service affordable? The answer is automation. It’s an obvious answer, yet a very complex process to automate.
Firstly, in order to run a test on your DR solution, we need to perform an entire recovery. So, you’ve guessed it, we perform a full recovery of your IT systems every day. And, of course, to minimise costs, this process is also automated. We’ll try to explain how this works from the beginning.
Plan B’s technology captures snapshots of customers’ systems and instantly transfers them to our Virtual Recovery Platform. Once there, we turn them into complete, working architectures that do not require any manual intervention before they can be useful. Every machine must be fully functional in its new environment, as must every application contained within those systems. This means that they often require a number of changes or ‘overlays’ before they will work on the Recovery Platform. Take the domain controller, for example. Once we have a copy of the customer’s live domain controller, we may need to carry out some or all of the following actions before we are happy that it will work on the new platform:
- P2V overlays
- New network topology, new IP address ranges
- Appropriate Active Directory (AD) recovery mode selected
- AD replication links to non-local DCs disabled
- AD site modification to support DR site
- DNS record changes to correspond to new network addresses
- Host files and other static config files appropriately amended
There may be up to 15 separate changes required for this one machine, all of which Plan B automates. On top of that, different servers require different changes, so Plan B automates these for every server and applies them to every fresh copy we take of a customer’s server. This is the same process any other company would need to go through following an IT outage to recover their IT systems and get themselves back on track.
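To make the idea of per-server overlays a little more concrete, here is a minimal sketch in Python. It is not Plan B’s actual implementation: the `ServerCopy` model, the role names, the subnets and every function name are hypothetical, and only two of the overlays listed above (new IP ranges and matching DNS record changes) are shown. The sketch also assumes the live and DR subnets are the same size.

```python
from dataclasses import dataclass, field
from ipaddress import ip_address, ip_network
from typing import Callable, Dict, List


@dataclass
class ServerCopy:
    """Hypothetical in-memory model of one freshly copied server."""
    name: str
    role: str
    ip: str
    dns_records: Dict[str, str] = field(default_factory=dict)
    applied: List[str] = field(default_factory=list)


def remap_ip(address: str, live_net: str, dr_net: str) -> str:
    """Move an address to the same host offset in the DR subnet."""
    live, dr = ip_network(live_net), ip_network(dr_net)
    if live.prefixlen != dr.prefixlen:
        raise ValueError("this sketch assumes equally sized subnets")
    addr = ip_address(address)
    if addr not in live:
        return address  # outside the live range: leave it untouched
    offset = int(addr) - int(live.network_address)
    return str(dr.network_address + offset)


Overlay = Callable[[ServerCopy], None]


def make_network_overlay(live_net: str, dr_net: str) -> Overlay:
    """Overlay that gives the server its address on the recovery network."""
    def overlay(server: ServerCopy) -> None:
        server.ip = remap_ip(server.ip, live_net, dr_net)
        server.applied.append("network")
    return overlay


def make_dns_overlay(live_net: str, dr_net: str) -> Overlay:
    """Overlay that rewrites DNS records to match the new addresses."""
    def overlay(server: ServerCopy) -> None:
        server.dns_records = {
            host: remap_ip(addr, live_net, dr_net)
            for host, addr in server.dns_records.items()
        }
        server.applied.append("dns")
    return overlay


def apply_overlays(server: ServerCopy,
                   overlays_by_role: Dict[str, List[Overlay]]) -> ServerCopy:
    """Apply, in order, every overlay registered for the server's role."""
    for overlay in overlays_by_role.get(server.role, []):
        overlay(server)
    return server


# Example with made-up subnets: a domain controller gets both overlays.
LIVE, DR = "10.1.0.0/24", "172.16.50.0/24"
rules = {"domain_controller": [make_network_overlay(LIVE, DR),
                               make_dns_overlay(LIVE, DR)]}
dc = ServerCopy("dc01", "domain_controller", "10.1.0.5",
                {"dc01": "10.1.0.5", "files01": "10.1.0.20"})
apply_overlays(dc, rules)
print(dc.ip, dc.dns_records["files01"])  # 172.16.50.5 172.16.50.20
```

Keeping the overlays in a table keyed by server role is one way to express "different servers require different changes": a new copy of a server simply picks up whatever list is registered for its role.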
So, back to the testing. We now have all machines built, with overlays and changes made, on our Virtual Recovery Platform. In effect, the recovery part is done. But it needs to be tested before it can be handed out to employees to work from. So we test it: every machine, and every application. We run between 10 and 20 tests on every server every 24 hours to make sure every dependency between systems is mapped, and that end-user applications will function as required. For the domain controller, we test that it becomes ready, that the network configuration is correct, that DNS is functioning, that DNS entries for local hosts are correct, and that there are no unexpected errors in the event log. That’s a lot of tests. We call this whole process for each customer just one DR test, and if any part of it fails then we go back to the root cause of the failure and fix it. Now that’s clever (well, we think it is). It enables us to offer a money-back guarantee on your recovery working within a set number of minutes.
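For readers who like to see the shape of such a test run, here is a small, hedged sketch in Python of the domain controller checks described above. Everything in it is an assumption made for illustration: the `RecoveredServer` fields, the check names and the `run_dr_test` function are hypothetical, and a real check would probe the running machine instead of reading values from a record.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List, Tuple


@dataclass
class RecoveredServer:
    """Hypothetical record of what was observed on one recovered server."""
    name: str
    ready: bool
    ip: str
    expected_ip: str
    dns_records: Dict[str, str]
    expected_dns: Dict[str, str]
    event_log_errors: List[str]


Check = Tuple[str, Callable[[RecoveredServer], bool]]

# One entry per check named in the text for the domain controller.
DOMAIN_CONTROLLER_CHECKS: List[Check] = [
    ("becomes ready", lambda s: s.ready),
    ("network configuration", lambda s: s.ip == s.expected_ip),
    ("DNS functioning", lambda s: len(s.dns_records) > 0),
    ("DNS entries for local hosts",
     lambda s: all(s.dns_records.get(host) == addr
                   for host, addr in s.expected_dns.items())),
    ("no unexpected event log errors", lambda s: not s.event_log_errors),
]


def run_dr_test(servers: List[RecoveredServer],
                checks: List[Check]) -> List[str]:
    """Run every check on every server and return the list of failures.

    The whole run counts as one DR test and passes only if the list is empty.
    """
    failures = []
    for server in servers:
        for label, check in checks:
            if not check(server):
                failures.append(f"{server.name}: {label}")
    return failures
```

An empty result means the DR test passed. Anything else names the server and the check that failed, which gives a starting point for tracing the root cause.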
But you want to know the best part? We have expanded our testing to cover best-of-breed technologies like Zerto and Veeam. But we’ll talk about that next time. That’s enough for today.