Plan B highlights key questions to ask when buying disaster recovery for your business
Whilst mid-sized and enterprise organisations have plenty of experience of buying disaster recovery solutions, the SME market can find it a real challenge. Larger companies have access to research and procurement resources and are better versed in what to look for when it comes to functionality. CIOs and IT Directors of large organisations have access to a wide range of digital sources such as InnovationScouts.tech. Peer groups and membership bodies also help keep IT professionals abreast of emerging technology.
In contrast, it is harder for business owners and directors in the SME market to keep up with innovation. With a wider job remit, it is nigh on impossible to stay up to date with how to protect their data and IT systems and ensure continuity of service. Understandably, this is the end of the market more likely to suffer from business interruption.
Here Plan B highlights key questions to ask when buying disaster recovery for your business. These 3 crucial questions will make the difference between robust protection for your business and an accident waiting to happen. We give you the insights that we have spent years developing, so that you don’t have to endure days of downtime with a huge financial impact on your business. And if you still need help, just give us a call and we’ll happily support you to make sure you make the right decision.
How to dig behind the marketing
You’ll probably notice that most DR providers talk about 3 key features: speed of recovery (low RTO), data loss (low RPO) and uptime. But what do these really relate to?
- Speed of recovery (RTO) for some providers means the time from when you log the support call to the time a server is back up and running, BUT this can be without the applications working, data mounted, servers working as a unified system or users being able to work effectively on the systems. For other providers it means the time until users can log on and work as they did before the failure. There can be hours (even days) of difference in productivity between these two very different meanings of RTO, so be wary of the statement ‘instant recovery’ and what the ‘recovery’ element actually means to the provider. Our recovery time guide explains this fully. A good question to ask is whether recovery time means a return to full service.
- Data loss (RPO) can be described as ‘zero data loss’, meaning that if your systems fail, you will be able to restore every bit of data up to the point of failure. Most replication tools achieve this by starting another replication as soon as one has finished (commonly called continuous replication) BUT the length of each replication is highly dependent on how much data you have and the bandwidth available. If you don’t have dedicated bandwidth for this then a replication that is supposed to take 7 seconds may take 2 hours. This means you stand to lose up to 2 hours of data if you experience an outage. You’d be wise to explore the conditions under which RPO can be ‘near zero’ and see whether you meet them. It’s not unfair to ask a provider to look into your bandwidth requirements and advise accordingly.
- Uptime. This is generally used by hosting companies, who will talk about 99.99% uptime. Hosting companies often offer a DR solution which will replicate your data within the same environment BUT as we all know, even the big players – AWS and Azure – have experienced downtime, and it’s important to know how they will recover you should this happen. If they are experiencing a complete datacentre outage then they will need to recover every customer to a different datacentre. How will this work, who gets recovered first and how long will it take for your hosting to be relocated? We would ask about this process and whether there are any guarantees attached to recovery times.
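To put rough numbers on the RPO and uptime points above, here is a minimal Python sketch. The function names, the 900 MB of changed data and the two bandwidth figures are illustrative assumptions rather than measurements from any provider, and the calculation ignores protocol overhead and compression.

```python
def replication_seconds(changed_megabytes: float, bandwidth_mbps: float) -> float:
    """Estimate how long one replication cycle takes.

    changed_megabytes: data changed since the last cycle (assumed figure).
    bandwidth_mbps: bandwidth actually available to replication, in megabits per second.
    """
    if bandwidth_mbps <= 0:
        raise ValueError("bandwidth must be positive")
    # 1 megabyte = 8 megabits
    return changed_megabytes * 8 / bandwidth_mbps


def allowed_downtime_minutes_per_year(uptime_percent: float) -> float:
    """Convert an uptime percentage into the downtime it still permits per year."""
    minutes_per_year = 365 * 24 * 60
    return minutes_per_year * (1 - uptime_percent / 100)


# Hypothetical example: the same 900 MB of changes on two different links.
dedicated = replication_seconds(900, 1000)  # dedicated 1 Gbps link
contended = replication_seconds(900, 1)     # only 1 Mbps left on a shared link

print(f"Dedicated link: {dedicated:.1f} seconds per cycle")
print(f"Contended link: {contended / 3600:.1f} hours per cycle")

# 99.99% uptime still allows roughly 52.6 minutes of downtime a year.
print(f"99.99% uptime: {allowed_downtime_minutes_per_year(99.99):.1f} minutes per year")
```

The worst-case data loss is roughly the length of one replication cycle, which is why the bandwidth genuinely available to replication matters more than the headline ‘near zero’ figure.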
Understand what really goes on during the recovery process
Depending on the nature of the failure, DR providers will need to recover files, VMs or entire systems. Whilst most companies can pretty quickly recover files (from backup retention) and VMs (from a replication tool, where failover can be done at the press of a button), recovering entire system sets is trickier. A big outage can take down virtual and physical machines, and your IT systems will have many networking dependencies that must be satisfied for them to work together. This is where a good DR service will differ from an average one.
Your average provider will recover your servers following a failure and then set to work on a lengthy configuration exercise to make them all mount in the correct order. They will suggest recovery times (RTO) of roughly 4 hours, which will rarely be met in a complex recovery involving both physical and virtual machines. The better disaster recovery providers will understand these dependencies (as in the diagram below), automate the recovery process, run it daily and test it in advance of a failure. This means that at the time of an incident, your users can be back to work immediately without the time lapse of an average provider.
You should clarify with a provider how much of the work is done in preparation in advance of a failure, and what work needs to be carried out post failure. This will give you a better view of the recovery process.
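One way to picture the dependency work that can be done in advance is as an ordering problem: each server can only be started once the servers it relies on are running. The sketch below uses Python’s standard `graphlib` module (Python 3.9+) to work out a valid start-up order. The server names and their dependencies are hypothetical, and this is an illustration of the idea rather than a description of how any particular provider automates recovery.

```python
from graphlib import TopologicalSorter

# Hypothetical system: each server maps to the set of servers
# that must already be running before it can start.
dependencies = {
    "domain-controller": set(),
    "database": {"domain-controller"},
    "file-server": {"domain-controller"},
    "app-server": {"database", "file-server"},
    "remote-desktop": {"app-server"},
}


def recovery_order(deps: dict[str, set[str]]) -> list[str]:
    """Return a start-up order in which every server follows its dependencies.

    Raises graphlib.CycleError if the dependencies are circular.
    """
    return list(TopologicalSorter(deps).static_order())


for step, server in enumerate(recovery_order(dependencies), start=1):
    print(f"{step}. start {server}")
```

If this ordering is mapped and rehearsed before an incident, recovery becomes a matter of running a known sequence rather than discovering the dependencies under pressure.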
What does the testing include?
Again, thanks to marketing, most backup and DR providers cite ‘daily DR testing’ as a USP. But look harder at what they are really testing. Is it testing that the data has been replicated, or is it testing that the DR platform is operational, that the applications mount correctly and that the data is restored correctly with everything booting up in the correct order? Our guide to DR testing gives an insight into what your DR provider should be testing.
With disaster recovery, the success of recovery is all in the preparation. The more a provider does in advance to understand your dependencies, prepare the recovery and test, the better your chance of a fast recovery without data loss. If all of this is happening following a failure it can lead to a highly stressful and time-consuming experience which, in this day and age, is totally unnecessary.
If you’re buying disaster recovery then exploring these 3 areas will help reduce your risk of lengthy outages, and the subsequent impact to your business.
Plan B is a specialist Disaster Recovery provider with a European presence. If you’d like a free consultation with us to discuss your requirements or go over any of the above, please don’t hesitate to get in contact and we’d be happy to help.