Midland Heart is a leading housing organisation, delivering affordable homes to rent or buy as well as services across the Midlands that enable people to live independently. Founded in 1925, it is a trusted not-for-profit organisation that owns and manages 33,000 homes for 70,000 customers.

Midland Heart has a complex architecture of over 100 virtual machines (VMs), and Zerto was the ideal replication software for its disaster recovery solution. On its own, however, it is no silver bullet. Getting the most from Zerto, and guaranteeing full protection for Midland Heart, requires deep understanding and experience of its strengths and limitations.

In theory, replication software is simple to manage, but in practice there are challenges to overcome – such as bandwidth limitations and the high rate of data change prevalent with every business. Because production systems are a constantly changing environment, green ticks on the Zerto management console don’t necessarily mean that you have a DR platform that you can rely on.

With these real-world constraints, Midland Heart engaged Plan B, a Zerto managed service specialist, to help ensure they get the best out of their disaster recovery investment and ultimately guarantee that it will meet all business objectives when the time comes to use it.

Three specific areas of Plan B’s experience have already helped to overcome challenges that would otherwise have rendered the replication software ineffective.

1.    BANDWIDTH

All continuous replication software requires a significant amount of dedicated bandwidth to keep the DR environment up to date. In our experience, this means around 8 Mbps per 1 TB of data protected. However, this is based on a ‘typical’ average rate of data change for servers, a profile that one of Midland Heart’s servers certainly did not fit. On one occasion, a single server’s rate of data change had a material impact on the rest of their DR servers, leaving replication permanently in a state of ‘bitmap sync’. As a result, none of the servers were being replicated, potentially leaving Midland Heart with no recovery position at all.
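As a rough illustration of the rule of thumb above, the following Python sketch estimates dedicated bandwidth from the amount of data protected. The function name and the 25 TB figure are hypothetical, chosen only for the example, and the 8 Mbps per 1 TB ratio holds only for a 'typical' rate of data change; a server with unusually high churn will need more.

```python
# Rule-of-thumb ratio quoted in the text: about 8 Mbps per 1 TB protected.
# This assumes a 'typical' average rate of data change.
MBPS_PER_TB = 8.0


def estimated_bandwidth_mbps(protected_tb: float,
                             mbps_per_tb: float = MBPS_PER_TB) -> float:
    """Estimate dedicated replication bandwidth (Mbps) for a protected data size (TB)."""
    if protected_tb < 0:
        raise ValueError("protected_tb must be non-negative")
    return protected_tb * mbps_per_tb


if __name__ == "__main__":
    # Hypothetical estate of 25 TB, used purely for illustration.
    print(f"{estimated_bandwidth_mbps(25):.0f} Mbps")
```

The result is a starting point for sizing only, not a guarantee that replication will keep up during bursts of change.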

Without sufficient bandwidth to clear this ‘bitmap sync’ quickly, replication behaves like a motor boat with a maximum speed of 5.1mph fighting a current of 5mph: progress is painfully slow and difficult to predict. If bandwidth cannot be increased, the RPO can grow to many days until the delta sync completes.
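The motor boat analogy can be put into numbers. The sketch below is a simplified model, not a description of how Zerto schedules a sync: it assumes a constant link speed and a constant rate of data change, and every figure in it is hypothetical. It shows why a small margin between the two makes catch-up time balloon.

```python
from typing import Optional


def catch_up_hours(backlog_gb: float,
                   link_mbps: float,
                   change_mbps: float) -> Optional[float]:
    """Hours needed to clear a replication backlog.

    Only the bandwidth left over after keeping pace with new changes
    (link_mbps - change_mbps) reduces the backlog. Returns None when the
    link cannot outpace the rate of change, so the backlog never clears.
    """
    net_mbps = link_mbps - change_mbps
    if net_mbps <= 0:
        return None
    backlog_megabits = backlog_gb * 8 * 1000  # GB to megabits (decimal units)
    return backlog_megabits / net_mbps / 3600  # seconds to hours


if __name__ == "__main__":
    # Hypothetical figures: a 450 GB backlog and 100 Mbps of ongoing change.
    for link in (100, 101, 200):
        hours = catch_up_hours(450, link, 100)
        status = "never catches up" if hours is None else f"{hours:.0f} hours"
        print(f"{link} Mbps link: {status}")
```

Under this model, the achievable catch-up time depends on the margin over the change rate rather than on the headline link speed, which is why temporarily re-allocating bandwidth can shorten recovery so sharply.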

Understanding this problem, and conscious that the Zerto journal could lose its ‘last known good’ position, Plan B temporarily increased the journal length, at no additional cost, to safeguard Midland Heart’s position and protect its ability to recover.

Plan B subsequently recommended changes to the dedicated bandwidth at Midland Heart and agreed temporary measures that could be taken to re-allocate increased bandwidth should a similar situation arise in the future. All machines have since had enough bandwidth to replicate fully.

2.    CHANGE CONTROL

Production systems are continually changing. Changes to dependencies, operating systems upgrades, new applications and resource allocations would all impact a DR environment, which is inextricably linked to the production system.

As part of our management service, Plan B discovered that some changes made in production by Midland Heart had a profound effect on the DR environment without displaying any errors on the Zerto console. Plan B’s enhanced testing identified errors that the Zerto console didn’t, as we test the DR system to the application layer. Plan B and Midland Heart worked together to resolve these errors immediately, and the DR solution was subsequently certified as working. A green tick from Zerto only means the data has been copied, not that the DR environment will provide the service. This is why it’s so important to test your DR platform to application level, as the Plan B Managed Service does on a daily basis.

In light of this, Midland Heart amended their change control process to consider whether changes to the production environment could have an impact on the DR environment and, if so, to discuss with Plan B the best way to update the DR environment. This will ensure that their DR capability remains in a state of readiness at all times.

Conversely, Zerto’s reverse replication facility means that the DR environment can accidentally change the production environment. Knowing they have experts familiar with the software has given Midland Heart peace of mind that this risk is minimised.

3.    MONITORING

Knowing that your priority 1 systems have the lowest RPO (data loss) is essential, but data changes and maintenance can have a big effect on the currency of those data sets. Plan B’s unique daily RPO monitoring service assesses a virtual protection group’s (VPG’s) current and worst-case RPO within the last 24 hours, identifying problem areas vulnerable to overnight batch jobs or data dumps. This daily assessment means that Plan B and Midland Heart work together to proactively identify problem areas created by the production environment and prioritise fixes before they impact recovery performance, keeping RPOs to a minimum.
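To make the idea of 'current and worst-case RPO within the last 24 hours' concrete, here is a minimal Python sketch. It is not Plan B's monitoring service and does not call any Zerto API: the RpoSample structure, the function name and the sample readings are all hypothetical, and it assumes RPO readings for a VPG have already been collected as timestamped values in seconds.

```python
from datetime import datetime, timedelta
from typing import NamedTuple, Optional, Sequence


class RpoSample(NamedTuple):
    """Hypothetical record: one RPO reading for a VPG at a point in time."""
    taken_at: datetime
    rpo_seconds: int


def summarise_rpo(samples: Sequence[RpoSample],
                  now: datetime,
                  window: timedelta = timedelta(hours=24)) -> Optional[dict]:
    """Return the current and worst-case RPO within the window ending at `now`."""
    recent = [s for s in samples if now - window <= s.taken_at <= now]
    if not recent:
        return None  # no readings in the window, so nothing can be reported
    latest = max(recent, key=lambda s: s.taken_at)
    worst = max(recent, key=lambda s: s.rpo_seconds)
    return {
        "current_rpo_seconds": latest.rpo_seconds,
        "worst_rpo_seconds": worst.rpo_seconds,
        "worst_at": worst.taken_at,
    }


if __name__ == "__main__":
    now = datetime(2024, 1, 2, 9, 0)
    # Illustrative readings: an overnight batch job pushes the RPO up at 01:00.
    samples = [
        RpoSample(now - timedelta(hours=30), 9000),  # outside the 24-hour window
        RpoSample(now - timedelta(hours=8), 3600),
        RpoSample(now - timedelta(hours=1), 12),
    ]
    print(summarise_rpo(samples, now))
```

Reporting the time of the worst reading alongside its value makes it easier to tie a spike to a specific batch job or data dump.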

WHY CHOOSE PLAN B?

Plan B offers extended functionality to Zerto’s continuous replication technology, including advanced daily testing of DR systems to application level and proactive daily monitoring by senior engineers. Free-of-charge support is available 24x7x365, and our recovery guarantees form the basis of our commitment that your DR will work every time.

In just a few months, Plan B has already added value to our business in managing our Zerto disaster recovery solution for us. Having invested in a DR solution to ensure that our IT systems are resilient, we expect it to work without complications. Plan B has already identified areas where intervention was required to ensure ongoing protection against downtime. Their testing service offers a unique depth of testing that cannot be matched, and the expertise of their engineers has meant that we have complete peace of mind that any problematic areas will be quickly addressed and rectified. We know that our IT systems have the highest level of resilience available because our DR is proactively managed by Plan B.

Chris Ratcliffe, IT Services Manager