Disaster recovery (DR) in the cloud sounds like an attractive proposition but it can also be a confusing one.
There are so many options: Which cloud providers should I use? Should I use a public cloud’s DR service or a third party who provides DR on the public cloud? Should I use DR as a service or manage my own?
How much is this going to cost me now and next year when my storage has grown 50% or more? How fast can I restore? How do I restore my applications from the cloud and not just my data? Should I go with a managed service or maintain more control?
Public clouds abound with their own DR options, as do DR vendors who use public clouds as an option for their customers. Placing DR in the cloud can allow smaller companies who never had a DR program to start one, and allow companies who use expensive remote sites to lower those heady DR expenses.
Before we talk about the uses of cloud-based DR, let’s be clear about what it is and what it isn’t.
· DR in the cloud is not backup and restore. DR includes backup but is not limited to it. DR is not simply restoring data; it’s restoring systems and applications as well as data. A number of vendors use “DR” as shorthand for backup, so know what you’re getting.
· DR in the cloud is not simple. Cloud-based DR is highly scalable and can save a lot of money over remote DR sites. It can also be complex with differing service levels, failover and failback, and testing. There are many points of decision within the DR process depending on several factors: RTO and RPO service levels, volume of protected applications and data, and budgeted costs. Managed DR or DR as a Service (DRaaS) is the simplest method but as always you get what you pay for. Be certain that the DR service is not simply backup and recovery.
· DR in the cloud is not just for the enterprise. In fact, SMBs and mid-sized companies may be more likely to benefit from cloud DR on their smaller scale. Enterprise DR usually encompasses many sites and may include regulatory requirements and critical applications that affect many thousands of people. For this reason, the enterprise has been slow to adopt cloud DR outside of pilot projects and smaller departments.
Disaster Recovery and Virtualization
Cloud-based DR uses the cloud to recover applications, not just data, in a fully operable state. Virtualization is the key to the DR process, as it is in on-premises virtualized environments replicating to DR sites. The hypervisor encapsulates the entire computing stack into a VM. With no hardware dependencies, server images are easily copied for DR purposes and spun up as needed. With dedupe and compression added on, an already fast process is even faster. (Note that you can use cloud-based DR for physical systems using P2V conversions.)
There are two things that you are primarily looking at for cloud-based DR: replication and application consistency. Replicating server images to the cloud is easy enough. It can also be expensive and time-consuming, and can cause significant latency. How much and how frequently you replicate depends on the protection priority of the applications whose DR is entrusted to the cloud.
A high-priority replication type is near real-time replication, which copies changes to the VM as they are written. This has high bandwidth and processing costs but may be justified for critical applications. Synchronous replication is another choice for applications that can absorb slower writes in exchange for no data loss, but it is rare with the public cloud due to geographical limitations. The most common method for cloud-based replication is scheduled snapshots, usually taken at intervals of anywhere from 15 minutes to 24 hours. With the replicated VMs on the cloud, IT can restore entire virtual machines on-premises, manually spin up the replicated VMs, or automate the process with cloud-based failover.
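As a rough illustration of how a target RPO maps onto these replication choices, here is a minimal Python sketch. The function name and the decision rules are assumptions made for this example, not any vendor's logic; only the 15-minute to 24-hour snapshot range comes from the paragraph above.

```python
from datetime import timedelta

# Snapshot range taken from the article; everything else here is hypothetical.
MIN_SNAPSHOT_INTERVAL = timedelta(minutes=15)
MAX_SNAPSHOT_INTERVAL = timedelta(hours=24)


def suggest_replication(rpo: timedelta) -> str:
    """Suggest a replication style for a target RPO (maximum tolerable data loss)."""
    if rpo < timedelta(0):
        raise ValueError("RPO cannot be negative")
    if rpo == timedelta(0):
        # Zero data loss implies synchronous replication, which is rare
        # in the public cloud because of distance.
        return "synchronous"
    if rpo < MIN_SNAPSHOT_INTERVAL:
        # Tighter than the shortest snapshot schedule: copy changes as written.
        return "near real-time"
    # A scheduled snapshot can lose up to one interval of changes,
    # so the interval must not exceed the RPO.
    interval = min(rpo, MAX_SNAPSHOT_INTERVAL)
    minutes = int(interval.total_seconds() // 60)
    return f"scheduled snapshots every {minutes} minutes"


if __name__ == "__main__":
    for target in (timedelta(0), timedelta(minutes=5), timedelta(hours=4)):
        print(target, "->", suggest_replication(target))
```

The point of the sketch is the relationship, not the thresholds: a snapshot schedule can never deliver an RPO shorter than its own interval, so tighter targets push you toward the more expensive replication types.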
Application consistency is the second key feature for cloud-based DR. Consistency between primary and replicated VMs is crucial when recreating the production environment, whether you are restoring replicated VMs on-site or using cloud-based failover and failback.
Figuring Out What You Need for Cloud DR
Before defining any DR services, identify and prioritize your applications. Even if you go with a managed DR service you will need to know what applications are best protected in the cloud and what level of service to pay for.
Assign both RTO and RPO to your applications and their data. Which application can wait 4 hours to be restored but cannot afford data loss? Which application can be rolled back to a 30-minute recovery point but needs an RTO under an hour? Which applications need to come up within the first 12 hours, and which can wait 36 hours or more?
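One way to make this exercise concrete is to record each application's RTO and RPO in a small inventory and sort it. The sketch below is a minimal example in Python; the application names, the `AppRequirement` type, and the two-wave split are hypothetical, while the 4-hour, 30-minute, 12-hour, and 36-hour figures echo the questions above.

```python
from dataclasses import dataclass
from datetime import timedelta


@dataclass(frozen=True)
class AppRequirement:
    name: str       # hypothetical application name
    rto: timedelta  # how long the application may stay down
    rpo: timedelta  # how much recent data may be lost


def recovery_order(apps: list[AppRequirement]) -> list[AppRequirement]:
    """Order applications by urgency: shortest RTO first, then shortest RPO."""
    return sorted(apps, key=lambda a: (a.rto, a.rpo))


def recovery_wave(app: AppRequirement,
                  first_wave: timedelta = timedelta(hours=12)) -> str:
    """Place an application in the first recovery wave or a later one."""
    return "first 12 hours" if app.rto <= first_wave else "later"


if __name__ == "__main__":
    inventory = [
        AppRequirement("billing", rto=timedelta(hours=4), rpo=timedelta(0)),
        AppRequirement("web-store", rto=timedelta(minutes=50),
                       rpo=timedelta(minutes=30)),
        AppRequirement("archive-search", rto=timedelta(hours=36),
                       rpo=timedelta(hours=24)),
    ]
    for app in recovery_order(inventory):
        print(f"{app.name}: RTO {app.rto}, RPO {app.rpo}, "
              f"wave: {recovery_wave(app)}")
```

Even a list this small gives you something specific to hand a DR provider when you discuss which service level each application needs.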
Do not expect cloud-based DR to replicate large applications back to the on-premises data center within a few minutes, or to achieve near real-time RPOs. This requires a much closer geographical connection than will happen with anything other than an on-site private cloud, even if you practice nearby geo-location.
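A back-of-the-envelope calculation shows why. The Python sketch below estimates how long it takes to move an application image over a network link. The function names and the 70% link efficiency are assumptions chosen for illustration, and real restores also depend on latency, provider throttling, and rehydration of deduplicated data.

```python
def transfer_hours(size_gb: float, link_mbps: float,
                   efficiency: float = 0.7) -> float:
    """Estimate the hours needed to move size_gb over a link of link_mbps.

    efficiency is an assumed fraction of nominal bandwidth actually achieved.
    """
    if size_gb < 0 or link_mbps <= 0 or not 0 < efficiency <= 1:
        raise ValueError("size must be >= 0, bandwidth > 0, 0 < efficiency <= 1")
    megabits = size_gb * 8 * 1000  # decimal units: 1 GB = 8,000 megabits
    seconds = megabits / (link_mbps * efficiency)
    return seconds / 3600


def meets_rto(size_gb: float, link_mbps: float, rto_hours: float) -> bool:
    """Check whether the estimated transfer alone fits inside the RTO."""
    return transfer_hours(size_gb, link_mbps) <= rto_hours


if __name__ == "__main__":
    # Hypothetical 2 TB application restored over a 1 Gbps link.
    hours = transfer_hours(2000, 1000)
    print(f"Estimated transfer time: {hours:.1f} hours")
    print("Fits a 1-hour RTO:", meets_rto(2000, 1000, rto_hours=1))
```

Under these assumptions, a 2 TB application needs roughly 6.3 hours over a 1 Gbps link for the transfer alone, which is why applications with short RTOs are usually better served by failing over to the replicated VMs in the cloud than by copying them back first.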
Now that you know the DR levels to assign various applications and service levels, you can talk to cloud vendors to see which combinations of services make the most operational and budgetary sense. You will have options even if you confine your DR to the public cloud. All of them offer DR options and most of them support third-party DR offerings. For example, VMware vCloud Air (yes, a public cloud) provides cloud-based failover using asynchronous replication in VMware vSphere VMs. VMware also partners with outside replication vendors like Veeam. Interestingly enough, VMware also partnered with Google Cloud Platform to offer object storage in the VMware cloud.
Amazon AWS supports a variety of DR options with internal services and other vendor DR services running on AWS. Anyone can back up to S3, but Amazon Elastic Compute Cloud (EC2) has the most impact on true DR services at AWS.
Google is a latecomer to cloud-based DR with its release of Google Nearline. Google put the new technology together with its Compute Engine service to build a disaster recovery Infrastructure as a Service (IaaS). Google provides the infrastructure to DRaaS providers like Geminare.
For all of these challenges, and with the understanding that it is not for everyone, cloud-based DR can be a good choice for a smaller company, or for specific applications at larger companies. It can give a DR program to a company that never had one, or it can save money over remote DR sites. It can also make it easier to assign differing service levels to different applications.
No matter how much or little you decide to use cloud-based DR, you will benefit from prioritizing applications and service levels. This process will improve your data protection and disaster preparedness no matter where you base your DR processes.
Photo courtesy of Shutterstock.