Continuous Data Protection (CDP), the cutting edge of the always-essential but normally staid backup market, had a big ’05. Witness the entry of EMC, HP, IBM, Microsoft and Symantec into a space previously populated by startups, a string of funding and product announcements from those startups, and the establishment of a new SNIA CDP special interest group (SIG).
This year promises to be just as interesting, with products from the major players both increasing the visibility of CDP and making the technology more approachable for enterprise IT.
There are a variety of approaches to CDP, and, as fits a term with such cachet, the label is applied to radically different products. It’s too early to pick a winner from among these approaches, but it’s not too early to spot a trend toward increasing integration of CDP, both with protected applications and with other storage technologies.
A Big Umbrella
Defining CDP is tricky, mostly because the inclusiveness of the term is the source of some contention in the industry. In dispute is the “continuous” component. Is the ability to recover from arbitrary points in the continuum a defining characteristic of CDP?
SNIA provides the following definition: “Continuous data protection (CDP) is a methodology that continuously captures or tracks data modifications and stores changes independent of the primary data, enabling recovery points from any point in the past. CDP systems may be block-, file- or application-based and can provide fine granularities of restorable objects to infinitely variable recovery points.”
“Infinitely variable recovery points” would seem to require no fixed recovery points.
But the term CDP has also been used more broadly to refer both to products that allow restoration at the granularity of single write operations and to products that offer restoration only from specific points in time.
The two most prominent examples of the latter approach are Symantec’s Backup Exec 10d and Microsoft’s Data Protection Manager (DPM), which provides what Microsoft refers to as “near-continuous data protection.” Both products record and replicate changes continuously but take snapshots on the server only at one-hour intervals, so historical data is available only at those increments.
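The difference between the two approaches can be sketched in a few lines of Python. This is a toy model, not how any of the products named here is implemented: the class names `WriteJournal` and `SnapshotStore`, the in-memory dictionary standing in for a volume, and the use of minutes as timestamps are all assumptions made for illustration.

```python
from bisect import bisect_right


class WriteJournal:
    """Hypothetical model of write-level CDP: every write is logged with a timestamp."""

    def __init__(self):
        self.entries = []  # (timestamp, key, value), appended in time order

    def record(self, timestamp, key, value):
        self.entries.append((timestamp, key, value))

    def restore(self, point_in_time):
        """Rebuild state by replaying every write up to and including point_in_time."""
        state = {}
        for timestamp, key, value in self.entries:
            if timestamp > point_in_time:
                break
            state[key] = value
        return state


class SnapshotStore:
    """Hypothetical model of snapshot-based protection: state is kept only at fixed intervals."""

    def __init__(self, interval):
        self.interval = interval
        self.snapshots = []  # (timestamp, state), in time order

    def capture(self, journal, up_to):
        """Take one snapshot at each interval boundary up to and including `up_to`."""
        t = self.interval
        while t <= up_to:
            self.snapshots.append((t, journal.restore(t)))
            t += self.interval

    def restore(self, point_in_time):
        """Return the latest snapshot at or before point_in_time as (timestamp, state)."""
        times = [timestamp for timestamp, _ in self.snapshots]
        index = bisect_right(times, point_in_time)
        if index == 0:
            return None, {}  # no snapshot exists yet
        return self.snapshots[index - 1]


# Timestamps are minutes; snapshots are taken hourly (an assumed scale for the demo).
journal = WriteJournal()
journal.record(10, "report.doc", "draft 1")
journal.record(70, "report.doc", "draft 2")
journal.record(95, "budget.xls", "v1")

hourly = SnapshotStore(interval=60)
hourly.capture(journal, up_to=120)

print(journal.restore(100))  # state as of minute 100, including both later writes
print(hourly.restore(100))   # falls back to the minute-60 snapshot
```

In this sketch, a restore request for minute 100 returns the writes made at minutes 70 and 95 when served from the journal, while the snapshot store can only go back to minute 60, so up to an hour of changes is out of reach.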
“I think customers are a bit confused over the difference between a lot of snapshots and continuous data protection,” says Enterprise Strategy Group analyst Brian Babineau. Babineau says both approaches have a place: snapshot and replication for less mission-critical applications that can tolerate a small window of data loss, and true CDP for applications where a zero or near-zero recovery point objective (RPO) is required.
Part of the confusion arises from trying to group these two very different segments under the same umbrella. Replication and snapshot solutions do not provide arbitrary recovery points, but they are aimed at solving general backup and recovery problems at the departmental and SMB level. In contrast, true CDP aims to provide zero data loss for targeted critical applications in the enterprise.
“There are those who are saying you have to be able to dial back to any arbitrary point in time,” says Michael Parker, group product marketing manager at Symantec. “From our discussions with the mid-market, that’s not exactly something they are looking for.”
In the mid-market, CDP products provide automated, regular backups and can be part of a shift to an overall disk-to-disk-to-tape strategy. For these customers, says Parker, “The nature of data protection is changing. People are rethinking how they do backup.”
Despite all the buzz, deployments are limited, at least at the enterprise level. “People are just starting to get their toes wet in this technology,” says Babineau. “They haven’t deployed it full-fledged across the entire application infrastructure.”