Backup is a tough job, one that touches essentially all of a company’s important data – and duplicates much of it. To make this job easier and more efficient, backup technologies have changed a lot over the years with snapshots, deduplication and disk-based appliances replacing tape drives and tape libraries. But backup is still a dedicated process, one that’s essentially overhead, and one that requires a separate infrastructure and management. Hyperconverged appliances have simplified the traditional IT infrastructure, eliminating discrete components and improving operational efficiency in the data center. Now this technology is starting to incorporate data protection, providing a way to simplify or even eliminate backup as a dedicated process.
In the corporate world, backup used to mean tape drives and tape libraries. Then “disk to disk” backup replaced the real-time streaming of data to tape and removed the delay of restoring that data from tape. Deduplication made it feasible to store backups on disk long term, giving rise to the backup appliance and making the backup tape library obsolete.
The cloud has created another option for backup, essentially a target for storing the files copied from the primary data systems. It has allowed companies to outsource their remote backup infrastructure and stop physically moving backups off-site. The cloud has also made disaster recovery a lot easier to implement with the advent of Backup-as-a-Service (BaaS) and DR-as-a-Service (DRaaS).
Snapshots Changed the Game
Snapshots are also called “point in time copies” of a given volume or data set, but that’s a misnomer. Instead of making actual copies of data, which would take too long and consume storage space, snapshots essentially make copies of the metadata “pointers” or references to the data itself. These pointers can be saved at any point in time, almost instantly, since the time required to copy metadata is basically nothing compared with copying the actual data volume.
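To make the pointer idea concrete, here is a minimal sketch in Python. It is a toy model, not how any particular product works: the `Volume` class, its block store and the redirect-on-write behavior are all assumptions chosen for illustration.

```python
class Volume:
    """Toy volume: a block map (pointers) into a shared block store."""

    def __init__(self, num_blocks):
        self.store = {0: b""}               # block id -> data; id 0 is an empty block
        self.block_map = [0] * num_blocks   # logical block index -> block id
        self.next_id = 1
        self.snapshots = {}                 # snapshot name -> saved block map

    def write(self, index, data):
        # Redirect-on-write (an assumption of this sketch): new data goes to a
        # fresh block id, so a snapshot pointing at the old id still sees old data.
        self.store[self.next_id] = data
        self.block_map[index] = self.next_id
        self.next_id += 1

    def read(self, index, snapshot=None):
        block_map = self.block_map if snapshot is None else self.snapshots[snapshot]
        return self.store[block_map[index]]

    def snapshot(self, name):
        # Only the pointers are copied; no data blocks are duplicated.
        self.snapshots[name] = list(self.block_map)


vol = Volume(num_blocks=4)
vol.write(0, b"v1")
vol.snapshot("monday")
vol.write(0, b"v2")
print(vol.read(0))                     # current contents of block 0
print(vol.read(0, snapshot="monday"))  # contents as of the snapshot
```

Taking the snapshot copies a list of four integers regardless of how much data the blocks hold, which is why the operation is close to instant.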
Snapshots enable the state of a given data set to be preserved without impacting the users of that data, essentially eliminating the disruption of physically copying a volume to the backup data store. With some additional technology to efficiently handle the changes that accumulate between snapshots (changed block tracking) and a process to create cloned copies, snapshots can provide the foundation for an improved data protection process.
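The idea behind changed block tracking can be sketched in a few lines of Python. This is a simplified illustration under stated assumptions: it finds changes by comparing the pointer maps of two snapshots, whereas real implementations typically record changed blocks as writes occur. The function names and sample data are hypothetical.

```python
def changed_blocks(old_map, new_map):
    """Return logical block indexes whose pointer differs between two snapshots."""
    return [i for i, (old, new) in enumerate(zip(old_map, new_map)) if old != new]


def incremental_backup(store, old_map, new_map):
    """Copy only the blocks that changed since the previous snapshot."""
    return {i: store[new_map[i]] for i in changed_blocks(old_map, new_map)}


# Hypothetical block store and two snapshot pointer maps.
store = {1: b"aaa", 2: b"bbb", 3: b"ccc", 4: b"BBB"}
monday = [1, 2, 3]
tuesday = [1, 4, 3]   # logical block 1 was rewritten after Monday's snapshot

delta = incremental_backup(store, monday, tuesday)
print(delta)
```

Only the rewritten block is copied, so the work done between snapshots scales with the amount of change rather than with the size of the volume.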
By encapsulating an entire server instance in a few files, virtualization has made the backup process much simpler as well. Instead of dealing with the thousands of discrete files on a server, backing up a virtual machine involves only a few. And when combined with snapshots, that backup can be captured in real time, as often as necessary. Enabling individual files to be pulled out of a restored VM adds back some complexity, but on balance it’s worth the effort, given the frequency of backups vs. restores. Cloud-based backups also benefit from virtualization, as VMs can be restarted in the cloud to provide an attractive DR option.
Self-Protecting Data Systems
When paired with clones and replication, snapshots can greatly simplify traditional data protection. Hyperconverged appliances (HCAs) incorporate these technologies into the same system, giving rise to the concept of a data system that can provide its own backup. For the workloads running on them and their associated data sets, HCAs alleviate the burden of running traditional copy-based backup. For more on this topic, see this Industry Insight report from the Evaluator Group.
Many products have long lists of features that sound the same but work very differently. It’s important to think outside of the checkbox of similar-sounding features and understand how technologies and products differ.