A long-term Total Cost of Ownership analysis of Cohesity’s DataPlatform in a secondary data environment.
When considering storage technologies, most of the attention is focused on primary storage: the systems supporting production databases, customer-facing applications, critical analytics, etc. The majority of storage systems in use today were designed for these primary use cases. But there is a secondary group of files and data objects that dwarfs these primary data sets in storage capacity, and it has a very different set of characteristics driving storage requirements. New storage solutions, like Cohesity, were developed specifically for this use case, with features and functionality designed to address the special challenges of storing “secondary data”. This paper compares the cost of the Cohesity DataPlatform with that of two other scale-out NAS systems in a secondary data environment.
The research presented here, in the form of a total cost of ownership (TCO) analysis, shows that enterprise storage administrators can cut the cost of scale-out NAS significantly when using Cohesity’s DataPlatform. Cohesity is a little more than half the cost of one leading NAS solution and less than one-third the cost of another. When the timeline is extended and cumulative cost is calculated, Cohesity saves more than $2M over seven years (see figures 4 and 5 below).
The Cohesity DataPlatform has a distributed cluster architecture that scales to hundreds of nodes and is designed to function as a consolidation point for a variety of data types and workflows. It has a number of specific features that help address the challenges of storing secondary data and reduce its total cost of ownership.
Every company has its primary production applications and data sets. These are the systems that generate the revenue and drive the day-to-day functions of the organization. Secondary data are the copies of primary data that are created to support data protection, testing and development, but they also include users’ files, multi-media content, logs, etc. In fact, secondary data typically make up as much as 80% of the total storage capacity of the enterprise. And the growth of secondary data can be insidious, as multiple data sources, often on separate storage systems, quietly expand.
Silos of Storage
Data protection is typically a major part of the IT infrastructure and is discussed below. Other secondary data sources are frequently captured and stored on dedicated systems, “point solutions” that are designed and deployed for that purpose. There are plenty of examples in most enterprises, such as the storage systems that support test and development teams, often part of an infrastructure that’s deployed, operated and expanded independently from core IT. End user and multi-media files are additional examples, as are log files and other metadata from production and operations programs.
Companies archive data to save money on primary storage or provide more protection, driving the purchase of file or object storage systems. These archives can also support data analytics as companies look to pull more insight out of their existing data. Compared with a consolidated storage system, all of these distributed “silos” present multiple points of management, are less resource-efficient and can complicate the scaling process.
Management and Visibility
Obviously, deploying, maintaining and scaling multiple storage silos creates more work for IT personnel than running a single system. Often these are purpose-built appliances that have their own management GUIs, update schedules and expansion processes. And besides the operational overhead, these systems do not provide uniform visibility into the data itself. This reduces IT’s ability to maximize resources and limits the analysis of the data itself.
These systems create their own copies of data, often multiple copies depending on the use case, the most obvious being data protection. But backup is just the beginning: test and development generates copies, users keep files that their peers also have, and logs and other metadata are generated continuously and can be saved for long periods, or forever. Controlling the creation of these duplicate files and data objects can significantly reduce the storage consumed. Copy data management is the term used to describe this process, but it’s more difficult when the data aren’t stored on the same system.
Space-saving processes like deduplication can also suffer in a multi-silo environment. The effectiveness of the complex comparisons they use to reduce redundancy is a function of the size of the data sets they are run on. Deduplication running on a single, consolidated storage system can produce much better results than each silo running its own dedupe.
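To make the point about deduplication scope concrete, here is a minimal Python sketch. It is only an illustration under stated assumptions: the silo names, the sample chunks and the helper functions are hypothetical, the data is a toy set, and hashing whole chunks with SHA-256 stands in for whatever comparison a real product uses. It does not describe how Cohesity or any other vendor implements deduplication.

```python
import hashlib


def unique_chunks(chunks):
    """Return the set of SHA-256 digests for an iterable of byte chunks."""
    return {hashlib.sha256(chunk).hexdigest() for chunk in chunks}


def stored_chunks_per_silo(silos):
    """Chunks kept when each silo deduplicates only against its own data."""
    return sum(len(unique_chunks(chunks)) for chunks in silos.values())


def stored_chunks_consolidated(silos):
    """Chunks kept when all data is deduplicated in one shared pool."""
    all_chunks = [chunk for chunks in silos.values() for chunk in chunks]
    return len(unique_chunks(all_chunks))


# Hypothetical toy data: each silo holds a few chunks, some shared across silos.
silos = {
    "backup": [b"os-image", b"db-page-1", b"db-page-2", b"os-image"],
    "test_dev": [b"os-image", b"db-page-1", b"build-artifact"],
    "user_files": [b"slide-deck", b"slide-deck", b"os-image"],
}

raw_total = sum(len(chunks) for chunks in silos.values())
per_silo_total = stored_chunks_per_silo(silos)
consolidated_total = stored_chunks_consolidated(silos)

print(f"raw chunks:             {raw_total}")
print(f"per-silo dedupe keeps:  {per_silo_total}")
print(f"consolidated dedupe:    {consolidated_total}")
```

In this toy data, per-silo deduplication removes only the repeats inside each silo, while the consolidated pool also removes chunks that recur across silos, such as the shared operating system image. Real savings depend entirely on how much data the silos actually have in common.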