For the last two decades, RAID (redundant array of inexpensive disks) controllers have ruled the storage world. RAID has been required for data protection in disk arrays. RAID schemes (RAID 0, 1, 6, 10, etc.) reside on RAID controllers baked into disk arrays, with many billions sold to date. But perhaps more important from the standpoint of making money, the RAID controller has also delivered differentiated value for storage vendors. Data copy and migration, snapshots, deduplication, and a long list of other controller-based functions have all been loaded onto the RAID controller.
It’s becoming increasingly clear that the traditional RAID controller is coming to the end of its life cycle, at least within the enterprise data center. Applications of the kind common in the Web 2.0 community are now populating the enterprise data center, and they require scalability into the petabyte range. Traditional RAID controllers start to show their shortcomings at this scale. Drive rebuild times stretch to the point where an array spends so long exposed to a further drive failure that RAID data protection is no longer protection.
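To put rough numbers on the rebuild problem, here is a back-of-the-envelope sketch. It assumes a rebuild has to rewrite the entire replacement drive at a steady rate; the drive capacities and rebuild rates below are illustrative assumptions, not measurements from any particular array, and a controller that is also serving production I/O will usually rebuild more slowly than its best case.

```python
def rebuild_hours(capacity_tb: float, rebuild_mb_per_s: float) -> float:
    """Estimate hours needed to rewrite a whole replacement drive.

    Assumes decimal units (1 TB = 1,000,000 MB) and a constant
    sustained rebuild rate, which is an idealization.
    """
    capacity_mb = capacity_tb * 1_000_000
    seconds = capacity_mb / rebuild_mb_per_s
    return seconds / 3600


if __name__ == "__main__":
    # Hypothetical drive sizes and sustained rebuild rates.
    for capacity_tb in (0.5, 1.0, 2.0):
        for rate in (25, 50, 100):
            hours = rebuild_hours(capacity_tb, rate)
            print(f"{capacity_tb:>4} TB at {rate:>3} MB/s: {hours:5.1f} hours")
```

Under these assumptions a 1 TB drive rebuilt at 50 MB/s takes roughly 5.6 hours, and the time grows in direct proportion to drive capacity. As drives get larger without rebuild rates keeping pace, the window in which a second failure can hurt gets longer.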
We can argue (and I have) over how much longer the RAID controller will survive. For sure, it’s nowhere near dead and will continue on as the workhorse of the storage industry for some time. But its shortcomings are becoming increasingly obvious and are driving the creation of the next generation of storage devices. Indeed, one of those devices is no “device” at all. Rather, it’s software running on a collection of commodity servers and server-attached disk, both traditional and solid-state. Think of this new “device” as software-defined storage, where all of the functionality is defined and delivered in software. So as a user, when you buy a software-defined storage device, you’re simply buying code. What you run it on is up to you.
MaxiScale is an interesting example of software-defined storage. MaxiScale’s FLEX storage platform runs on standard servers with SATA disk, and uses standard Ethernet interconnections. It is implemented as clustered nodes, each a server plus disk. I/O performance and capacity scale linearly as processing nodes and disk drives are added to the cluster.
So the storage value-delivery model is decidedly different here. You as the user buy software and essentially roll your own array. But what else is different here? First, while the RAID controller is gone, the absolute requirement to preserve data is not. Data protection is also implemented in software.
Second, the system assumes that individual nodes within the cluster will go offline or fail for one reason or another. That’s OK. The FLEX storage cluster continues to function, perhaps in a degraded state for some period of time, until the full cluster is restored. But the point is that once you power up the cluster, you can keep it running for years, or decades if you want. Hardware is added and replaced without disruption. Software is upgraded without disruption. It’s perpetual storage.
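The general idea of software-based data protection that rides through node failures can be illustrated with a toy model. This is not how FLEX works internally (that isn't described here); it is a minimal sketch that assumes each object is simply copied to two nodes chosen by hashing. The `ToyCluster` class, its methods, and the node names are all hypothetical.

```python
import hashlib


class ToyCluster:
    """In-memory stand-in for a cluster that keeps several copies of each object."""

    def __init__(self, node_names, copies=2):
        self.copies = copies
        self.nodes = {name: {} for name in node_names}  # node -> {key: data}
        self.online = set(node_names)

    def _placement(self, key):
        # Rank nodes by a hash of (node, key) and take the first `copies`.
        # This spreads objects across nodes without any central table.
        def rank(name):
            return hashlib.sha256(f"{name}:{key}".encode()).hexdigest()
        return sorted(self.nodes, key=rank)[: self.copies]

    def write(self, key, data):
        # Store a copy on every placement node that is currently online.
        # Re-syncing nodes that were offline during a write is omitted.
        targets = [n for n in self._placement(key) if n in self.online]
        if not targets:
            raise OSError(f"no online node available for {key!r}")
        for name in targets:
            self.nodes[name][key] = data

    def read(self, key):
        # Serve the object from the first online node that holds a copy.
        for name in self._placement(key):
            if name in self.online and key in self.nodes[name]:
                return self.nodes[name][key]
        raise OSError(f"no online copy of {key!r}")

    def fail(self, name):
        self.online.discard(name)

    def restore(self, name):
        self.online.add(name)


if __name__ == "__main__":
    cluster = ToyCluster(["node-a", "node-b", "node-c", "node-d"])
    cluster.write("/videos/clip1", b"payload")
    primary = cluster._placement("/videos/clip1")[0]
    cluster.fail(primary)                  # one node goes offline
    print(cluster.read("/videos/clip1"))   # still served from the other copy
```

A real system would also re-create lost copies on surviving nodes and rebalance when hardware is added, which is what lets a cluster keep running while individual servers come and go.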
Third, FLEX is an expression of the state of the art in single or global namespace file system technology. It’s this core technology that delivers the value-added storage services rather than the RAID controller.
MaxiScale is not alone in this emerging space. Other software-defined storage solutions include ParaScale’s cloud storage software and Symantec’s FileStore. Other traditional hardware and software players will follow with software-defined storage offerings in the coming months. Include database vendors in this space as well. Some will position their solutions as cloud storage, others as data protection and archival storage.
Will software-defined storage replace traditional RAID storage? Not immediately. Not dramatically. But to me a new model is emerging. Scalability, hardware independence, and system longevity are its more compelling features when compared to traditional RAID-based storage arrays. But perhaps the most compelling feature will be the ability to buy big-array performance and scalability at a fraction of the cost of big-array RAID.