The Value of Disaggregated Hyper-Converged Infrastructure – ITPro Today Blog by Eric Slack

By Eric Slack, Monday, November 12th 2018

A disaggregated hyper-converged infrastructure approach delivers more flexibility and resource efficiency than traditional HCI models, but carries a few drawbacks. Learn about the pros and cons and compare disaggregated HCI products.

A hyper-converged infrastructure (HCI) is composed of a cluster of nodes (industry-standard servers with internal storage devices, or drives) whose storage is combined into a shared pool of capacity. While node configuration can vary between models and vendors, each node can run workloads, using local CPU and memory resources, on that shared storage pool.

Recently, a new “disaggregated” architecture has become available for HCI, one that offers some advantages over existing designs. In this article we’ll go into more detail about disaggregated architectures, about the HCI products on the market that use this technology and what it means for IT professionals considering an HCI solution.

HCI nodes are self-contained modules that make setup and expansion easier, since there’s essentially no configuration or integration involved. Sizing the initial cluster and adding storage capacity are both done in node increments. This simplicity is one of the big benefits of HCIs, as well as one of the drawbacks: adding nodes for more storage capacity also adds compute resources, and vice versa, whether or not the extra resources are needed. This is where a disaggregated architecture can help.

What Are Disaggregated HCIs?

Disaggregated HCIs have dedicated storage and compute nodes. Each storage node has a CPU and some memory, but only enough to facilitate communication and essential data movement between nodes. Likewise, compute nodes don’t have more storage capacity than is required to support local workloads. This allows clusters to be created with only the resources needed and enables users to scale them in a more granular fashion. There are some other benefits as well, which we’ll discuss below.
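The scaling difference is easier to see with some arithmetic. The following is a minimal sketch, not a vendor sizing tool: the function names, the workload (96 cores, 400 TB) and the node sizes (32 cores, 20 TB) are all hypothetical, and the model ignores the small amount of CPU in storage nodes, the local cache in compute nodes and any data protection overhead.

```python
import math


def size_traditional(cores_needed, tb_needed, cores_per_node, tb_per_node):
    """Fixed-node HCI: every node adds both cores and capacity, so the node
    count is set by whichever resource demands more nodes."""
    nodes = max(math.ceil(cores_needed / cores_per_node),
                math.ceil(tb_needed / tb_per_node))
    return {
        "nodes": nodes,
        "idle_cores": nodes * cores_per_node - cores_needed,
        "idle_tb": nodes * tb_per_node - tb_needed,
    }


def size_disaggregated(cores_needed, tb_needed,
                       cores_per_compute_node, tb_per_storage_node):
    """Disaggregated HCI: compute and storage nodes are counted separately."""
    compute_nodes = math.ceil(cores_needed / cores_per_compute_node)
    storage_nodes = math.ceil(tb_needed / tb_per_storage_node)
    return {
        "compute_nodes": compute_nodes,
        "storage_nodes": storage_nodes,
        "idle_cores": compute_nodes * cores_per_compute_node - cores_needed,
        "idle_tb": storage_nodes * tb_per_storage_node - tb_needed,
    }


# Hypothetical, storage-heavy workload; the numbers are illustrative only.
trad = size_traditional(96, 400, cores_per_node=32, tb_per_node=20)
disagg = size_disaggregated(96, 400, cores_per_compute_node=32,
                            tb_per_storage_node=20)
print(trad)
print(disagg)
```

With these made-up numbers, the fixed-node cluster needs 20 nodes to reach 400 TB and leaves 544 cores idle, while the disaggregated cluster covers the same workload with three compute nodes and 20 storage nodes and no idle cores.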

HCIs have been compared with hyper-scale architectures, the infrastructures developed by the large cloud and social media companies that replaced traditional server and SAN infrastructures. But the disaggregated architecture is actually more similar to what the hyper-scalers developed than are the fixed-node configurations of traditional HCIs. It supports the kind of flexible scaling and resource efficiency that was essential for the hyper-scalers’ business models.

Disaggregated HCI Vendors

The vendors that offer HCIs with disaggregated architectures are Datrium (DVX), Dell EMC (VxRack FLEX and VxFLEX Ready Nodes) and NetApp (HCI). (For more information, see Product Briefs for each product in Evaluator Group’s Evaluator Series Research.)

Datrium DVX. A Datrium DVX cluster is made up of one to 10 data nodes, each a 2U chassis with 12 hard disk drives or solid-state drives (SSDs). The data nodes provide a highly available persistent data store for the compute nodes (each cluster can support up to 128 compute nodes) and handle data protection and resiliency functions. DVX compute nodes store active data for local VMs in a flash cache (up to eight SAS or NVMe SSDs). Each compute node is a 1U, 16- or 28-core Intel Xeon server that handles VM and IO processing and sends persistent data writes to the data nodes. DVX systems allow a “mix and match” of DVX compute nodes with compatible third-party x86 servers, whether new or already installed in a customer’s environment.

NetApp HCI. NetApp HCI is a turnkey compute and storage solution based on the Element OS software that runs SolidFire scale-out, all-flash arrays. NetApp HCI clusters contain a minimum of four storage and two compute nodes, but can expand to contain 100 nodes in any combination. Four nodes fit in a single 2U chassis (half-rack at 1U each). Storage nodes come in a single configuration with six SSDs, in three capacity points, and compute nodes support dual CPUs with three core and memory configurations. In addition, NetApp HCI offers 1U storage expansion nodes with up to 12 SSDs and compute expansion nodes with two GPUs and 16 CPU cores.

Dell EMC VxRack FLEX and VxFLEX Ready Nodes. The VxRack FLEX combines Dell PowerEdge servers (R640 and R740xd models) and ScaleIO software-defined storage with networking components and management software to create what Dell EMC calls a “rack-scale hyper-converged” infrastructure. ScaleIO was originally developed and sold as a software-only storage product. In addition to a “two-layer,” disaggregated configuration, where storage and compute nodes are separate servers, VxRack FLEX software can be deployed in an “HCI” configuration, where storage and compute functions run on the same node. VxFLEX Ready Node clusters run ScaleIO software on approved third-party server hardware and don’t include networking or management software.
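The node-count limits quoted above for Datrium DVX and NetApp HCI can be written down as a quick sanity check for a proposed cluster. This is only a sketch built from the figures in this article (as of late 2018), not from either vendor’s tooling. The `NodeLimits` class and `violations` function are hypothetical names, and the DVX minimum of one compute node is an assumption, since no minimum is quoted. No limits are quoted for the Dell EMC products, so they are left out.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass(frozen=True)
class NodeLimits:
    """Node-count limits for one product; None means no figure was quoted."""
    name: str
    min_storage: int
    max_storage: Optional[int]
    min_compute: int
    max_compute: Optional[int]
    max_total: Optional[int]


# Figures from the product descriptions above. min_compute=1 for DVX is an
# assumption; the article quotes only the maximum of 128 compute nodes.
DATRIUM_DVX = NodeLimits("Datrium DVX", 1, 10, 1, 128, None)
NETAPP_HCI = NodeLimits("NetApp HCI", 4, None, 2, None, 100)


def violations(limits: NodeLimits, storage_nodes: int,
               compute_nodes: int) -> List[str]:
    """Return a list of limit violations; an empty list means the mix fits."""
    problems = []
    if storage_nodes < limits.min_storage:
        problems.append(f"needs at least {limits.min_storage} storage nodes")
    if limits.max_storage is not None and storage_nodes > limits.max_storage:
        problems.append(f"at most {limits.max_storage} storage nodes")
    if compute_nodes < limits.min_compute:
        problems.append(f"needs at least {limits.min_compute} compute nodes")
    if limits.max_compute is not None and compute_nodes > limits.max_compute:
        problems.append(f"at most {limits.max_compute} compute nodes")
    total = storage_nodes + compute_nodes
    if limits.max_total is not None and total > limits.max_total:
        problems.append(f"at most {limits.max_total} nodes in total")
    return problems


print(violations(NETAPP_HCI, storage_nodes=3, compute_nodes=2))
print(violations(DATRIUM_DVX, storage_nodes=2, compute_nodes=40))
```

The first call reports that a three-storage-node NetApp HCI cluster falls below the stated minimum; the second returns an empty list because two data nodes and 40 compute nodes sit within the DVX figures.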

To Disaggregate or Not

As previously discussed, disaggregation can provide some significant advantages over traditional HCIs, the most obvious being flexibility and resource efficiency. Separating storage and compute functions into dedicated nodes lets users match an HCI cluster’s configuration more closely to the workloads it’s running and scale storage independently of compute.

Unlike traditional HCIs, disaggregated clusters run the software-defined storage layer on storage nodes, not on the compute nodes, freeing up CPU cycles for compute functions. And adding more compute nodes enables the cluster to be configured for higher performance. These HCI solutions can be a better fit in larger data center consolidation use cases, where cluster capacity, performance density and efficient scaling are important.

Another advantage of disaggregation is the ability to separate refresh cycles on storage from compute. New SSDs will come out with higher capacities at intervals that are not necessarily in sync with the upgrade cycles of CPUs. This means that when a traditional HCI node is updated, one of these resources probably isn’t using the latest technology. Disaggregated HCI vendors can release new storage nodes independently from compute nodes, enabling them to more closely follow the upgrade cycles of SSD suppliers and of Intel.

Along these same lines, the independence of storage and compute makes “as a service” storage pricing options more feasible for HCI vendors. By separating these resource updates, they have more flexibility to choose the most cost-effective version of SSD or CPU and reduce the financial risk of offering various subscription-based price guarantees.

On the downside, most disaggregated clusters have larger minimum node counts, making them less attractive for smaller environments. They’re also somewhat more complex to deploy than traditional HCIs, especially when they include networking components.

Users and use cases do vary, but, overall, disaggregated HCI architectures will probably offer more pluses than minuses. The ability to scale higher and more efficiently, in both storage and compute, will be attractive to a lot of companies. For vendors, this technology may enable HCIs to make inroads in the data center and support popular subscription-based pricing.

While disaggregated architectures are becoming more prevalent in the HCI space, this isn’t a feature that can be easily incorporated into an existing product. It’s a foundational characteristic of the software. Current HCI vendors wishing to add a disaggregated product to their lineup will most likely need to roll out a new product using a software-defined storage solution that has this capability.