Published: 03 Aug 2016
It was only a few short years ago that we started talking about converging compute, storage and networking onto the server. Today, hyper-convergence is alive and well and, as a data center infrastructure model, shows no signs of slowing down. The dust has barely settled on this topic, however, and we are already talking about yet another new data center paradigm: Disaggregation.
What is this "disaggregation," and how does it relate to what we are only now beginning to understand about convergence and hyper-convergence? To do justice to this topic, I need to start at the beginning.
The Moore's Law phase
First articulated by Gordon Moore in the mid-sixties, before he co-founded Intel, Moore's Law predicted a regular doubling of the number of transistors on a chip, popularly summarized as microprocessor performance doubling every 18 months or so. That steady progress led, essentially, to the development of PCs and powerful workstations and servers. It also fueled the client-server movement.
Moore's Law progressed until we started to reach the limits of physics and clock speeds. Consequently, the industry's emphasis shifted away from clock speeds and over to multicore processors over the past decade or so. The logic: If we can't make a single processor go any faster, then let's gang them up on the same chip and make them act as one. Multicore processors, in turn, led to the development of powerful servers that were the genesis of server consolidation using hypervisors.
So what does all this have to do with disaggregation, you ask? One more step and I'll get to that.
The aggregation phase
The movement toward multicore processors meant that compute power became incredibly cheap and available. When you think about it, the software-defined movement was born out of this abundance of available compute power.
In storage, we first separated the control plane from the data plane, which -- in essence -- meant we could control a variety of hardware (the data plane) using a software-based control plane. But given these new powerful processors, why not implement the data plane in software, too? In short, multicore led directly to the movement to implement all data center storage functionality in software and run it on the server. Enter products like DataCore (an early pioneer in this concept), EMC ScaleIO, HP StoreVirtual, VMware Virtual SAN and others.
With the advent of sophisticated virtualization technology for both storage and server, along with great advancements in file systems technology and all this excess compute power, why not take matters a step further and implement multiple functions directly on a server or cluster of servers?
This is how the vision of hyper-convergence took root, with new powerhouses Nutanix and SimpliVity and, more recently, EMC ScaleIO-based VxRack, Pivot3 and others fighting for a place at the hyper-converged table. Scale Computing, another pioneer in this space, focuses on SMBs. On the big data side, Hadoop vendors embrace the idea of scale-out nodes that run both compute and storage on the same server.
With massive amounts of compute power, it became easy to assign a specific function to run on an individual core for predictable performance, even though many other functions were running on the same computer. Unquestionably, the well-known public clouds are all based on this premise.
We can think of convergence/hyper-convergence as the "aggregation" phase for the computing industry. It harkens back to the age of mainframes, but in a much more commoditized and scale-out fashion.
The disaggregated phase
Just as we are settling in on this idea of convergence, the industry now seems to be moving toward the concept of disaggregation. The idea is not to go back to individual silos of functionality (that would be horribly painful), but rather to create pools of functionality across nodes.
Take, for example, one of the main issues we hear regarding hyper-convergence: the inability to buy compute power (for applications) separate from storage capacity. Granted, vendors are now offering nodes that are heavily tilted toward compute or capacity, but what about memory or just flash for caching or another resource? Is there a way to add only what you need?
This is what led to the idea of disaggregation.
Infrastructure would still scale out and be "nodal," but you would only add the functionality you want when it's needed, and that functionality would be pooled across nodes. This means all memory, flash, compute power and HDD capacity would be pooled and made available to all applications.
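To make the sizing argument concrete, here is a minimal Python sketch that compares buying fixed-ratio hyper-converged nodes with buying compute and capacity independently. Everything in it is an illustrative assumption rather than any vendor's sizing model: the `NodeSpec` class, the function names, and the node and unit sizes are all hypothetical.

```python
import math
from dataclasses import dataclass


@dataclass(frozen=True)
class NodeSpec:
    """Hypothetical hyper-converged node with a fixed compute-to-capacity ratio."""
    cores: int
    capacity_tb: int


def converged_purchase(need_cores, need_tb, node):
    """Buy whole nodes until BOTH the compute and capacity needs are met."""
    nodes = max(math.ceil(need_cores / node.cores),
                math.ceil(need_tb / node.capacity_tb))
    return {
        "nodes": nodes,
        "cores": nodes * node.cores,
        "capacity_tb": nodes * node.capacity_tb,
    }


def disaggregated_purchase(need_cores, need_tb, core_unit, tb_unit):
    """Buy compute and capacity independently, each in its own unit size."""
    return {
        "cores": math.ceil(need_cores / core_unit) * core_unit,
        "capacity_tb": math.ceil(need_tb / tb_unit) * tb_unit,
    }


def utilization(needed, bought):
    """Fraction of the purchased resource that the workload actually needs."""
    return needed / bought


if __name__ == "__main__":
    # Assumed workload: capacity-heavy, modest compute.
    node = NodeSpec(cores=16, capacity_tb=20)
    converged = converged_purchase(64, 400, node)
    pooled = disaggregated_purchase(64, 400, core_unit=16, tb_unit=20)
    print("converged:", converged,
          "core utilization:", utilization(64, converged["cores"]))
    print("disaggregated:", pooled,
          "core utilization:", utilization(64, pooled["cores"]))
```

With these assumed numbers, the capacity requirement forces a purchase of 20 converged nodes, which brings along 320 cores for a workload that needs 64. Buying the two resources separately covers the same workload with 64 cores and 400 TB. Real deployments would also have to account for redundancy, interconnect cost and minimum cluster sizes, which this sketch ignores.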
Networking functionality is only now being virtualized. Springpath, for instance, distinguishes itself in the hyper-convergence market with its virtualized network facility. The next step would be to disaggregate this virtualized network facility so that just the right amount can be purchased to serve all applications from a common pool. Meanwhile, network security functions such as firewalls, VPNs and AV filters are already being virtualized and disaggregated in the form of network functions virtualization platforms, which can be selectively applied as needed in a company's network.
On the surface, hyper-convergence and disaggregation appear to be opposed to one another, but in my view, disaggregation is simply an evolution of the concept of hyper-convergence. It allows for the creation of more powerful web-scale clouds where massive pools of resources are available to be used and shifted based on need or policy across all applications. With disaggregated computing, utilization rates would soar compared to where they are today -- reducing the need for space, power and cooling.
Vendors of disaggregation
Three vendors are worthy of note in this discussion. PernixData focuses on disaggregating flash capacity across nodes. It obliterates the 1-to-1 relationship PCIe SSDs have with the server in which they are housed, presenting flash as a pooled resource instead. The company is now doing the same for memory. PernixData unquestionably pioneered the concept of disaggregation.
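The difference between server-bound flash and pooled flash can be illustrated with a toy allocator. This is only a conceptual sketch in Python and does not describe how PernixData or any other product is implemented; the `FlashPool` class, its method names and the "local first, then spill to the nodes with the most free space" policy are all assumptions made for illustration.

```python
class FlashPool:
    """Toy model of per-node flash, usable either node-locally or as a pool."""

    def __init__(self, free_gb_by_node):
        # Free flash in GB, keyed by (hypothetical) node name.
        self.free = dict(free_gb_by_node)

    def allocate_local(self, node, size_gb):
        """1-to-1 model: a request may only use flash inside its own server."""
        if self.free[node] < size_gb:
            return None  # Fails even if other nodes have plenty of free flash.
        self.free[node] -= size_gb
        return {node: size_gb}

    def allocate_pooled(self, node, size_gb):
        """Pooled model: use local flash first, then spill to other nodes."""
        if sum(self.free.values()) < size_gb:
            return None  # The pool as a whole is exhausted.
        placement = {}
        remaining = size_gb
        # Local node first, then the others, most free space first.
        others = sorted((n for n in self.free if n != node),
                        key=lambda n: self.free[n], reverse=True)
        for n in [node] + others:
            take = min(self.free[n], remaining)
            if take > 0:
                self.free[n] -= take
                placement[n] = take
                remaining -= take
            if remaining == 0:
                break
        return placement


if __name__ == "__main__":
    pool = FlashPool({"a": 100, "b": 400, "c": 200})
    print(pool.allocate_local("a", 300))   # node "a" alone cannot satisfy this
    print(pool.allocate_pooled("a", 300))  # the pool across nodes can
```

In the local-only model, a 300 GB request on a node with 100 GB free fails outright. In the pooled model the same request succeeds by borrowing from other nodes, which is the utilization benefit the pooled approach is after. A real system would also have to deal with network latency, data consistency and node failures, none of which are modeled here.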
The second vendor is Datrium. I think of their product as server-powered storage. Yes, they only deliver storage functionality and do not belong in the converged or hyper-converged category. But what Datrium does is unique. They disaggregate all storage functions (like IOPS, RAID, thin provisioning, data protection, cloning and snapshots) that used to reside in the proprietary storage array and split them to run on application servers (remember all that spare compute power available there) and much simplified storage arrays. Because these functions draw on the spare compute power of the application servers, they scale as you add more application servers.
This is the reverse of what happens with traditional storage arrays, so the storage array becomes much more cost-effective and manageable. For those who want to stay with the traditional compute-network-storage paradigm, Datrium technology promises to deliver a storage array today that's designed for tomorrow.
The third vendor, DriveScale, focuses entirely -- for the moment -- on the Hadoop space. Instead of buying a thousand nodes, each with compute and DAS storage, DriveScale enables customers to buy compute and storage separately based on what's needed by applications. This solves the number one issue we're hearing from Hadoop users: the inability to disaggregate compute from storage when scaling up. I expect DriveScale will disaggregate memory at some point in the future as well.
Creativity has no bounds in our industry. Constant evolution is the norm. So just as you were getting comfortable with the idea of convergence and hyper-convergence, it's time to add another new term to your vocabulary: Disaggregation. Stay tuned for a lot more in this exciting, fledgling area of data center computing.
About the author:
Arun Taneja is founder and president at Taneja Group, an analyst and consulting group focused on storage and other infrastructure technologies.