Looking at the latest storage models from hypervisor vendors and their minions in the software-defined storage world, it amazes me that they seem to be learning little or nothing from their predecessors. They are proof positive that forgetting historical mistakes leads to repeating them. Only, in this case, storage consumers are the victims.
There is nothing inherently wrong with software-defined storage (SDS). In fact, the fundamental idea has a lot of merit: move all of that overpriced value-add software off the array controller (more often than not, a motherboard) and onto a server where it becomes, well, a software-based uber-controller.
I have railed against embedding value-add software on the array controller here and elsewhere for a long time. So these days, it is nice to see a company like IBM taking XIV, which was originally a software-only product that it joined at the hip to a proprietary controller to create the XIV array, and returning it to its original form -- casting it as an SDS product in its Spectrum family.
Unfortunately, software-defined storage vendors are mostly building SDS controllers the same way "hardware-defined" storage controllers were designed -- first and foremost for proprietary operation, so they work only with specific workloads. Second, software-defined storage vendors are excluding capacity management from the suite of storage services available to users.
The first point, proprietary design, comes from the same dark place that it always has: the vendor's desire to lock in the consumer and lock out the competition -- also known as the quest for world domination.
Simply put, VMware really doesn't want its Virtual SAN to be used to host the data of non-VMware virtualized workloads.
Microsoft doesn't really want Clustered Storage Spaces to host alien workloads either, but it is a bit more politically adept at explaining its position. "Yes, you can store your VMware virtual machines on my SDS infrastructure," you might hear someone from Microsoft's headquarters in Redmond, Wash., say. "We provide this handy utility that converts your VMDKs (Virtual Machine Disks) to VHDs (Microsoft Virtual Hard Disks) so they can occupy space alongside all of our virtual machines. Pretty soon, we will provide converters for those Xen XML files too!"
In the view of the folks in Redmond, the world is not a zero-sum game, but an expanding universe of limitless colonial potential.
The problem with proprietary design
The point is that proprietary controller design didn't go away with SDS any more than it did when proprietary direct-attached arrays were replaced by SAN-attached arrays or NAS appliances. Proprietary designs will have the unfortunate consequence of perpetuating additional unappealing aspects of the past, such as problems with array heterogeneity, interconnect rigidity and common infrastructure manageability.
We are already seeing proprietary "hyper-converged" appliances appearing in the market that support only one hypervisor. Even diehard DIYers must comply with the hypervisor vendor's approved component list or "pre-certified" nodal hardware kits when building an SDS infrastructure -- so much for storage for the common man. Heck, the baseline requirement of hypervisor-vendor SDS definitions is a three-node cluster. Numerous third-party SDS vendors show that available storage can be had with only two nodes, but the big software-defined storage vendors say start with three... and soon four.
However, the biggest bugaboo is the rejection of capacity management or capacity abstraction as a component of the SDS tool set. Why would we "software-define" everything from thin provisioning to deduplication to compression to mirroring, but not include capacity abstraction, which is a fundamental function of any array controller?
Abstracting capacity can help…here's how
If we abstract capacity, storage can scale at will and targets can travel with VMs as they move from host to host. Heck, with virtualized storage, we are able to allocate pools of storage to workloads in a very elastic way (growing and shrinking capacity as workloads require) and we can pre-configure the pools with the speeds, services and protection levels that the data needs.
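To make the idea concrete, here is a minimal sketch of what capacity abstraction looks like from the workload's side. It is an illustration only, not any vendor's API: the Backend and Pool classes, the "fast-mirrored" tier label and the device names are all hypothetical, and a real product would also have to handle data placement, rebalancing and protection.

```python
from dataclasses import dataclass, field


@dataclass
class Backend:
    """One physical capacity source (hypothetical: any array, node or JBOD)."""
    name: str
    capacity_gb: int


@dataclass
class Pool:
    """A virtual pool that hides which backend supplies the capacity."""
    tier: str  # hypothetical service label chosen when the pool is configured
    backends: list
    volumes: dict = field(default_factory=dict)  # volume name -> size in GB

    @property
    def total_gb(self):
        return sum(b.capacity_gb for b in self.backends)

    @property
    def free_gb(self):
        return self.total_gb - sum(self.volumes.values())

    def resize(self, volume, size_gb):
        """Create, grow or shrink a volume; reject requests the pool cannot hold."""
        if size_gb < 0:
            raise ValueError("size must be non-negative")
        current = self.volumes.get(volume, 0)
        if size_gb - current > self.free_gb:
            raise ValueError("pool exhausted; add a backend")
        self.volumes[volume] = size_gb

    def add_backend(self, backend):
        """Scale the pool by attaching more capacity from any source."""
        self.backends.append(backend)


pool = Pool("fast-mirrored", [Backend("array-a", 500), Backend("node-b", 300)])
pool.resize("vm-42", 600)  # larger than either backend alone; the pool absorbs it
pool.resize("vm-42", 200)  # shrink when the workload needs less
pool.add_backend(Backend("legacy-c", 1000))  # scale out with unrelated hardware
print(pool.free_gb)
```

The point of the sketch is that the workload asks the pool for capacity and never names a device, which is what would let the target follow the VM from host to host.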
This would bring together storage service management and storage capacity management in a manner that would simplify every admin's life. I look at the work that is being done in infrastructure management right now by thought-leading vendors like SolarWinds and realize that it could be so much more effective at orchestrating as well as reporting on infrastructure resources and services if storage capacity management were abstracted. Orchestration is the hobgoblin of the whole software-defined data center vision. Without it, we are no more efficient at provisioning resources and services to workloads than we were before the whole server virtualization craze began.
Abstracting capacity and its management, of course, would make a lot of storage hardware vendors (and now hypervisor software vendors) very unhappy. What do you do when management services extend to anyone's SDS, hyper-converged and legacy infrastructure? How could one software-defined storage vendor convince anyone that its product is superior to another's? My goodness, companies would need to cart out test data and let consumers talk openly about the performance they are getting. The horror!
In a world where everything is always-on, a world in which hypervisor vendors claim we need active-passive clustering with failover for every app, a world in which business IT will be judged by metrics like agility and resiliency and availability (as if those terms actually mean anything), you would think that we would want some sort of always-on data infrastructure, too. Instead, the hypervisor vendors are brewing up something that smells a lot like the infrastructure they spent so much time and effort to demonize.