Planning and management for a software-defined storage architecture
There's so much talk about software-defined storage that the technology can easily become confusing. But the storage application that software-defined storage vendors seek to separate from storage hardware is easy to understand if we consider the basic architecture of a storage array today.
The storage array consists of solid-state or magnetic storage components, or both, organized in trays in a rack. Usually, these trays of drives are attached to a controller, which is most often a PC motherboard running a commodity operating system such as a Windows or Linux variant. This operating system may run RAID software and other value-added software products that deliver services ranging from thin provisioning (a complicated mixture of resource monitoring, demand forecasting and capacity allocation) to compression and inline data deduplication, to various types of data protection services. Most business systems also provide a management and configuration utility, delivered as a Web interface or as a service accessible via a command line interface or graphical user interface.
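To make the capacity-allocation part of thin provisioning concrete, here is a minimal sketch in Python. It assumes a toy block pool; the `ThinPool` class, its method names and the block counts are hypothetical illustrations, not any vendor's implementation, and the monitoring and forecasting pieces are reduced to two simple ratios.

```python
class ThinPool:
    """Hypothetical thin-provisioned pool: physical blocks are consumed
    only when a logical block is written for the first time."""

    def __init__(self, physical_blocks):
        self.physical_blocks = physical_blocks
        self.provisioned = {}  # volume name -> advertised size in blocks
        self.allocated = {}    # (volume, logical block) -> physical block

    def create_volume(self, name, logical_blocks):
        # Creating a volume reserves nothing; it only records a promise.
        self.provisioned[name] = logical_blocks

    def write(self, name, logical_block):
        if logical_block >= self.provisioned[name]:
            raise IndexError("write beyond the advertised volume size")
        key = (name, logical_block)
        if key not in self.allocated:
            if len(self.allocated) >= self.physical_blocks:
                raise RuntimeError("pool exhausted: add physical capacity")
            # Allocate the next free physical block on first write.
            self.allocated[key] = len(self.allocated)
        return self.allocated[key]

    def utilization(self):
        # Resource monitoring: fraction of physical capacity in use.
        return len(self.allocated) / self.physical_blocks

    def oversubscription(self):
        # Promised capacity relative to what physically exists.
        return sum(self.provisioned.values()) / self.physical_blocks


pool = ThinPool(physical_blocks=100)
pool.create_volume("vol_a", 80)
pool.create_volume("vol_b", 80)
pool.write("vol_a", 0)
pool.write("vol_a", 0)  # rewrite of the same block: no new allocation
pool.write("vol_b", 5)
print(pool.oversubscription(), pool.utilization())
```

In this sketch the two volumes together advertise more capacity than the pool holds, which is the situation the monitoring and forecasting functions exist to keep safe.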
It's the price associated with software, rather than the hardware, that typically explains why storage costs as much as it does. For example, a popular deduplication storage array costs the manufacturer approximately $7,000 in hardware (all commodity parts), but the "value-added software" provided in the kit (for deduplication services) enables the vendor to charge an MSRP of $410,000 for the rig. Moreover, replicating data from this array requires gear of the same brand, make and model, plus additional value-added software for synchronous or asynchronous replication at extra cost.
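The arithmetic behind that example is worth spelling out. The short Python sketch below uses only the two figures quoted above; the variable names are illustrative, and it treats everything above the hardware cost as the software premium, which is a simplifying assumption.

```python
hardware_cost = 7_000   # approximate hardware cost to the manufacturer (USD)
msrp = 410_000          # list price of the complete array (USD)

# Portion of the list price not explained by hardware.
software_premium = msrp - hardware_cost

# How many times the hardware cost the list price represents.
markup_multiple = msrp / hardware_cost

# Hardware as a percentage of the list price.
hardware_share_pct = 100 * hardware_cost / msrp

print(f"Premium over hardware: ${software_premium:,}")
print(f"List price is about {markup_multiple:.1f}x the hardware cost")
print(f"Hardware is about {hardware_share_pct:.1f}% of the list price")
```

On these numbers, hardware accounts for less than 2% of what the buyer pays.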
In an environment without software-defined storage (SDS), the "storage application" is simply a volume created from hard disks or solid-state devices in the array -- usually by vendor system engineers at the time of equipment setup and configuration -- that includes functionality imparted to the volume via the value-added software services. More often than not, all volumes created from the physical array have the same set of value-added services, and each volume can be accessed via a single (or in fault-tolerant systems, redundant) pathway through storage infrastructure plumbing. Applications leveraging a volume must be configured to use the pathway to the resource and must be reconfigured to use a new route if the application is re-hosted on a different physical server.
All that explains why early server virtualization approaches required that SANs be broken up and storage returned to server-attached or server-internal configurations. In so doing, it was easier to associate physical storage resources to virtual workloads. To facilitate high-availability clustering, identical internal or DAS configurations were used with synchronous replication services between the storage on different virtualized servers. That way, the data required by an application would be available at identical coordinates wherever the application was hosted.
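The idea of "identical coordinates" can be sketched in a few lines of Python. This is a toy model under stated assumptions: the `LocalStore` class and `synchronous_write` function are hypothetical, each store stands in for the internal or direct-attached storage of one virtualized server, and real replication also has to handle failures and rollback, which are omitted here.

```python
class LocalStore:
    """Hypothetical stand-in for one server's internal or DAS storage."""

    def __init__(self, server):
        self.server = server
        self.blocks = {}  # block address -> data

    def write(self, address, data):
        self.blocks[address] = data
        return True  # acknowledge the commit


def synchronous_write(stores, address, data):
    """Report success only after every copy has acknowledged the write."""
    acks = [store.write(address, data) for store in stores]
    return all(acks)


server_a = LocalStore("server-a")
server_b = LocalStore("server-b")

ok = synchronous_write([server_a, server_b], address=42, data=b"payload")

# Wherever the application is hosted, the same address holds the same data.
print(ok, server_a.blocks[42] == server_b.blocks[42])
```

The cost of this arrangement is visible even in the toy: every block is stored once per server in the cluster.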
A consequence of this model, however, was a spike in storage capacity demand. Analysts projected year-over-year increases in storage capacity ranging from 300% to 650% in highly virtualized server environments. This was costly and unsustainable.
The alternative is to keep the SAN infrastructure where it has already been built and to virtualize it; that is, to move the storage application off of individual array controllers and into a storage hypervisor or storage virtualization server. This enables routes to volumes to move with virtual machines as they transition from one physical server to another. In the process, rerouting to the same volume containing the data is accomplished "behind the scenes" by the storage virtualization engine. This method further enables services to be provided on virtual volumes in a more granular fashion, fitting more precisely the various needs of differing workloads or guest machines.
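A rough sketch of that indirection, again in Python, may help. Everything here is an assumption made for illustration: the `StorageHypervisor` class, the host names and the LUN identifier are hypothetical, and a real storage virtualization engine manages actual I/O paths rather than a dictionary.

```python
from dataclasses import dataclass, field


@dataclass
class VirtualVolume:
    name: str
    backing_lun: str  # hypothetical identifier of the physical LUN
    services: set = field(default_factory=set)  # per-volume services


class StorageHypervisor:
    """Toy virtualization layer that keeps the volume mapping in one place."""

    def __init__(self):
        self.volumes = {}      # volume name -> VirtualVolume
        self.attachments = {}  # VM name -> volume name
        self.vm_host = {}      # VM name -> current physical host

    def create_volume(self, name, backing_lun, services=()):
        self.volumes[name] = VirtualVolume(name, backing_lun, set(services))

    def attach(self, vm, volume, host):
        self.attachments[vm] = volume
        self.vm_host[vm] = host

    def migrate(self, vm, new_host):
        # Only the host side of the route changes; the volume stays put.
        self.vm_host[vm] = new_host

    def resolve(self, vm):
        volume = self.volumes[self.attachments[vm]]
        return (self.vm_host[vm], volume.backing_lun)


engine = StorageHypervisor()
engine.create_volume("db-data", "array1:lun7", services={"thin", "replication"})
engine.create_volume("scratch", "array1:lun9")  # no extra services

engine.attach("vm-db", "db-data", host="host-1")
before = engine.resolve("vm-db")
engine.migrate("vm-db", "host-2")
after = engine.resolve("vm-db")
print(before, after)
```

The guest keeps addressing the volume by the same name throughout; the route is recomputed by the engine, and each virtual volume carries its own set of services rather than inheriting whatever the array applies to everything.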
Is this the storage application sought by SDS? Most engineers would agree that it is: It's a way to provide to an application a persistent storage volume with sufficient capacity, performance and appropriate services in an agile way. Unfortunately, vendor marketing folks like to make highly nuanced distinctions between definitions that have less to do with storage virtualization technology than with efforts to proffer an SDS offering that works with only one vendor's server hypervisor or hardware kit.
In the final analysis, the storage application is the construct that is provided to applications for use in reading and writing data. The physical makeup of the volume -- the disk drives that make up the resource and the pathways to the resource -- is masked from the view of applications and end users. If you're already experimenting with, or have deployed, server virtualization technology, you know what you need to know about storage virtualization and the storage application.