This article can also be found in the Premium Editorial Download "Storage magazine: Rethinking the way storage architectures are packaged and presented."
Cloud storage, virtualization and the relentless growth of unstructured data have all contributed to a rethinking of the way storage architectures are packaged and presented.
A changing compute world -- where physical data center infrastructure is yielding to virtualized systems and clouds, and desktops and laptops are supplemented with mobile devices -- is challenging traditional computing paradigms and reshaping everything computer related, including storage architectures. While trying to fit into a virtual world, storage has also been tested by a relentless deluge of unstructured data, with voracious contemporary applications and services demanding more data storage capacity.
The search for storage simplicity
The complexity of SANs, their inherent lack of simultaneous data access to multiple hosts (requiring clustered file systems to orchestrate shared data access) and their management overhead -- from configuring zoning, LUN masking, virtual SANs and ISLs to provisioning LUNs to hosts -- quickly became an impediment to virtualized infrastructures and an even bigger obstacle when used as cloud storage. The block- and file-based protocols of traditional storage systems, which have worked well with a limited number of hosts accessing data center storage across private links, have proven ill-suited to the boundless connectivity requirements of mobile devices and a growing number of cloud services. Traditional storage systems are slowly adjusting to a changing computing and application landscape by adopting scale-out architectures, supporting HTTP-based protocols and revamping storage back-ends to more efficiently support flash, but the pace of change has opened a window of opportunity for new storage architectures and vendors to challenge the status quo.
These emerging storage architectures are likely to shape the DNA of coming storage systems:
- Object storage
- Cloud storage
- Software-defined storage (SDS) and virtualized storage
- All-flash arrays
Object storage
Unrestricted scalability, ubiquitous access, cost-efficiency, the ability to support custom metadata, and a security framework that safely supports multiple tenants and heterogeneous, dispersed clients are the key characteristics of an object storage system. Instead of files and blocks, the basic data elements of an object store are objects with unique identifiers and custom metadata. Unlike file-based storage with its hierarchical data structure, objects are stored in an easy-to-manage, virtually infinite flat object namespace.
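To make the contrast with a hierarchical file system concrete, here is a minimal in-memory sketch of a flat object namespace. The class and method names (`FlatObjectStore`, `put`, `get`, `find`) are hypothetical and chosen for illustration only; they don't correspond to any vendor's API.

```python
import uuid
from dataclasses import dataclass, field


@dataclass
class StoredObject:
    """An object: an opaque payload plus custom, user-defined metadata."""
    data: bytes
    metadata: dict = field(default_factory=dict)


class FlatObjectStore:
    """Hypothetical in-memory store with a single flat namespace."""

    def __init__(self):
        # Object ID -> StoredObject. There are no directories or paths.
        self._objects = {}

    def put(self, data, **metadata):
        object_id = uuid.uuid4().hex  # unique identifier, not a file path
        self._objects[object_id] = StoredObject(data, dict(metadata))
        return object_id

    def get(self, object_id):
        return self._objects[object_id]

    def find(self, **criteria):
        """Return the IDs of objects whose metadata matches every criterion."""
        return [
            object_id
            for object_id, obj in self._objects.items()
            if all(obj.metadata.get(k) == v for k, v in criteria.items())
        ]


store = FlatObjectStore()
photo_id = store.put(b"...jpeg bytes...", content_type="image/jpeg", owner="alice")
video_id = store.put(b"...mp4 bytes...", content_type="video/mp4", owner="alice")
jpeg_ids = store.find(content_type="image/jpeg")
```

Objects are located by identifier or by metadata rather than by walking a directory tree, which is what lets the namespace grow without a hierarchy to manage.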
An object store is composed of autonomous storage nodes that provide both processing and storage resources, and it scales proportionally as nodes are added. To keep costs at bay, storage nodes are typically built with off-the-shelf commodity components, such as x86-based systems with attached JBODs, that are glued together by the object storage software's "secret sauce." The sophisticated data protection mechanisms of traditional shared storage systems are replaced by a multi-instance object philosophy that calls for storing copies of objects on multiple nodes, with the number of copies depending on service levels and the criticality of the data. Finally, object storage is accessed via HTTP-based interfaces, such as Representational State Transfer (REST) APIs, that enable access to object storage from any device anywhere.
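The multi-copy approach can be sketched in a few lines. The example below uses rendezvous hashing, one common placement technique, to choose which nodes hold an object's copies; it is an illustrative assumption, not a description of how any particular product places data. The service-level names and copy counts are likewise hypothetical.

```python
import hashlib

# Hypothetical service levels mapped to the number of copies to keep.
COPIES_BY_SERVICE_LEVEL = {"standard": 2, "critical": 3}


def _score(node, object_id):
    """Deterministic pseudo-random score for a (node, object) pair."""
    digest = hashlib.sha256(f"{node}:{object_id}".encode()).hexdigest()
    return int(digest, 16)


def place_replicas(object_id, nodes, service_level):
    """Pick distinct nodes for an object's copies via rendezvous hashing."""
    copies = COPIES_BY_SERVICE_LEVEL[service_level]
    if copies > len(nodes):
        raise ValueError("not enough nodes for the requested number of copies")
    ranked = sorted(nodes, key=lambda node: _score(node, object_id), reverse=True)
    return ranked[:copies]


nodes = ["node-a", "node-b", "node-c", "node-d"]
standard_nodes = place_replicas("object-123", nodes, "standard")
critical_nodes = place_replicas("object-123", nodes, "critical")
```

Because every node's score is computed independently, adding a node only moves the objects for which the new node ranks among the top choices, which fits the scale-by-adding-nodes model described above.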
Object storage is best suited to storing vast amounts of unstructured data that needs to be readily available to a wide range of clients. It has become the preferred storage architecture of Web 2.0 applications and websites that deal with a high volume of unstructured data, from images and videos to any other file types. It's also finding its way into corporate data centers to help with the explosive growth of unstructured data that has overwhelmed traditional storage systems.
Object storage is unsuitable for structured data and transactional applications where traditional storage systems have excelled and will continue to dominate. While early object stores were proprietary (built by the likes of Yahoo and Google), established storage vendors have been offering object stores such as Caringo's Object Storage Platform, Dell's DX Object Storage Platform, EMC's Atmos, Hitachi Data Systems' Hitachi Content Platform and NetApp's StorageGrid.
Cloud storage
For a storage platform to be considered cloud storage, it needs to be:
- Network accessible. Similar to object stores, cloud storage is typically accessed via Web protocols, such as REST.
- Shared. Shared, secure access across different clients with multi-tenant capabilities that allow sandboxing different tenants is expected from contemporary cloud storage.
- Service based. Cloud storage is consumed as a service and paid for based on usage.
- Elastic. It needs to dynamically grow and shrink as needed.
- Scalable. Cloud storage needs to dynamically scale up and down based on demand, without an upper limit.
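As a rough illustration of the network-accessible characteristic, the sketch below builds (but doesn't send) an HTTP PUT request that would store one object along with custom metadata. The endpoint URL, the `X-Object-Meta-` header prefix and the function name are assumptions made for this example; each real service defines its own URL layout, metadata headers and authentication.

```python
from urllib.request import Request

# Hypothetical endpoint and metadata header prefix; real services define their own.
ENDPOINT = "https://storage.example.com"
META_PREFIX = "X-Object-Meta-"


def build_put_request(container, object_name, data, metadata):
    """Build (but do not send) an HTTP PUT that stores one object."""
    headers = {META_PREFIX + key: value for key, value in metadata.items()}
    headers["Content-Type"] = "application/octet-stream"
    url = f"{ENDPOINT}/{container}/{object_name}"
    return Request(url, data=data, headers=headers, method="PUT")


request = build_put_request("photos", "cat.jpg", b"...", {"Owner": "alice"})
```

Because the interface is plain HTTP, any client with an HTTP stack, from a server to a phone, can use the same request shape; reading or deleting the object would typically use GET or DELETE against the same URL.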
The majority of today's cloud storage offerings are powered by an object store on the back end, so it's no surprise that object and cloud storage share many of the same characteristics. While object storage is storage infrastructure, cloud storage is a storage service. Cloud storage is available as a public service from companies like Amazon (S3) and Rackspace; it can also be deployed internally to service corporate departments and users, and charged back by usage. It can be deployed as a hybrid storage cloud that combines internal and external cloud storage. Its benefits are usage-based consumption, eliminating the need for storage infrastructure, the ability to dynamically adjust and scale to any storage demand, and unfettered access via Web protocols.
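Usage-based chargeback comes down to simple metering arithmetic. The following sketch assumes two illustrative rates (per gigabyte stored per month and per 1,000 requests) and made-up departmental usage figures; none of these numbers come from a real price list.

```python
from decimal import Decimal

# Assumed illustrative rates, not taken from any real price list.
RATE_PER_GB_MONTH = Decimal("0.10")
RATE_PER_1000_REQUESTS = Decimal("0.01")


def monthly_charge(stored_gb, requests):
    """Charge for one month of storage capacity plus request volume."""
    storage_cost = Decimal(stored_gb) * RATE_PER_GB_MONTH
    request_cost = Decimal(requests) / 1000 * RATE_PER_1000_REQUESTS
    return (storage_cost + request_cost).quantize(Decimal("0.01"))


# Hypothetical usage per department: (GB stored, requests made).
usage = {"engineering": (500, 200_000), "marketing": (120, 15_000)}
charges = {dept: monthly_charge(gb, reqs) for dept, (gb, reqs) in usage.items()}
```

With these assumed inputs, engineering is charged 52.00 and marketing 12.15. The same calculation applies whether the bill comes from a public provider or is an internal chargeback to a department.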
Security concerns about handing off confidential and private data to an external storage cloud remain the main obstacle to cloud storage adoption. Akin to object storage, cloud storage is best suited for unstructured data and isn't appropriate for structured data or as a data store for transactional applications.
Software-defined storage and virtualized storage
Storage systems have for the most part been a combination of proprietary storage software running on storage vendors' custom hardware, with the infrastructure components optimized for their storage software stack. Software reuse has been limited, often even within vendors' own storage systems, and rarely ever across vendor boundaries. The move toward a virtualized infrastructure that started with server virtualization and has since extended to other areas, such as networking, is actively reshaping storage architectures. One of the main benefits of decoupling the software stack from the underlying hardware is the flexibility of being able to mix disparate platforms that may vary in size, capabilities, performance and price, depending on requirements.
Even though software-defined "everything" has recently caught the attention of the storage marketing machines, it has existed in various forms in the storage realm for a while. A virtual storage appliance (VSA), where the storage software runs on a virtual machine (VM) and is distributed as a VM image, is one example of software-defined storage. For instance, NetApp's Data Ontap Edge VSA doesn't require a NetApp filer; instead, its VM image runs on any server with the appropriate hypervisor, and it seamlessly integrates with other NetApp systems.
Today, VSAs are primarily deployed in remote offices and for use cases that don't merit hardware appliances, such as embedded applications and mobile military systems. "VSAs can be put directly into the cloud to enable elasticity at a low cost," said Val Bercovici, NetApp's cloud czar, citing another use case of VSAs. In general, the majority of object stores and cloud storage systems are following the SDS model, where the software stack runs on low-cost commodity components. Without question, the abstraction of storage software from the underlying hardware is a trend that will continue. Over time, standards like the Storage Networking Industry Association's Cloud Data Management Interface (CDMI), and frameworks provided by the likes of OpenStack and CloudStack will eventually enable interoperability between storage components from different vendors.
All-flash arrays
Solid-state storage has been a disruptive and game-changing storage technology. An order of magnitude faster than mechanical disks, NAND flash enables new storage designs that are displacing expensive techniques, such as short-stroking, that were used to lower access times and improve I/O performance. Semiconductor-based and devoid of mechanical components, NAND flash is positioned right between DRAM and mechanical disks from both a price and a performance perspective.
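A quick calculation shows why short-stroking is expensive: restricting I/O to a fraction of each disk shortens seeks but leaves the rest of the purchased capacity unused. The drive size, price and usable fraction below are assumed figures for illustration only.

```python
def short_stroke_cost(raw_gb, drive_price, usable_fraction):
    """Return (usable GB, price per usable GB) for a short-stroked drive."""
    usable_gb = raw_gb * usable_fraction
    return usable_gb, drive_price / usable_gb


# Assumed figures: a 600 GB drive costing 300.0, with only 25% of it in use.
usable_gb, cost_per_gb = short_stroke_cost(600, 300.0, 0.25)
full_gb, full_cost_per_gb = short_stroke_cost(600, 300.0, 1.0)
```

Under these assumptions, the short-stroked drive offers 150 GB at 2.0 per usable gigabyte, four times the 0.5 per gigabyte of the same drive used in full.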
Many contemporary storage arrays now offer solid-state storage, either as cache or as a substitute for mechanical disks. However, very few all-flash arrays are available because of the relatively high cost of NAND flash and the performance limitations of traditional storage arrays that are optimized for mechanical disks. Unlike hybrid disk/flash arrays, all-flash arrays can support hundreds of thousands of IOPS and are used for very high-end applications where minimal latency and maximum IOPS are needed. All-flash systems are available from companies such as Nimbus Data, Pure Storage, Violin Memory and Whiptail.
"At present, all-flash arrays are mostly about high performance, [and they're] missing many enterprise features and maturity," said Mohit Bhatnagar, NetApp's senior director of flash products. "But within two to five years, reliability and capabilities, such as QoS, will be there."
Darwinism in the storage realm
Traditional storage arrays are far from dead, but they're evolving to support the requirements of a changing compute landscape that's dominated by cloud and mobile computing. The move toward scale-out architectures, an increased use of solid-state storage and adoption of new storage protocols are evidence of this transformation. File-based storage is trending toward becoming object storage and is already competing with object-based storage to power cloud services. Block-based storage will continue to be critical for structured data and transactional applications, but vendors of those systems are adopting scale-out back-ends and evolving their storage architectures to better cope with the requirements of NAND flash and other emerging semiconductor-based storage technologies. In the meantime, the storage world is full of opportunities for new vendors to emerge that are able to move more quickly and are willing to gamble on innovative and unconventional technology.
About the author
Jacob N. Gsoedl is a freelance writer and a corporate director for business systems.
This was first published in April 2013