It may seem as if storage technologies are a little stodgy and out of date, but there’s plenty of technical development going on at both big storage vendors and smaller upstarts.
The enterprise data storage industry doesn’t have a reputation as a hotbed of innovation, but that characterization may be unfair. Although bedrock technologies like RAID and SCSI have soldiered along for more than two decades, new ideas have flourished as well. Today, technologies like solid-state storage, capacity optimization and automatic tiering are gaining prominence, and specialized storage systems for virtual servers are being developed. Although the enterprise arrays of tomorrow will still be quite recognizable, they’ll adopt and advance these new concepts.
Spinning magnetic disks have been the foundation for enterprise data storage since the 1950s, and for just about as long there’s been talk of how solid-state storage will displace them. Today’s NAND flash storage is just a decade old, yet it has already gained significant traction thanks to its performance and mechanical characteristics. Hard disk drives (HDDs) won’t go away anytime soon, but NAND flash will likely become a familiar and dependable component across the spectrum of enterprise storage.
Hard disks excel at delivering capacity and sequential read and write performance, but modern workloads have changed. Today’s hypervisors and database-driven applications demand quick random access that’s difficult to achieve with mechanical arms, heads and platters. The best enterprise storage arrays use RAM as a cache to accelerate random I/O, but RAM chips are generally too expensive to deploy in bulk.
NAND flash memory, in contrast, is just as quick at servicing random read and write requests as it is with those that occur close together, and the fastest enterprise NAND flash parts challenge DRAM for read performance. Although less expensive than RAM, flash memory (especially the enterprise-grade single-level cell [SLC] variety) remains an order of magnitude more costly than hard disk capacity. Growth in the deployment of solid-state drives (SSDs) has slowed, and SSDs aren’t likely to displace magnetic media in capacity-oriented applications anytime soon.
Flash memory has found a niche as a cache for hard disk drive-based storage systems. Caching differs from tiered storage (see the section on “Automated tiered storage”) in that it doesn’t use solid-state memory as a permanent location for data storage. Rather, this technology redirects read and write requests from disk to cache on-demand to accelerate performance, especially random I/O, but commits all writes to disk eventually.
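The caching behavior described above can be sketched in a few lines of Python. This is a toy model written for illustration, not any vendor's controller logic: the class name WriteBackCache, the least-recently-used eviction policy and the use of a plain dictionary as the "disk" are all assumptions. It shows reads and writes being served from a small cache while every write still reaches disk eventually, either on eviction or on an explicit flush.

```python
from collections import OrderedDict


class WriteBackCache:
    """Toy write-back cache in front of a slow backing store (a dict here)."""

    def __init__(self, backing_store, capacity):
        self.disk = backing_store      # stands in for the hard disk array
        self.capacity = capacity       # number of blocks the flash cache holds
        self.cache = OrderedDict()     # block -> data, least recently used first
        self.dirty = set()             # blocks written to cache but not yet to disk

    def read(self, block):
        if block in self.cache:        # cache hit: serve from flash
            self.cache.move_to_end(block)
            return self.cache[block]
        data = self.disk[block]        # cache miss: fetch from disk, then cache it
        self._insert(block, data)
        return data

    def write(self, block, data):
        self.dirty.add(block)          # remember this block still has to reach disk
        self._insert(block, data)      # the write is absorbed by the cache

    def flush(self):
        """Commit every outstanding write to disk."""
        for block in sorted(self.dirty):
            self.disk[block] = self.cache[block]
        self.dirty.clear()

    def _insert(self, block, data):
        self.cache[block] = data
        self.cache.move_to_end(block)
        while len(self.cache) > self.capacity:
            victim, victim_data = self.cache.popitem(last=False)
            if victim in self.dirty:   # never drop a write that disk has not seen
                self.disk[victim] = victim_data
                self.dirty.discard(victim)
```

Because the cache is never the permanent home for data, losing or shrinking it costs performance rather than capacity, which is the distinction from tiering drawn above.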
Major vendors like EMC Corp. and NetApp Inc. have placed flash memory in their storage arrays and designed controller software to use it as a cache rather than a tier. NetApp’s Flash Cache cards use the internal PCI bus in their filers, while EMC’s Clariion FAST Cache relies on SATA-connected SSDs. But both leverage their existing controllers and expand on the algorithms already in place for RAM caching.
Avere Systems Inc. and Marvell Technology Group Ltd., a couple of relative newcomers, take a different tack. With a history in the scale-out network-attached storage (NAS) space, Avere’s team developed an appliance that sits in-band between existing NAS arrays and clients. “No single technology is best for all workloads,” said Ron Bianchini, Avere’s founder and CEO, “so we built a device that integrates the best of RAM, flash and disk.” Bianchini claims Avere’s FXT appliance delivers 50 times lower access latency using a customer’s existing NAS devices.
Marvell’s upcoming DragonFly Virtual Storage Accelerator (VSA) card is designed for placement inside the server itself. The DragonFly uses speedy non-volatile RAM (NVRAM) as well as SATA-connected SSDs for cache capacity, but all data is committed to the storage array eventually. “This is focused on random writes, and it’s a new product category,” claims Shawn Kung, director of product marketing at Marvell. “DragonFly can yield an up to 10x higher virtual machine I/O per second, while lowering overhead cost by 50% or more.” The company plans to deliver production products in the fourth quarter.
EMC, famous for its large enterprise storage arrays, is also moving into server-side caching. Barry Burke, chief strategy officer for EMC Symmetrix, said EMC’s Lightning project “will integrate with the automated tiering capabilities already delivered to VMAX and VNX customers.” EMC previewed the project at the recent EMC World conference and plans to ship it later this year.
One common driver for the adoption of high-performance storage arrays is the expanding use of server virtualization. Hypervisors allow multiple virtual machines (VMs) to share a single hardware platform, which can have serious side effects when it comes to storage I/O. Rather than a slow and predictable stream of mostly sequential data, a busy virtual server environment is a fire hose torrent of random reads and writes.
This “I/O blender” challenges the basic assumptions used to develop storage system controllers and caching strategies, and vendors are rapidly adapting to the new rules. The deployment of SSD and flash caches helps, but virtual servers are demanding in other ways as well. Virtual environments require extreme flexibility, with rapid storage provisioning and dynamic movement of workloads from machine to machine. Vendors like VMware Inc. are quickly rolling out technologies to integrate hypervisor and server management, including VMware’s popular vStorage API for Array Integration (VAAI).
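A small simulation makes the “I/O blender” effect concrete. In this hedged sketch, each virtual machine reads its own region of disk in perfect sequence, and the host interleaves the requests round-robin. The round-robin ordering, the block addresses and the function names are illustrative assumptions; a real hypervisor schedules I/O in more complicated ways.

```python
from itertools import chain, zip_longest


def vm_stream(start_block, length):
    """One VM reading its virtual disk sequentially."""
    return list(range(start_block, start_block + length))


def blend(streams):
    """Interleave per-VM streams round-robin, as a shared host might."""
    mixed = chain.from_iterable(zip_longest(*streams))
    return [block for block in mixed if block is not None]


def sequential_fraction(requests):
    """Share of requests that directly follow the previous block address."""
    if len(requests) < 2:
        return 1.0
    hits = sum(1 for prev, cur in zip(requests, requests[1:]) if cur == prev + 1)
    return hits / (len(requests) - 1)


streams = [vm_stream(0, 4), vm_stream(1000, 4), vm_stream(2000, 4)]
blended = blend(streams)
print(sequential_fraction(streams[0]))  # each VM on its own: 1.0
print(sequential_fraction(blended))     # what the array sees: 0.0
```

Every individual stream is fully sequential, yet none of the blended requests follows its predecessor on disk, which is why read-ahead and similar controller assumptions break down.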
Virtual server environments are an opportunity for innovation and new ideas, and startups are jumping into the fray. One such company, Tintri Inc., has developed a “VM-aware” storage system that combines SATA HDDs, NAND flash and inline data deduplication to meet the performance and flexibility needs of virtual servers. “Traditional storage systems manage LUNs, volumes or tiers, which have no intrinsic meaning for VMs,” said Tintri CEO Kieran Harty. “Tintri VMstore is managed in terms of VMs and virtual disks, and we were built from scratch to meet the demands of a VM environment.”
Tintri’s VM-aware storage target isn’t the only option. IO Turbine Inc. leverages PCIe-based flash cards or SSDs in server hardware with Accelio, its VM-aware storage acceleration software. “Accelio enables more applications to be deployed on virtual machines without the I/O performance limitations of conventional storage,” claims Rich Boberg, IO Turbine’s CEO. The Accelio driver transparently redirects I/O requests to the flash as needed to reduce the load on existing storage arrays.
Not all data storage innovations are focused on performance. The growth of data has been a major challenge in many environments, and deleting data isn’t always an acceptable answer. Startups like Ocarina and Storwize updated existing technologies like compression and single-instance storage (SIS) for modern storage applications. Now that these companies are in the hands of major vendors (Dell Inc. and IBM, respectively), users are beginning to give capacity optimization a serious look.
Reducing storage has ripple effects, requiring less capacity for replication, backup and disaster recovery (DR) as well as primary data storage. “The Ocarina technology is flexible enough to be optimized for the platforms we’re embedding the technology into,” said Mike Davis, marketing manager for Dell’s file system and optimization technologies. “This is an end-to-end strategy, so we’re looking closely at how we can extend these benefits beyond the storage platforms to the cloud as well as the server tier.”
Data deduplication is also moving to the primary storage space. Once used only for backup and archiving applications, deduplication is now being applied in arrays and appliances by NetApp, Nexenta Systems Inc., Nimbus Data Systems Inc., Permabit Technology Corp. and others. “NetApp’s deduplication technology [formerly known as A-SIS] is optimized for both primary [performance and availability] as well as secondary [capacity-optimized backup, archive and DR] storage requirements,” said Val Bercovici, NetApp’s cloud czar. NetApp integrated deduplication into its storage software and claims no latency overhead on I/O traffic.
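For readers unfamiliar with the mechanics, the basic idea behind block-level deduplication can be sketched as follows. This is a generic illustration, not NetApp's or any other vendor's implementation: the fixed block size, the SHA-256 fingerprint and the DedupStore class are all assumptions chosen for brevity. Each unique block is stored once, and files are kept as lists of fingerprints that point at the shared blocks.

```python
import hashlib


class DedupStore:
    """Toy block store that keeps one copy of each unique block."""

    def __init__(self, block_size=4096):
        self.block_size = block_size
        self.blocks = {}   # fingerprint -> block bytes (stored once)
        self.files = {}    # file name -> ordered list of fingerprints

    def write(self, name, data):
        recipe = []
        for offset in range(0, len(data), self.block_size):
            block = data[offset:offset + self.block_size]
            fingerprint = hashlib.sha256(block).hexdigest()
            self.blocks.setdefault(fingerprint, block)  # keep first copy only
            recipe.append(fingerprint)
        self.files[name] = recipe

    def read(self, name):
        return b"".join(self.blocks[fp] for fp in self.files[name])

    def stored_bytes(self):
        return sum(len(block) for block in self.blocks.values())


store = DedupStore(block_size=4)
store.write("a", b"AAAABBBBAAAA")
store.write("b", b"BBBBCCCC")
print(store.stored_bytes())  # 12 bytes kept for 20 bytes written
```

The fingerprint calculation on every write is where the latency concern for primary storage comes from, and it is the overhead vendors have to hide or avoid.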
Automated tiered storage
One hot area of innovation for the largest enterprise storage vendors is the transformation of their arrays from fixed RAID systems to granular, automatically tiered storage devices. Smaller companies like 3PAR and Compellent (now part of Hewlett-Packard Co. and Dell, respectively) kicked off this trend, but EMC, Hitachi Data Systems and IBM are delivering this technology as well.
A new crop of startups, including Nexenta, are also active in this area. “NexentaStor leverages SSDs for hybrid storage pools, which automatically tier frequently accessed blocks to the SSDs,” noted Evan Powell, Nexenta’s CEO. Powell also said that his firm’s software platform allows users to supply their own SSDs, which he claims reduces the cost of entry for this technology.
EMC has added virtual provisioning and automated tiering across its product line. “EMC took a new storage technology [flash] and used it to deliver both greater performance as well as cost savings,” said Chuck Hollis, EMC’s global marketing chief technology officer. “Best of all, it’s far simpler to set up and manage.”
Like caching, automated tiered storage improves data storage system performance as much as it attacks the cost of capacity. By moving “hot” data to faster storage devices (10K or 15K rpm disks or SSD), tiered storage systems can perform faster than similar devices without the expense of widely deploying these faster devices. Conversely, automated tiering can be more energy- and space-efficient because it moves “bulk” data to slower but larger-capacity drives.
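The placement decision at the heart of automated tiering can be illustrated with a short sketch. It assumes a simple policy that ranks blocks by how often they were accessed in a recent window and fills the fast tier with the busiest ones; the function name plan_tiers and the access-count heuristic are illustrative assumptions, and shipping products use their own, more elaborate policies.

```python
from collections import Counter


def plan_tiers(access_log, fast_capacity):
    """Return (fast, slow) sets of block IDs based on access counts."""
    counts = Counter(access_log)
    # Rank by descending access count, then by block ID so ties are predictable.
    ranked = sorted(counts, key=lambda block: (-counts[block], block))
    fast = set(ranked[:fast_capacity])   # "hot" blocks go to SSD or 15K rpm disk
    slow = set(ranked[fast_capacity:])   # "bulk" blocks stay on high-capacity drives
    return fast, slow


log = [7, 7, 7, 3, 3, 9, 1, 7, 3]
fast, slow = plan_tiers(log, fast_capacity=2)
print(sorted(fast), sorted(slow))  # [3, 7] [1, 9]
```

Unlike a cache, the fast tier here is the permanent home of the blocks assigned to it, so a real system also has to migrate the data when the plan changes.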
Innovation in storage
Enterprise storage vendors must maintain compatibility, stability and performance while advancing the state of the art in technology, goals that may sometimes seem at odds. Although smaller companies have been a little more nimble at introducing innovations like capacity optimization and virtualization-aware storage access, the large vendors are also moving quickly. They’ve put into service solid-state caching and automated tiered storage, and are moving forward in other areas. Whether through invention or acquisition, innovation is alive and well in enterprise storage.
BIO: Stephen Foskett is an independent consultant and author specializing in enterprise storage and cloud computing. He is responsible for Gestalt IT, a community of independent IT thought leaders, and organizes their Tech Field Day events. He can be found online at GestaltIT.com, FoskettS.net and on Twitter at @SFoskett.