Rich Castagna, Todd Erickson, Ed Hannan, Sonia Lelii, Dave Raffo, Carol Sliwa and Sarah Wilson
Published: 05 Dec 2013
Six data storage technologies -- next-gen solid-state, primary storage dedupe, hyper-converged storage, backup appliances, OpenStack and cloud-integrated storage -- will impact your shop.
If you've read one of our technology prognostications before, you know the drill: We don't pick pie-in-the-sky projects as our hot data storage technologies. Rather, we focus on the new, and newish, storage techs that we think are poised to have an impact on your shop in the coming year.
That said, some of our predictions are about storage technologies that have only recently emerged from R&D labs, but they bear so much promise that we think they will weigh in immediately. That's the nature of the storage market today: Technologies that used to take years to evolve and gain a following are topping the charts in short order these days. Case in point: solid-state storage's meteoric rise.
In fact, the ever-developing flash storage is featured in this year's predictions, with two solid-state techs -- Non-Volatile Memory Express (NVMe) and 3-D flash -- about to spring into prominence. Rounding out our 2014 predictions are the arrival (finally!) of dedupe to primary storage, hyper-converged storage-plus-everything-else systems, plug-and-play backup appliances, the rise of OpenStack storage, and hybrid technologies that blur the line between cloud and on-premises storage.
Next-generation solid-state storage
Solid-state storage has taken the industry by storm. Startup companies and legacy system vendors alike offer hybrid solid-state/disk-based systems, all-solid-state drive (SSD) arrays and server-based flash. But despite its promising start, obstacles to continued solid-state development are popping up, including a lack of industry standards for network interoperability and the physical limitations of current NAND flash technology.
The NVMe Work Group, an industry consortium of more than 80 technology companies, is developing an industry-standard PCI Express (PCIe) host controller interface to optimize how PCIe flash devices interact in storage systems.
"It standardizes how PCI flash adapters … the cards you stick in stuff … how they communicate with the CPU, the applications and the operating system," explained Brian Garrett, vice president of ESG Labs in Milford, Mass.
Without the NVMe standard, each vendor's PCIe adapter requires its own driver, so PCIe flash card maintenance and configuration are a major hassle. The NVMe specification standardizes the host controller interface and provides support for multicore architectures, end-to-end data protection, encryption and security.
The NVMe 1.0 specification was announced in March 2011 and the 1.1 spec was released in November 2012. But according to Garrett, NVMe adoption is following a typical industry pattern because of standard development and OEM product lifecycles. "Once the spec is finalized, and the devices become available, we need to wait for the systems' OEMs to pick them up, qualify them, drop them into solutions and get them to work," he said.
The University of New Hampshire InterOperability Laboratory in Durham has posted a list of devices and platforms that have successfully completed interoperability tests. These include NVMe flash controllers from PMC-Sierra Inc., the Samsung XS1715 NVMe PCIe SSD and a Western Digital Technologies Inc. PCIe NVMe SSD. Expect to see many more NVMe-compatible devices available in 2014.
Chip and storage device vendors are developing 3-D flash memory vertical stacking technology to overcome the impending physical limitations and disadvantages of reducing flash's planar die size. The smaller the die, the less performant and reliable the flash memory is due to cell-to-cell interference.
Samsung Electronics Co. Ltd., which recently announced that it has begun mass production of its 3-D Vertical NAND (V-NAND) flash memory, says the idea behind 3-D stacking is that by placing cell layers on top of each other, write performance can be doubled and reliability increased by 10 times compared to current processes.
"If you look at the broader category of storage-class memory, 3-D NAND is the next evolutionary step along that path," Garrett commented. "It's going to happen."
Devices utilizing 3-D flash stacking will not be available as soon as NVMe-compatible devices because the technology isn't as advanced as NVMe. Garrett expects 3-D stacking devices to appear in 2014, with a bigger push likely in 2015. But he said that if flash manufacturers hit the physical "density wall" sooner than expected, 3-D stacking development will ramp up more quickly.
Another factor that may hasten 3-D stacking progress is that consumer devices will also benefit from the technology, so enterprise-device manufacturers won't be the only groups putting resources into its development.
Primary storage data deduplication
Ever since data deduplication became a staple for backup products, storage managers have wondered when the technology could be applied to primary storage.
But adding dedupe to existing primary storage systems proved far more difficult than it was on the backup side.
A decade or so later, primary dedupe is ready. Yes, we did prematurely declare it a hot data storage technology in 2011, but the stars are now aligned correctly for 2014. Several irreversible storage trends indicate primary dedupe is about to become a common feature.
The emergence of flash storage is one of those trends. Dedupe helps extend the usable capacity for expensive solid-state drives, while the speed of SSD makes inline dedupe work well enough to be viable in a production environment. The cloud is another dedupe driver because data has to be shrunk if it's to be moved efficiently over the network to public clouds. Virtualization also plays a role in pushing dedupe because virtual machines (VMs) tend to have a high level of redundancy, which dedupe can eliminate.
In 2013, we saw the big vendors offer primary deduplication. Dell finally got technology it acquired from Ocarina Networks in 2010 to work with its Compellent storage arrays. Hitachi Data Systems (HDS) added dedupe code from an OEM deal with Permabit to its Hitachi NAS (HNAS), and EMC added fixed-block post-process primary deduplication to its overhauled VNX unified storage platform.
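To make the terminology concrete, here is a minimal Python sketch of fixed-block deduplication, the general approach the VNX feature is described as using. It is an illustration only, not any vendor's implementation: the 4 KB block size, the SHA-256 fingerprint, the function names and the made-up VM workload are all assumptions chosen for the example.

```python
import hashlib

BLOCK_SIZE = 4096  # assumed block size for illustration; real arrays vary


def dedupe_fixed_block(data: bytes, block_size: int = BLOCK_SIZE):
    """Split data into fixed-size blocks and store each unique block once.

    Returns (store, recipe): store maps fingerprint -> block bytes,
    recipe is the ordered list of fingerprints needed to rebuild the data.
    """
    store = {}
    recipe = []
    for offset in range(0, len(data), block_size):
        block = data[offset:offset + block_size]
        fingerprint = hashlib.sha256(block).hexdigest()
        store.setdefault(fingerprint, block)  # keep only the first copy
        recipe.append(fingerprint)
    return store, recipe


def rebuild(store, recipe) -> bytes:
    """Reassemble the original data from the ordered fingerprints."""
    return b"".join(store[fp] for fp in recipe)


# Hypothetical workload: ten VM images that share the same 8 blocks of
# operating system data and each carry one block of unique data.
os_blocks = b"".join(bytes([i]) * BLOCK_SIZE for i in range(8))
images = b"".join(os_blocks + bytes([100 + n]) * BLOCK_SIZE for n in range(10))

store, recipe = dedupe_fixed_block(images)
logical_blocks = len(recipe)  # blocks the hosts believe they wrote
unique_blocks = len(store)    # blocks that actually need to be kept
```

In this invented workload, the shared operating system blocks are stored once no matter how many images reference them, which is the redundancy that makes virtualized environments a natural fit for dedupe.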
Now users are looking for primary dedupe when they buy storage.
Jeremy DeHart, IT manager at law firm Hedrick Gardner Kincheloe & Garofalo LLP, said built-in deduplication was an important part of his decision-making when he bought Tegile Zebi hybrid flash arrays. DeHart's firm replicates data between two identical arrays from its headquarters in Charlotte, N.C., and an office in another part of the state. Having built-in dedupe extends his usable capacity and makes the replication process faster.
"Deduplication and replication were huge for us, along with the flash in the system," DeHart said. "Dedupe was something I had to have because everything is virtualized for us. It also gives you the ability to replicate that data a lot quicker."
EMC customer Ed Ricks, chief information officer at Beaufort Memorial Hospital in South Carolina, said the combination of dedupe and flash made the new VNX arrays more interesting for him.
"I'm intrigued by it," he said. "Not only do you get flash, but they also put dedupe in it. You can buy a 7 TB entry-level array model and you might get 25 TB to 35 TB usable out of it, plus take advantage of flash speed."
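The data reduction implied by those figures is simple arithmetic, sketched below in Python. The numbers are Ricks's estimates from the quote above, not measured results, and the `implied_reduction_ratio` helper is a hypothetical name used only for this example.

```python
def implied_reduction_ratio(effective_tb: float, raw_tb: float) -> float:
    """Effective capacity divided by raw capacity."""
    return effective_tb / raw_tb


raw_tb = 7  # entry-level array size quoted above
low = implied_reduction_ratio(25, raw_tb)
high = implied_reduction_ratio(35, raw_tb)

print(f"Implied reduction: {low:.1f}:1 to {high:.1f}:1")
# prints "Implied reduction: 3.6:1 to 5.0:1"
```

Actual ratios depend heavily on how redundant the data is, so figures like these are best treated as a planning range rather than a guarantee.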
What makes dedupe more likely to catch on now is that it's usually free -- built into flash storage, cloud gateways and operating systems of the top-selling arrays.
"Data deduplication is now mainstream and should be treated that way," said Arun Taneja, consulting analyst at Hopkinton, Mass.-based Taneja Group. "Why go back to 20th century technologies now?"
Hyper-converged storage
It's no secret that storage complexity and management are two of the biggest challenges confronting administrators of virtual server environments. As user frustration with traditional storage systems grows, mounting interest in storage built for virtual machines has spurred a breed of hyper-converged storage systems -- all-in-one products evolving from converged systems that include storage, networking and compute, but also pack in a hypervisor.
Currently, hyper-converged systems are only offered by a few startups. Nutanix was the first with its Complete Cluster (now called the Virtual Computing Platform), SimpliVity debuted its OmniCube last year and Scale Computing launched its HC3 shortly after. But VMware is getting into the game with its Virtual SAN (vSAN) entry, which is in beta but will likely serve to heighten interest in and attention to hyper-converged storage.
Because virtual environments typically require servers and storage to be managed separately, infrastructures can become complicated and finding the root cause of performance-crippling bottlenecks can be frustrating. "When the administrator actually has to go and figure out what's happening, it's very difficult," explained Jeff Byrne, a senior analyst at Taneja Group. "You're trying to map this virtualization construct on top of a traditional storage construct, and the two just don't blend very well."
Hyper-converged systems deal with the issue in a couple of ways: Provisioning storage can be done directly through a management portal, so the need to map LUNs and volumes is eliminated. And by bringing all the components of the environment together, management is done through a single pane of glass and pinpointing problems is streamlined.
Simplified infrastructure and management are the driving forces behind the popularity of hyper-converged options with small to medium-sized businesses. The SimpliVity OmniCube, for example, comes complete with servers, software, hard drives and SSDs, and increasing capacity is as simple as inserting an additional unit. And the VMware vSAN allows customers to create pools from existing hard drives and SSDs while incorporating management capabilities into the hypervisor.
One shortcoming of hyper-converged products is the lack of variation in hypervisor support. With VMware as the most widely adopted hypervisor, it makes sense that it would be the first virtualization platform that hyper-converged vendors would turn to.
Nutanix, Scale Computing and SimpliVity all support VMware, while Nutanix and Scale Computing also support the open source KVM platform. But according to industry analysts, support for additional hypervisors needs to be added for hyper-converged systems to thrive.
"Because more companies run multiple hypervisors than not, I think [supporting more than VMware] is going to be critical to advancements," said Terri McClure, a senior analyst at ESG.
All three vendors have expressed interest in adding support for additional hypervisors in upcoming versions of their products, but no definite plans have been confirmed.
Backup appliances
Interest in all-in-one backup appliances has grown in recent years, and the product category is poised to become a significant part of the data protection market.
All-in-one backup appliances, which combine hardware, software, media server and target with "drop and go implementation," offer two key advantages: Initial implementation is much easier than with products that require a separate backup app, and you get ongoing, single-vendor support for both backup software and hardware.
Backup app vendor Symantec has had demonstrable success with its Backup Exec- and NetBackup-based appliances. Other vendors, including Asigra, StorServer and Unitrends, now also offer pre-integrated, turnkey backup solutions.
"For some organizations that are growing in size and need more capabilities for backup or data protection than what they had in the past, an appliance makes sense," said Greg Schulz, founder of Stillwater, Minn.-based analyst firm StorageIO. "And, for other organizations, instead of essentially assembling the hardware, software and networking pieces, creating their own backup server and appliances, there is the opportunity to do more with the same resources [people, time, budgets] by leveraging an appliance."
Rachel Dines, a senior analyst at Forrester Research in Cambridge, Mass., sees backup appliances playing an essential role in coping with nonstop capacity growth. "With the volume of data we're seeing right now in backup, secondary and tertiary storage is one of the fastest-growing areas of storage right now, even faster than file storage, according to our data," Dines said. "Backup and archive is growing even faster than file storage. With the explosion of data plus the need for very quick recoveries, organizations are looking for something quick and easy to deploy and straightforward to manage."
In the coming year we see the backup appliance market continuing to grow -- and branching into virtual realms -- for these three key reasons:
- Remote office/branch office is a fast-growing market. While "companies large and small, in industries from government to financial services to manufacturing to retail" are using backup appliances, Dines said, the growing market appears to be remote offices/branch offices.
- Software-defined data center will impact backup appliances. One potential hindrance to continued growth of the backup appliance trend in the coming year is "the advancement of scale and reliability of software-only solutions," Dines said. "The biggest products on the horizon may very well be along the lines of the software-defined data center concept. In 2013, we saw some announcements of disk libraries offered as virtual appliances from HP StoreOnce and Quantum. In 2014, we may see more virtual appliance offerings from hardware vendors."
- VM integration should play a role. StorageIO's Schulz agreed and said that "virtual machine integration, along with additional application support, should be a given roadmap, either adding new and more apps, or extending current capabilities including rapid restore of a virtual machine from a backup, snapshot or where it's protected."
OpenStack storage
Open source OpenStack storage continues to attract attention and is gaining adoption as more commercial vendors back it, more supported distributions become available and more case studies surface as proof points.
OpenStack supports object storage and block storage as part of its open source cloud operating system that also aims to control pools of compute and networking resources. Rackspace Hosting originally developed the OpenStack technology and co-founded, with NASA, the community that maintains the open source software.
Vendors contributing to the OpenStack Object Storage project, code-named Swift, include Hewlett-Packard (HP), IBM, Rackspace, Red Hat and SwiftStack. HP, IBM and Red Hat also work on the OpenStack Block Storage project, code-named Cinder, as do other vendors, such as Intel, Mirantis, SolidFire and SUSE.
OpenStack Block Storage provides software to provision and manage persistent block-based storage and deliver it as an on-demand service. OpenStack Object Storage facilitates the storage of petabytes of static data on commodity servers and ensures data replication across the server cluster. It is best suited to backups, archives and content repositories.
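The idea of replicating each object across a cluster of commodity servers can be illustrated with a toy Python sketch. This is not Swift's actual placement logic or API: the `place_replicas` function, the node names, the hash-and-walk scheme and the choice of three copies are all assumptions made for the example.

```python
import hashlib


def place_replicas(object_name: str, nodes: list, replicas: int = 3) -> list:
    """Pick `replicas` distinct nodes for an object, deterministically.

    Hashes the object name to choose a starting node, then walks the node
    list so that every copy lands on a different server.
    """
    if replicas > len(nodes):
        raise ValueError("need at least as many nodes as replicas")
    digest = hashlib.sha256(object_name.encode("utf-8")).digest()
    start = int.from_bytes(digest[:8], "big") % len(nodes)
    return [nodes[(start + i) % len(nodes)] for i in range(replicas)]


# Hypothetical five-server cluster and object name.
nodes = ["node-a", "node-b", "node-c", "node-d", "node-e"]
placement = place_replicas("backups/2013-12-05.tar", nodes)
```

Because the placement is derived from the object name, any server can work out where the copies live without consulting a central index, and losing one node still leaves the other copies readable.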
IT shops hesitant to use unsupported open source software can opt for commercial variants, available from Canonical, Cloudscaling, HP, Piston Cloud Computing, Rackspace, Red Hat, StackOps, SUSE and SwiftStack.
"OpenStack Swift is not a ready-to-deploy system that you can just download and install and then you're up and running," said Ashish Nadkarni, a storage systems research director at International Data Corp. in Framingham, Mass. "It's still very new and requires a fair bit of customization, programming and tweaking. Some people have the resources to do it in-house, and the rest go with the commercial variants."
With OpenStack Block Storage, the physical hard disks or SSDs can be located within or directly attached to Cinder server nodes, or they can be part of external storage systems that third-party vendors have integrated. Available plug-ins include open source Ceph RBD and Red Hat's GlusterFS, and select systems from Coraid, EMC, HP, Huawei, IBM, Mellanox, Microsoft (Windows Server 2012), NetApp, Nexenta, Scality, SolidFire and Zadara.
Nadkarni said OpenStack Block Storage can be viewed as next-generation, hardware-agnostic storage virtualization, providing an abstraction layer to pool storage resources and permit the integration of third-party arrays.
"The core principle of OpenStack is to use commodity-based storage to create a full-service platform," he said. "If you start using commercial platforms with OpenStack, what are you getting? You're not getting much."
The San Diego Supercomputer Center (SDSC) at the University of California, San Diego, is investigating Cinder and Ceph to provide persistent block storage for its OpenStack compute resources. Matthew Kullberg, SDSC's technical project and services manager, said the open source options could provide greater flexibility and expansion capabilities, in support of applications such as databases, than SDSC's current block storage does.
SDSC has used OpenStack Swift since 2011 for a private cloud storage service that replaced its tape-based data archives. Object storage options were sparse at the time, and SDSC chose OpenStack to hold down costs, eliminate vendor lock-in and tap into the large, accessible development community. Kullberg said the team would make the same decision today.
"OpenStack has proven to be a great resource for researchers at SDSC and the university community," he said.
Cloud-integrated storage
It might be tempting to dismiss cloud-integrated storage (CIS) as just another marketing term, but if you did, you'd be overlooking the best fit for cloud storage in enterprise environments. The reality is that the term has become part of the cloud lexicon, describing cloud storage that's used in a hybrid mode, in a tiering fashion or in any other way that expands on-premises capacity as seamlessly as possible.
The key technologies behind CIS are gaining acceptance. Gateways have evolved into cloud controllers that play a pivotal role in extending storage capacity beyond data centers. Hybrid appliances have become standard tools in many data centers. Interest in software-defined storage appliances and object storage is also growing.
"Hybrid storage is an escalating use case," said James Bagley, senior analyst at Austin, Texas-based Storage Strategies Now. "We talk about where the company has its own infrastructure and uses cloud via a policy for doing archiving and disaster recovery. Not a lot of companies are using a cloud tier as frontline storage unless they are operating entirely in the cloud. In those cases, all their applications are operating in the cloud."
Nicos Vekiarides, CEO at TwinStrata, said CIS is synonymous with hybrid storage, where cloud storage is combined with local storage in cache. It's also used for separate storage tiers, with some local and some in the cloud.
"The cloud can be a second data center without all the capital costs," Vekiarides said. "It's a cost-effective way for off-site data protection and data recovery."
The most common use cases are for capacity expansion or cloud-based disaster recovery, Vekiarides said.
"We work with companies that produce data growing at 40% to 50% per year," he added. "This is difficult to put that all in local storage. By storing it in [a controller] you get a local copy and it's protected with a snapshot in the cloud."
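The pattern Vekiarides describes, a local copy in cache with the full data set protected in the cloud, can be sketched in a few lines of Python. This is a simplified illustration, not TwinStrata's design: the `HybridVolume` class is hypothetical, a plain dictionary stands in for the cloud object store, and the write-through and least-recently-used policies are assumptions.

```python
from collections import OrderedDict


class HybridVolume:
    """Toy cloud-integrated volume: every block lives in the 'cloud',
    and only the most recently used blocks are kept in a local cache."""

    def __init__(self, cache_blocks: int):
        self.cache_blocks = cache_blocks
        self.local_cache = OrderedDict()  # oldest entry first
        self.cloud = {}                   # stand-in for an object store
        self.cloud_reads = 0

    def write(self, block_id, data):
        self.cloud[block_id] = data       # write-through (assumed policy)
        self._cache(block_id, data)

    def read(self, block_id):
        if block_id in self.local_cache:  # cache hit: served locally
            self.local_cache.move_to_end(block_id)
            return self.local_cache[block_id]
        self.cloud_reads += 1             # cache miss: fetch from the cloud
        data = self.cloud[block_id]
        self._cache(block_id, data)
        return data

    def _cache(self, block_id, data):
        self.local_cache[block_id] = data
        self.local_cache.move_to_end(block_id)
        while len(self.local_cache) > self.cache_blocks:
            self.local_cache.popitem(last=False)  # evict least recently used


volume = HybridVolume(cache_blocks=2)
for name in ("a", "b", "c"):
    volume.write(name, name.encode() * 4)

first = volume.read("a")   # evicted earlier, so this goes to the cloud
second = volume.read("c")  # still cached, so this is served locally
```

Local capacity stays fixed at the cache size while the cloud copy grows with the data, which is the appeal for shops whose data is expanding faster than they can add on-premises storage.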
Storage Strategies Now's Bagley said you don't necessarily need an appliance as an onramp for CIS but you do need more applications that work with clouds such as Amazon S3 or Microsoft Azure. There are also vendors like StorageCraft, which targets smaller companies with its ShadowProtect Cloud Services product that backs up data to its own cloud.
"Short of having apps specific to backup and archiving, that's where an appliance is handy," Bagley said.