Published: 11 Feb 2019
Once again, we put on our prognostication glasses to conjure up what we believe will be the hottest data storage technology trends over the next 12 months. Welcome to Hot Techs 2019!
Our Hot Techs focus is on newer storage technologies already available for purchase and deployment. So, for the 16th year running, to be considered one of the hot trends, a technology must have already proven its usefulness and practicality. As with the previous 15 years, nothing impractical or too futuristic makes our hot list. These are technologies storage pros can use here and now.
Sit back, relax and feel free to agree or disagree as we, the editors and writers at Storage magazine and SearchStorage, lay out the latest trends. This year's list includes cheaper and denser flash, NVMe-oF, AI and machine learning, multi-cloud data management and composable infrastructure. Read on and find out why these made the grade for hot data storage technology trends 2019.
Denser and cheaper flash
During 2019, a variety of data storage technology trends should work in the consumer's favor to make enterprise flash storage drives cheaper and denser than ever before.
For starters, the NAND flash shortage that has curbed price declines during the past two years has reversed direction into an oversupply situation. As NAND prices plummet, SSD costs could fall in tandem, since flash chips represent the bulk of the cost of an SSD.
Greg Wong, founder and analyst at Forward Insights, said buyers should expect to see price declines of 30% or more for both the NAND chips and SSDs next year.
Another possibility is manufacturers could sell higher-capacity SSDs for the same price they previously charged for lower-capacity drives, said Jim Handy, general director and semiconductor analyst at Objective Analysis. Handy said prices in early 2019 could be half of what they were at the start of 2018. Estimates vary among market research firms, however.
Trendfocus forecast the price per gigabyte of NAND will drop from 19.5 cents in 2018 to 13.7 cents next year, 11.2 cents in 2020, 8.8 cents in 2021 and 7 cents in 2022, according to John Kim, a vice president at the market research firm based in Cupertino, Calif.
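To put the Trendfocus figures in perspective, the short sketch below converts the quoted cents-per-gigabyte numbers into year-over-year percentage declines. Only the five price points come from the forecast; the function names are illustrative.

```python
# Trendfocus NAND price-per-gigabyte forecast, in US cents, as quoted above.
forecast_cents_per_gb = {2018: 19.5, 2019: 13.7, 2020: 11.2, 2021: 8.8, 2022: 7.0}


def yearly_declines(prices):
    """Return {year: percent drop versus the prior year}, rounded to 0.1."""
    years = sorted(prices)
    return {
        year: round((prices[prev] - prices[year]) / prices[prev] * 100, 1)
        for prev, year in zip(years, years[1:])
    }


def total_decline(prices):
    """Percent drop from the first forecast year to the last."""
    years = sorted(prices)
    first, last = prices[years[0]], prices[years[-1]]
    return round((first - last) / first * 100, 1)


declines = yearly_declines(forecast_cents_per_gb)
for year, pct in declines.items():
    print(f"{year}: down {pct}% from {year - 1}")
print(f"2018 to 2022: down {total_decline(forecast_cents_per_gb)}% overall")
```

By this arithmetic, the forecast implies a drop of just under 30% in 2019, in the same range as Wong's estimate, followed by declines of roughly 18% to 21% a year through 2022.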
"Lower NAND prices equate to lower SSD prices, and on the enterprise side where capacities are so high, it makes a significant difference," Kim said. "Hard disk drives have a clear advantage in that dollar-per-gigabyte metric. But SSD companies are closing the gap on that now."
The shift to denser 3D NAND technology is also helping to drive down the cost of flash. Manufacturers shifted from 32-layer to 64-layer 3D flash in 2018, and they will slowly start ramping to 96-layer in 2019 and beyond. Additional layers enable higher storage capacity in the same physical footprint as a standard SSD using older two-dimensional technology.
Wong predicted 64-layer will continue to dominate in 2019, as NAND manufacturers slow their migrations to 96-layer technology due to the oversupply. Some manufacturers already started promoting their 96-layer technology in 2018, though.
Yet another density-boosting trend on the NAND horizon is quadruple-level cell (QLC) flash chips that can store 4 bits of data per memory cell. Triple-level cell flash (TLC) 3D NAND will continue to factor into the vast majority of enterprise SSDs next year. But Wong predicted QLC will creep from less than 1% of the enterprise SSD market in 2018 to 4.1% in 2019. This will be driven by hyperscale users.
Gartner has predicted that about 20% of the total NAND flash bit output could be QLC technology by 2021, and the percentage could hit 60% by 2025.
QLC flash doesn't offer the write endurance often needed for high-transaction databases and other high-performance applications. But QLC 3D NAND offers a lower-cost flash alternative than TLC and could challenge cheaper HDDs used for read-intensive workloads in hyperscale environments.
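The density-versus-endurance trade-off comes down to simple arithmetic: every extra bit per cell doubles the number of charge states the cell has to hold apart. The sketch below works that out. SLC and MLC (1 and 2 bits per cell) are included for comparison and are not discussed above.

```python
# Bits stored per memory cell for common NAND cell types.
# SLC and MLC are added here for comparison only.
CELL_TYPES = {"SLC": 1, "MLC": 2, "TLC": 3, "QLC": 4}


def charge_levels(bits_per_cell):
    """Number of distinct charge states needed to encode that many bits."""
    return 2 ** bits_per_cell


for name, bits in CELL_TYPES.items():
    print(f"{name}: {bits} bit(s) per cell, {charge_levels(bits)} charge levels")

# Capacity gain from moving the same number of cells from TLC to QLC.
qlc_gain_over_tlc = (CELL_TYPES["QLC"] - CELL_TYPES["TLC"]) / CELL_TYPES["TLC"] * 100
print(f"QLC stores about {qlc_gain_over_tlc:.0f}% more per cell than TLC")
```

QLC gains about a third more capacity per cell than TLC, but has to distinguish 16 charge levels instead of eight. That narrower margin is one reason its write endurance is lower.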
Don Jeanette, a vice president focusing on NAND-SSDs at Trendfocus, said hyperscale deployments will also continue to fuel the enterprise PCIe SSD market and pressure the SATA- and SAS-based SSD market. Analysts expect PCIe SSDs using latency-lowering NVMe will be one of the major data storage technology trends in 2019.
AI and machine learning storage analytics
AI and machine learning have traditionally been used for ransomware detection in storage and backup, but the tech has also been used to generate intelligent, actionable recommendations. Analytics driven by AI and machine learning can help with operations such as predicting future storage usage; flagging inactive or infrequently used data for lower storage tiers; and identifying potential compliance risks, like personally identifiable information. This year, we predict AI- and machine learning-powered products designed to manage and analyze vast amounts of data will become more commonplace, especially as data and IT complexity continue to grow rapidly.
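As a rough illustration of two of those operations, the sketch below projects when a storage pool will fill up and flags files that have sat idle. It is deliberately simple: a straight-line fit and a fixed idle threshold stand in for whatever models commercial products actually use, and every number, path and function name here is hypothetical.

```python
from datetime import date


def fit_line(ys):
    """Ordinary least-squares line through the points (0, ys[0]), (1, ys[1]), ..."""
    n = len(ys)
    mean_x = (n - 1) / 2
    mean_y = sum(ys) / n
    num = sum((x - mean_x) * (y - mean_y) for x, y in enumerate(ys))
    den = sum((x - mean_x) ** 2 for x in range(n))
    slope = num / den
    return slope, mean_y - slope * mean_x


def months_until_full(used_tb_by_month, capacity_tb):
    """Months from the latest sample until the fitted trend reaches capacity."""
    slope, intercept = fit_line(used_tb_by_month)
    if slope <= 0:
        return None  # usage is flat or shrinking, so no fill date is projected
    months_from_first_sample = (capacity_tb - intercept) / slope
    return max(0.0, months_from_first_sample - (len(used_tb_by_month) - 1))


def cold_files(last_access, today, idle_days=180):
    """Paths not accessed for at least idle_days: candidates for a lower tier."""
    return sorted(p for p, d in last_access.items() if (today - d).days >= idle_days)


# Hypothetical monthly usage samples (TB) for a 60 TB pool.
usage = [40.0, 42.0, 44.0, 46.0, 48.0]
print("Months until full:", months_until_full(usage, 60.0))

# Hypothetical last-access dates.
accesses = {"/proj/a.dat": date(2018, 1, 15), "/proj/b.dat": date(2018, 12, 20)}
print("Tiering candidates:", cold_files(accesses, today=date(2019, 1, 1)))
```

A real analytics product would learn seasonality, growth spurts and access patterns rather than rely on a single trend line and cutoff, but the inputs and outputs are of the same kind.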
In 2018, vendors launched products using AI and machine learning to guide IT decision-making. Imanis Data Management Platform 4.0's SmartPolicies can generate optimized backup schedules based on the desired recovery point objective set by the user. Commvault planned to update its interface with embedded predictive analyses regarding storage consumption by the end of 2018. Igneous Systems announced DataDiscover and DataFlow, which index and categorize unstructured data and move it around intelligently. Array vendors including Pure Storage, NetApp and Dell EMC Isilon support Nvidia graphics cards to accelerate analytics for deep learning.
Edwin Yuen, an analyst at Enterprise Strategy Group, said now that AI and machine learning have matured and permeated into other areas of IT, people are starting to warm up to the idea of using them to tackle a fundamental storage challenge: the explosive growth and complexity of big data.
"In order to match up with the growing complexity of IT, it's not about adding a little more personnel or adding slightly more automation or tools," Yuen said. "People understand now that they need to make a quantum jump up. We're going from a bicycle to a car here."
Human bandwidth is the next bottleneck, according to Yuen. Right now, AI and machine learning storage analytics software generates recommendations for administrators, which they then have to approve. Yuen imagines the next step will take things further, skipping the approval process and removing the need for user input except under very anomalous circumstances.
"The stream of data has come to the point of, 'Can a human being really process that much data? Or even approve the processing of that much data?'" Yuen said.
AI and machine learning storage analytics will see the most use in automating day-one deployments, according to Yuen. This is because the technology is especially useful in optimizing storage, specifically when it comes to tiering. One of the major changes the adoption of AI and machine learning will bring in IT operations is the need to redefine parameters for storage tiers, thereby laying the groundwork for where algorithms deploy the data.
"Figuring out where everything goes based on the usage is going to get more complicated," Yuen said. "We're going to have to make sure storage environments are going to express the parameters that [they have]."
Many organizations think of storage tiers in a binary way, such as inactive data vs. active or higher-performance tiers vs. low. But Yuen believes if software automatically handles tiering intelligently, it opens the door for expanding beyond a two-tier system.
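A minimal sketch of what placement beyond two tiers could look like: given a workload's latency requirement, pick the cheapest tier that still meets it. The five tiers, their latencies and their prices are invented for illustration and do not describe any real product.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Tier:
    name: str
    latency_ms: float         # typical access latency (hypothetical)
    cost_per_gb_month: float  # dollars (hypothetical)


# Five made-up tiers, fastest and most expensive first.
TIERS = [
    Tier("nvme-flash", 0.1, 0.30),
    Tier("sas-flash", 0.5, 0.20),
    Tier("hybrid", 5.0, 0.10),
    Tier("capacity-hdd", 15.0, 0.04),
    Tier("cloud-archive", 5000.0, 0.004),
]


def place(max_latency_ms, tiers=TIERS):
    """Return the cheapest tier whose latency meets the requirement."""
    candidates = [t for t in tiers if t.latency_ms <= max_latency_ms]
    if not candidates:
        raise ValueError(f"no tier meets a {max_latency_ms} ms requirement")
    return min(candidates, key=lambda t: t.cost_per_gb_month)


for need_ms in (0.2, 1.0, 20.0, 10_000.0):
    print(f"needs <= {need_ms} ms -> {place(need_ms).name}")
```

The rule itself is trivial. The difficulty Yuen describes lies in having every platform expose accurate parameters and in re-evaluating placement continuously as usage changes.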
"It would make the most logical sense if storage can go to five or six different platforms, but that's complicated," Yuen said. "But if AI and [machine learning] can do that for you, wouldn't you want to take advantage of all those different options based on what the cost/performance metric is?"
NVMe-oF
Flash arrays with NVMe fabric capabilities are showing up in mainstream enterprises faster than experts predicted, even if broad adoption by data centers is several years away.
By 2021, solid-state arrays that use external NVMe-oF to connect to servers and SANs will generate about 30% of storage revenue, according to analyst firm Gartner. That would represent a significant jump above Gartner's 1% estimate for 2018.
IDC pegged the number even higher, claiming NVMe and NVMe-oF-connected flash arrays will account for roughly half of all external storage revenues by 2021.
NVMe-oF has promising potential to deliver extremely high bandwidth with ultra-low latency, making it one of the hot data storage technology trends of 2019. Plus, the ability to share flash storage systems among multiple server racks would be compelling to organizations with large Oracle Real Application Clusters or other database deployments.
"One of the differences between NVMe and SCSI is it allows you to address storage devices through direct memory access. The SSDs of old were basically PCIe-compatible that plugged into a host server. They didn't need a [host bus adapter] to talk to storage devices," said Eric Burgener, a research vice president at IDC.
NVMe-oF extends direct memory access capabilities over a switched fabric to create shared storage with latency as consistent as PCIe flash in a commodity server, Burgener said.
"NVMe over fabrics allows you to share extremely high-performance and costly NVMe storage across many more servers. Any server connected to the switched fabric has access to the storage. With PCIe SSDs in a server, it's only efficient for that server to access [the storage]," Burgener said.
The NVM Express group initiated NVMe fabric development in 2014, aiming to extend it to Ethernet, Fibre Channel (FC) and InfiniBand, among other technologies. Connectivity between NVMe-oF hosts and target devices remains an ongoing project.
NVMe fabric transports are under development for FC-NVMe and various remote direct memory access protocol technologies, including InfiniBand, RDMA over Converged Ethernet and Internet Wide-Area RDMA Protocol.
In 2018, storage vendors started to catch up to demand, bringing a raft of NVMe products to market. Most new NVMe fabric-based rollouts were from legacy storage vendors -- most notably, Dell EMC, Hewlett Packard Enterprise (HPE) and NetApp -- although startups continue to jockey for position.
End-to-end NVMe arrays are generally defined as rack-scale flash -- meaning systems with an NVMe fabric extending from back-end storage directly to applications. Rack-scale systems are mostly the province of startups. For the most part, legacy vendors swapped out SAS-connected SSDs on their all-flash arrays with NVMe SSDs, which gives some performance improvement, but is largely considered more of a retrofit design.
Why do we think NVMe-oF flash is, or will be, hot? The answer is in how the technology accelerates data transfer between server hosts and target storage devices. Legacy SCSI command stacks never envisioned the rise of SSD. NVMe flash sidesteps delays created by traditional network hops, allowing an application to connect directly to underlying block-based storage via a PCIe interface.
The industry developed the NVMe protocol with flash in mind. It is optimized to efficiently govern flash, leading to a consensus that NVMe will eventually supersede SATA SSDs built around the Advanced Host Controller Interface (AHCI). NVMe draws data closer to the processor, reducing I/O overhead while boosting IOPS and throughput.
"There are two levels of engagement with NVMe. No. 1 is just connecting with it and using it, but that doesn't give a performance increase unless you rearchitect your data path and the way you lay down data on the flash media," said Ron Nash, CEO of Pivot3, a vendor that implemented NVMe flash to its Acuity hyper-converged system. "There is a big difference in how you optimize laying down data on spinning disk versus flash media."
Due to the way NVMe handles queues, it allows multiple tenants to query the same flash device, which gives higher scalability than traditional SAS and SATA SSDs. The NVMe-oF specification marks the next phase of technological development.
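The queueing difference is large. The NVMe specification allows for tens of thousands of I/O queues, each tens of thousands of commands deep, where AHCI offers a single queue of 32 commands. The toy model below gives each tenant its own submission queue and services the queues in round-robin order, so one busy tenant can't starve another. It is a conceptual sketch, not driver code, and the class and method names are invented.

```python
from collections import deque


class ToyNvmeController:
    """Conceptual model: one submission queue per tenant, round-robin service."""

    def __init__(self):
        self.submission_queues = {}  # tenant name -> deque of pending commands

    def create_queue(self, tenant):
        self.submission_queues[tenant] = deque()

    def submit(self, tenant, command):
        self.submission_queues[tenant].append(command)

    def service_round_robin(self):
        """Take one command from each non-empty queue in turn until all are empty."""
        completed = []
        while any(self.submission_queues.values()):
            for tenant, queue in self.submission_queues.items():
                if queue:
                    completed.append((tenant, queue.popleft()))
        return completed


ctrl = ToyNvmeController()
ctrl.create_queue("tenant-a")
ctrl.create_queue("tenant-b")
for i in range(3):
    ctrl.submit("tenant-a", f"read-{i}")
ctrl.submit("tenant-b", "read-0")

order = ctrl.service_round_robin()
for tenant, command in order:
    print(tenant, command)
```

In this run, tenant-b's single read is serviced second rather than waiting behind all three of tenant-a's commands, which is the behavior a single shared queue can't provide.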
Disaggregation is another advantage of NVMe-oF, enabling compute and storage to scale independently. This capability is increasingly important for allocating resources more efficiently for IoT, AI and other latency-sensitive workloads.
One thing to keep in mind is the hype cycle for NVMe. If they're using flash at all, most companies are using solid-state technology in a server or an all-flash array. Only a curious handful of IT shops deploy NVMe on a large scale, said Howard Marks, founder and chief scientist at DeepStorage.net.
"NVMe over fabrics is a high-speed solution for a specific problem, and that problem is probably [addressed] within a rack or couple of adjacent racks and a switch," Marks said.
"NVMe over fabrics doesn't pose a serious threat to legacy SCSI yet," Marks added. "If it does, people will then switch whole hog to NVMe, and, at that point, all this becomes important," making it one of the most important data storage technology trends today.
Multi-cloud data management
According to Enterprise Strategy Group research, 81% of companies use more than one cloud infrastructure provider, whether for IaaS or PaaS. And 51% use three or more.
"People are inherently multi-cloud now," Enterprise Strategy Group's Yuen said.
Organizations are using multiple clouds for different applications, rather than spreading one kind of workload across clouds, Yuen said.
Multi-cloud benefits include performance, data protection and ensured availability, said Steven Hill, an analyst at 451 Research.
"We contend that a key motivator may be cloud-specific applications and services that may only be cost-effective for data that is stored on that particular platform," Hill said.
Data management, though, can be challenging when it spans multiple clouds, Hill said.
"It's becoming increasingly important to establish policy-based data management that uses a common set of rules for data protection and access control regardless of location, but this is easier said than done because of the differences in semantics and format between cloud object stores," Hill added.
"Planning a multi-cloud data management strategy that supports uniform data management across any cloud platform from the start combines freedom of choice with the governance, security and protection needed to remain compliant with new and evolving data privacy regulations," he said.
To that end, Hill believes new laws like the European Union's GDPR and the California Consumer Privacy Act will create a sense of urgency for policy-based data management, no matter where the data lives.
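In code terms, the idea Hill describes amounts to writing the rules once and evaluating them against data wherever it sits. The sketch below assumes each cloud's object metadata has already been normalized into a common record, which is exactly the hard part Hill points to. The cloud names, regions, fields and rules are all hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class DataPolicy:
    """One rule set applied regardless of where the data lives."""
    require_encryption: bool
    min_copies: int
    allowed_regions: frozenset


@dataclass(frozen=True)
class StoredObject:
    """Metadata already normalized from a cloud-specific format (assumed)."""
    cloud: str
    key: str
    encrypted: bool
    copies: int
    region: str


def violations(obj, policy):
    """Return a list of human-readable policy violations for one object."""
    problems = []
    if policy.require_encryption and not obj.encrypted:
        problems.append("not encrypted")
    if obj.copies < policy.min_copies:
        problems.append(f"only {obj.copies} of {policy.min_copies} required copies")
    if obj.region not in policy.allowed_regions:
        problems.append(f"region {obj.region} not allowed")
    return problems


policy = DataPolicy(True, 2, frozenset({"eu-west", "eu-central"}))
inventory = [
    StoredObject("cloud-a", "customers.csv", True, 2, "eu-west"),
    StoredObject("cloud-b", "customers-backup.csv", False, 1, "us-east"),
]
for obj in inventory:
    print(obj.cloud, obj.key, violations(obj, policy) or "compliant")
```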
One major issue in dealing with multiple clouds is to understand what each contains and the differences between them, according to Yuen. For example, each one has different terminology, service-level agreements and billing cycles.
"It's difficult to keep track of all those," Yuen said, but that's where the opportunity comes in for vendors.
There are many multi-cloud data management products in the market, several of which launched this year. Here's a sampling:
- Panzura expanded into multi-cloud data management with its Vizion.ai software as a service (SaaS) designed to search, analyze and control data on and off premises.
- SwiftStack 1space multi-cloud introduced a single namespace for its object- and file-based storage software to make it easier to access, migrate and search data across private and public clouds.
- Scality Zenko helps place, manage and search data across private and public clouds. Use cases include data capture and distribution from IoT devices.
- Startup NooBaa -- acquired by Red Hat in November 2018 -- provides the ability to migrate data between AWS and Microsoft Azure public clouds. It added a single namespace for a unified view of multiple data repositories that can span public and private clouds.
- Rubrik Polaris GPS, the first application on the vendor's new Polaris SaaS management platform, provides policy management and control of data stored on premises and across multiple clouds.
- Cohesity also added a SaaS application, Helios, that manages data under control of the vendor's DataPlatform software, whether on premises or in public clouds.
Yuen noted many companies use more than one vendor for multi-cloud data management, and they're OK with it.
"It isn't necessarily one-size-fits-all, but that's not a bad thing, potentially," Yuen said.
For multi-cloud data management tools to improve, Yuen said he thinks they will need more insight, from a performance point of view. Tools could offer recommendations for tiering and usage levels across cloud providers. That analytical element is part of the next step in multi-cloud data management.
Right now, Yuen said, "we're in step one."
Composable infrastructure
Composable infrastructure may be one of the newest architectures in the IT administrator's arsenal, but it isn't brand new. Even though it is a few years old, and we even cited it as one of the data storage technology trends to keep an eye on last year (it didn't quite live up to the hype in 2018), more vendors are putting out composable infrastructure products than ever before.
Like its IT cousins, converged infrastructure and hyper-converged infrastructure (HCI), composable infrastructure takes physical IT resources such as compute and storage and virtualizes them, placing all of the now-virtualized capacity into shared pools. However, composable infrastructure goes one step further, allowing admins not only to create virtual machines from these pools, as in HCI, but also to apply the pooled resources to physical servers or containers.
Unlike HCI, which uses a hypervisor to manage the virtual resources, composable infrastructure uses APIs and management software to both recognize and aggregate all physical resources into the virtual pools and to provision -- or compose -- the end IT products.
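The pool-and-compose model can be pictured with a few lines of code. The sketch below is a toy stand-in for the management software: it tracks free capacity in a shared pool, carves out a named system on request and returns the resources when that system is released. The class, its methods and the resource names are invented and do not reflect any vendor's API.

```python
class ResourcePool:
    """Toy model of a shared pool that composes and releases systems."""

    def __init__(self, cpu_cores, memory_gb, storage_tb):
        self.free = {"cpu_cores": cpu_cores, "memory_gb": memory_gb, "storage_tb": storage_tb}
        self.composed = {}  # system name -> resources currently assigned to it

    def compose(self, name, **request):
        """Reserve the requested resources for a new system, or fail atomically."""
        short = {k: v for k, v in request.items() if v > self.free.get(k, 0)}
        if short:
            raise RuntimeError(f"pool cannot satisfy {short}")
        for kind, amount in request.items():
            self.free[kind] -= amount
        self.composed[name] = dict(request)
        return self.composed[name]

    def release(self, name):
        """Return a composed system's resources to the pool."""
        for kind, amount in self.composed.pop(name).items():
            self.free[kind] += amount


pool = ResourcePool(cpu_cores=64, memory_gb=512, storage_tb=100)
pool.compose("analytics-node", cpu_cores=16, memory_gb=128, storage_tb=20)
print("Free after compose:", pool.free)
pool.release("analytics-node")
print("Free after release:", pool.free)
```

Whether the composed unit ends up as a virtual machine, a container host or a bare-metal server is outside this sketch. The point is that provisioning becomes an API call against pooled capacity.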
Early entrants into the market, going back to 2014, were Cisco and HPE. More recently, however, newer storage and HCI companies have entered or expressed interest in entering the composable infrastructure market. In September 2018, Pivot3's Nash said the company was moving toward providing composable infrastructure.
In an interview with SearchConvergedInfrastructure at the time, Nash said, "The direction we're taking is to make the composable stuff into a software mechanism. That software layer will manipulate the platform underneath to [provision] the right type of units to provide the right service level at the right cost level and right security level."
In August 2018, storage array vendor Kaminario announced it was adding software it calls Kaminario Flex to its NVMe all-flash arrays to enable them to be used in composable infrastructure implementations. But early player HPE isn't sitting on its Synergy composable product, announcing in May 2018 that its purchase of Plexxi would allow it to add networking capabilities to the virtual pools of resources in its Synergy composable infrastructure platform.