Paul Crocetti, Garry Kranz, Sonia Lelii, Dave Raffo, Carol Sliwa and Erin Sullivan also contributed to this st...
As the sun sets on 2017 -- and rises over 2018 -- we once again present the technologies and trends in data storage that we think will shine brightest and have the most sway over data centers in the coming year. It's time for Hot Techs 2018!
For the 15th year, our list shies away from anything too futuristic and impractical. Instead, we focus on newer storage tech that has demonstrated its usefulness and practicality. You'll only find technologies already available for purchase and deployment here.
So grab a front-row seat and get comfortable as we look at the data storage technology trends that will have the biggest impact on storage shops -- and the professionals who run them -- in the year ahead.
Predictive storage analytics
Predictive storage analytics has morphed from being a specialized feature to a red-hot storage technology. Fueling its rise is the prominence of all-flash arrays and growing demand for real-time intelligence about storage capacity and performance.
Predictive storage analytics transcends traditional hierarchical storage management or resource monitoring. The goal is to harness and parlay vast amounts of data into operational analytics to guide strategic decision-making.
"With increasing hybrid and hyper-converged infrastructures, storage is no longer a separate slice of the data center technology stack that you can intelligently manage or analyze in isolation. Looking at larger slices of the stack requires the use of more sophisticated analytical approaches over bigger data," said Mike Matchett, a senior analyst at Taneja Group.
Matchett noted how call-home support has evolved from its origins in batch-oriented processing to a tool for continuous remote analysis and measurement. "Call-home support is capable of dealing with hybrid complexity, including cloud and on-premises storage, and [is] tipping toward proactive automation by applying predictive intelligence algorithms," he said.
Predictive analytics lets storage and network-monitoring vendors continuously capture millions of data points in the cloud from customer arrays deployed in the field. They correlate the storage metrics to monitor the behavior of virtual storage running on physical targets.
Nimble Storage is widely regarded as the pioneer of predictive analytics, having launched its InfoSight cloud-based analytics as a native service on its all-flash arrays. The InfoSight software was a key driver behind Hewlett Packard Enterprise's decision to spend $1.2 billion to acquire Nimble Storage earlier this year.
Other storage vendors are getting into analytics, adding telemetry to track capacity, data protection, performance and system health, and helping make predictive analytics one of the trends in data storage to watch in 2018.
Typically, predictive analytics can pinpoint potential problems, such as defective cables, drives and network cards. If hardware issues are detected, the software sends alerts and recommends troubleshooting. An at-a-glance console provides an integrated view across the infrastructure stack, letting customers apply recommendations with a single click.
Aside from monitoring hardware, array-based analytics tools have matured to provide cache, CPU and storage-sizing recommendations based on preselected policies.
The importance of predictive storage analytics won't wane anytime soon. Big data deployments are no longer a curiosity, having matured to the point that companies in almost every industry could deploy a DevOps model. The ability to rapidly compute massive data sets at the network edge is credited with helping organizations wring more business value from flash storage infrastructure.
In addition, Matchett said, the ability to analyze vast amounts of data from its entire customer base lets a storage vendor apply artificial intelligence and machine learning to help individual enterprises predict their storage needs hour by hour.
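As a rough illustration of an hour-by-hour forecast, the sketch below fits a straight line to hourly capacity readings and estimates how long until an array fills. The sample readings, the function names and the choice of a simple least-squares fit are all assumptions made for illustration; this is not a description of how InfoSight or any other vendor's analytics actually works.

```python
from typing import Optional, Sequence, Tuple


def fit_line(samples: Sequence[float]) -> Tuple[float, float]:
    """Least-squares fit of used = slope * hour + intercept.

    The samples are assumed to be taken one hour apart, starting at hour 0.
    """
    n = len(samples)
    if n < 2:
        raise ValueError("need at least two samples")
    mean_x = (n - 1) / 2
    mean_y = sum(samples) / n
    sxx = sum((x - mean_x) ** 2 for x in range(n))
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in enumerate(samples))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


def hours_until_full(samples: Sequence[float], capacity_tb: float) -> Optional[float]:
    """Estimate hours from the latest sample until usage reaches capacity.

    Returns None when usage is flat or shrinking, since no fill time exists.
    """
    slope, intercept = fit_line(samples)
    if slope <= 0:
        return None
    last_hour = len(samples) - 1
    full_hour = (capacity_tb - intercept) / slope
    return max(0.0, full_hour - last_hour)


# Hypothetical hourly readings of used capacity, in terabytes.
used_tb = [40.0, 40.5, 41.0, 41.5, 42.0]
print(hours_until_full(used_tb, capacity_tb=50.0))
```

A production system would work from far more telemetry and would model seasonality and bursts rather than assume linear growth.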
Server-side flash, including nonvolatile memory express (NVMe) SSDs, and persistent storage-class memory will also have an impact on predictive analytics.
"Traditional all-flash storage arrays are probably a dying category," Matchett said. "The future will be rife with large pools of multiple classes of memory and storage spread across distributed, hybrid networks that need to be intelligently managed as programmable infrastructure [to deliver] elastic storage services." All of this storage, Matchett added, will require advanced analytics to help organize, provision, orchestrate and dynamically optimize performance; identify and rectify issues; and squeeze the most out of these more expensive resources.
Ransomware protection
Ransomware has made big news in the last couple of years. While global attacks -- such as WannaCry in May and Petya and NotPetya in June -- garner the most coverage, smaller-scale ransomware can be just as crippling for victims. Many organizations refuse to pay the ransom, but the resulting downtime can be even worse than making a payment. Fortunately, ransomware protection from backup and recovery vendors is now a hot technology and one of the trends in data storage to watch in 2018.
Ransomware is malware that encrypts data and demands payment for the decryption key, often getting into a system through an infected email attachment or website. Statistics vary dramatically, but payments are often hundreds of dollars. Attackers have targeted businesses of all sizes, from SMBs to the enterprise. As a result, every organization must be thinking about ransomware protection and how to recover from an attack.
Protection varies in strength and scope. Comprehensive backup is one way to fight ransomware. Companies should also be looking at data protection vendors that offer integration with a malware detector, said Jason Buffington, principal analyst at Enterprise Strategy Group. Simply saying a product can help an organization recover from a ransomware attack "is no different from saying you can recover from a forest fire or a server failure," Buffington said.
Several backup and recovery vendors have released products with ransomware protection features in the past year. Here are some examples:
- Acronis software uses machine learning to prevent ransomware viruses from corrupting data. It attempts to detect suspicious application behavior before files are corrupted.
- Druva built ransomware monitoring and detection tools into its InSync endpoint data protection software.
- The Quorum OnQ Ransomware Edition, an appliance specifically designed to recover from an attack, takes snapshots of servers and provides server-level recovery.
- Unitrends physical and virtual appliances use predictive analytics to determine the probability that ransomware is operating on a server, workstation or desktop. The vendor then alerts customers when it detects ransomware, so they can immediately restore from the last legitimate recovery point.
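The products above do not publish their detection logic, so the following is only a sketch of one commonly discussed heuristic: encrypted data looks statistically random, so a sudden jump in the byte-level entropy of files being backed up can be a warning sign. The function names and the 7.5 bits-per-byte threshold are assumptions for illustration. Compressed files also score high, which is why a real detector would combine several signals.

```python
import math
import os
from collections import Counter


def shannon_entropy(data: bytes) -> float:
    """Return the Shannon entropy of the data in bits per byte (0.0 to 8.0)."""
    if not data:
        return 0.0
    total = len(data)
    counts = Counter(data)
    return -sum((c / total) * math.log2(c / total) for c in counts.values())


def looks_encrypted(data: bytes, threshold: float = 7.5) -> bool:
    """Flag content whose entropy meets an assumed threshold.

    The threshold is a hypothetical tuning value, not a vendor setting.
    """
    return shannon_entropy(data) >= threshold


# Ordinary text has low entropy; random bytes stand in for encrypted content.
plain = b"quarterly storage report " * 400
random_like = os.urandom(10_000)
print(looks_encrypted(plain), looks_encrypted(random_like))
```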
In addition, though it may not be a hot technology in and of itself, tape backup is a secure way to protect against ransomware because offline data cannot be infected.
Education is another important element of ransomware protection. Backup and recovery vendors have responded by taking that approach as well:
- Ransomware Watch, led by Arcserve, and Carbonite's Fight Ransomware are websites that gather news and advice content.
- As part of its Ransomware Antidote Program, Infrascale features an online risk analysis quiz to help businesses understand their vulnerability.
Ransomware detection and protection is likely top of mind for many backup and recovery vendors that don't offer it yet. Expect to see more ransomware-specific capabilities added to products in the months ahead.
Converged secondary storage
The expansion of hyper-convergence into secondary storage is a natural next step for the technology. Secondary storage's rise started gradually, with a handful of vendors taking notice. Now we expect to see converged secondary storage taking up more space and getting more buzz in the coming year.
Many organizations are putting greater emphasis on secondary storage to reclaim much-needed primary storage capacity. Secondary storage frees up primary storage, while leaving the data more accessible than archive storage. It also lets organizations continue to gain value from older data or data that isn't mission-critical.
Converged secondary storage consolidates all secondary storage backup data, objects and files onto one platform. The commonly cited benefits of hyper-converged storage -- efficiency, ease of use, scalability and productivity -- apply to converged secondary storage as well. In addition, to make secondary storage data more available, vendors have begun offering converged secondary storage platforms that are efficient, scalable and can be integrated with the public cloud.
Converged storage has enabled newcomers to crack the crowded data protection and data management markets. For instance, Cohesity launched its DataPlatform in 2015, focusing on secondary storage. Cohesity has expanded its offerings to keep up with the latest developments in secondary storage requirements.
In 2016, Cohesity released DataPlatform Cloud Edition, capitalizing on the move to establish the cloud as a secondary storage tier. Cohesity's Orion release of its DataProtect software, rolled out earlier this year, added features such as NAS, hypervisor support and global deduplication for Amazon Simple Storage Service (S3).
Arun Taneja, Taneja Group founder, called Cohesity DataPlatform Virtual Edition "one platform for all secondary storage applications. That's what is unique about its solution. Cohesity has applied hyper-converged principles to the secondary storage side, across all types of secondary workloads."
Although most of Cohesity's competitors take a more traditional approach to data protection, other vendors are getting into the market. For instance, Rubrik takes a similar approach to Cohesity's with its Cloud Data Management Platform, a scale-out secondary storage cluster sold as a physical appliance with bundled software. Startup Igneous Systems consolidates backup and archive on a single secondary storage tier with the Igneous Hybrid Storage Cloud.
Commvault also began making secondary storage a priority in 2017, and in October, it released Commvault HyperScale. HyperScale is Commvault's first scale-out, integrated hardware appliance for data protection, a clear move to compete with the likes of Cohesity and Rubrik. Is a converged secondary storage product next?
With more vendors getting in on the action and more emphasis than ever on secondary storage, converged secondary storage should have a big year as one of the key trends in data storage for 2018.
Multi-cloud storage
Multi-cloud storage is one of the latest amorphous technology terms to capture the imaginations of industry experts. It's poised to become one of the hot technology trends in 2018 as more enterprises that have adopted the cloud -- whether in a hybrid or pure public configuration -- are demanding the cloud provide true IT services capabilities.
Jeff Kato, senior storage analyst at Taneja Group, said non-cloud spending is expected to be less than half the entire infrastructure market, and Amazon Web Services (AWS) continues to grow at more than 40% year over year with annualized revenue of about $15 billion. Microsoft boasts similar revenue when its Office 365 software as a service is included.
The benefits of multi-cloud storage are hard to ignore. There's data portability among heterogeneous clouds, easier lifting and shifting of applications among multiple cloud environments, better data availability and disaster recovery, and the ability to bridge data services between private and public clouds. Also, you can set enterprise data services more consistently and colocate them with applications and compute resources.
"This area is fast moving," Kato said.
Taneja Group defines multi-cloud storage as providing primary data services that operate simultaneously across multiple, heterogeneous cloud environments and a location where compute and applications can be colocated with these data services. The primary data storage services must support at least one large public cloud vendor, such as AWS, Google Cloud Platform or Microsoft Azure, and any type of cloud offering built on vendor lock-in is going to be limited in its ability to provide true multi-cloud storage.
However, multi-cloud storage still has its share of challenges. Moving data in and out of clouds is more complicated than moving it across on-premises systems, and managing data stored in different clouds requires a new approach.
Several vendors already offer a genuine multi-cloud primary storage concept based on software-defined storage (SDS). These include Hedvig, Qumulo, Scality, SoftNAS and SwiftStack. Scality's multi-cloud software is built on object storage with Amazon S3 compatibility, while also offering some file capabilities. SDS offerings from SoftNAS and Qumulo are focused on cloud file, while Hedvig provides block, file and object storage. SwiftStack is object and file storage.
"A lot more companies are moving into this area," Kato said. "There's no reason other software-defined companies could not go to the cloud directly. Backup vendors are starting to put their technology directly into the cloud. I would also look at the hyper-converged [vendors]. They're next. Scale Computing announced a partnership with Google to put software in the public cloud."
Based on its own research, Taneja Group found that vendor lock-in is one of the main concerns storage customers have with the cloud, which is the same problem they faced during the days when hardware reigned. A true heterogeneous, multi-cloud offering means applications and data can run across different public cloud environments, such as AWS and Azure, or between a public and private cloud.
"So if data is created in Amazon, it can be made accessible in another public or private cloud," Kato said. "You have to be able to move data between heterogeneous clouds, whether they're public or private."
Some vendors claim to offer multi-cloud storage, but they really don't. They're tiering to the cloud or offering a cloud service from data centers built near each other, with the underlying technology coming from a single vendor. An example of this type of homogeneous (aka not multi-cloud) cloud offering would be one where the cloud is AWS with VMware technology on both ends.
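The portability requirement Kato describes, where data created in one cloud is made accessible in another, can be sketched as a provider-neutral interface with a copy routine on top. Everything here is hypothetical: `ObjectStore`, `InMemoryStore` and `replicate` are illustrative names, and the in-memory stores merely stand in for real cloud back ends, which would each need an adapter written against that provider's SDK.

```python
from typing import Dict, Iterable, Protocol


class ObjectStore(Protocol):
    """Minimal provider-neutral contract (hypothetical)."""

    def put(self, key: str, data: bytes) -> None: ...
    def get(self, key: str) -> bytes: ...
    def keys(self) -> Iterable[str]: ...


class InMemoryStore:
    """Stand-in for a cloud back end; keeps objects in a dictionary."""

    def __init__(self, name: str) -> None:
        self.name = name
        self._objects: Dict[str, bytes] = {}

    def put(self, key: str, data: bytes) -> None:
        self._objects[key] = data

    def get(self, key: str) -> bytes:
        return self._objects[key]

    def keys(self) -> Iterable[str]:
        return list(self._objects)


def replicate(source: ObjectStore, target: ObjectStore) -> int:
    """Copy every object from source to target and return the count copied."""
    copied = 0
    for key in source.keys():
        target.put(key, source.get(key))
        copied += 1
    return copied


cloud_a = InMemoryStore("cloud-a")
cloud_b = InMemoryStore("cloud-b")
cloud_a.put("reports/q4.csv", b"region,tb_used\n")
print(replicate(cloud_a, cloud_b), "object(s) copied")
```

Because `replicate` depends only on the interface, the same routine would work between any two back ends that implement it, which is the property that separates heterogeneous multi-cloud storage from a single-vendor offering.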
NVMe over Fabrics
Performance-boosting, latency-lowering nonvolatile memory express is already one of the hot technology trends in SSDs that use a host computer's PCI Express bus. Moving into 2018, the revenue stream for NVMe over Fabrics (NVMe-oF) should start to grow, making it one of the significant trends in data storage. Significant deployments are expected to follow in 2019 and beyond, according to industry analysts.
The NVMe-oF 1.0 specification that emerged in June 2016 is based largely on the NVMe protocol designed for use over a computer's PCI Express bus. NVMe-oF aims to extend the benefits of NVMe over a network to accelerate data transfer between host computers and target storage systems. Initial NVMe-oF transport options included remote direct memory access (RDMA) over converged Ethernet (RoCE) and Fibre Channel (NVMe-FC).
Mike Heumann, a managing partner at G2M Inc., a firm that researches the NVMe market, said NVMe-oF struggled to get a foothold in 2016, and the only area where it had much presence was with appliances based on Microsoft's Storage Spaces Direct using NVMe over RoCE. This year, all-flash array vendors such as Kaminario, Pure Storage and Western Digital's Tegile have moved to use NVMe-oF as a back-end fabric, he said.
Although NVMe-oF host-to-target connectivity remains a work in progress, Heumann expects adoption to improve as potential deployment hurdles ease and major vendors jump in. He noted, for instance, that array vendors are starting to pick up NVMe-oF "accelerated" adapters to offload CPU-intensive work. Another important development in the works through NVM Express Inc. is a new NVMe-TCP transport to enable the use of NVMe-oF over existing IP networks.
"There is a lot of expectation with a lot of people that, that will make it much easier to deploy NVMe over Fabrics in the Ethernet world," Heumann noted. He said the TCP option would open up another set of users that might have been reluctant to tackle the perceived deployment hurdles around NVMe over RoCE.
G2M predicted the NVMe market will hit $60 billion by 2021, with revenue from SSDs; adapters; enterprise storage arrays and appliances; and enterprise servers, including some loaded with SDS designed for use with NVMe. G2M's research projected most enterprise servers will be NVMe-enabled by 2019, and more than 70% of all-flash arrays will be NVMe-based by 2020. Shipments of NVMe-oF adapters will surpass 1.5 million units by 2021, with 10% of them "accelerated," according to G2M.
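To put the adapter projection in concrete terms, the short calculation below works out what a 10% "accelerated" share means in units. It uses only the figures quoted above. Because G2M says shipments will surpass 1.5 million, the results are best read as lower bounds, and the variable and function names are illustrative.

```python
# Figures from G2M's projection as quoted above. The shipment figure is a
# floor ("will surpass"), so the derived counts are lower bounds.
ADAPTER_SHIPMENTS_2021 = 1_500_000
ACCELERATED_PERCENT = 10


def accelerated_units(total_units: int, percent: int) -> int:
    """Return the number of units that make up `percent` of the total."""
    return total_units * percent // 100


accelerated = accelerated_units(ADAPTER_SHIPMENTS_2021, ACCELERATED_PERCENT)
standard = ADAPTER_SHIPMENTS_2021 - accelerated
print(f"accelerated: {accelerated:,}  standard: {standard:,}")
# prints "accelerated: 150,000  standard: 1,350,000"
```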
NVMe-oF drivers are still in the works for major operating systems. Given that, startups such as Apeiron Data Systems, E8 Storage and Pavilion Data Systems have often used their own adapters and drivers to provide NVMe-oF capabilities, said Eric Burgener, a research director at IDC.
The main use case for early NVMe-oF-based products has been real-time big data analytics applications, Burgener said. IDC predicted that 60% to 70% of Fortune 2000 organizations will have at least one real-time, big data analytics workload by 2020. Certain high-end databases requiring low latency could also generate interest in NVMe-oF, Burgener added, as could vendors pushing denser workload consolidation.
"NVMe over Fabrics will likely see initial growth over Fibre Channel, replicating the kinds of functionality we currently see in data centers," said J Michel Metz, a Cisco Systems research and development engineer who works with NVMe technology. "As people become more familiar with the technology, we will see more Ethernet-based options implemented."