6 key AI data storage questions answered

Want to know what to watch for when planning storage for AI workloads? Find out the key considerations and challenges for dealing with the complexities of AI.

Research firm IDC projected that, by 2023, global spending on AI technology will hit nearly $98 billion, up from $37.5 billion in 2019. That increase represents a compound annual growth rate of nearly 30%. All those new, complex AI applications won't be deployed in a vacuum. A range of IT infrastructure, including storage tailored for AI, must support and process these new workloads.
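As a rough sanity check on those figures, the compound annual growth rate can be recomputed from the two spending numbers quoted above. This is a minimal sketch: the function name `cagr` is illustrative, and treating "nearly $98 billion" as exactly 98.0 and 2019 as the base year are assumptions. IDC's published rate may rest on a different base period, so a small gap from "nearly 30%" is expected.

```python
def cagr(start_value: float, end_value: float, years: int) -> float:
    """Compound annual growth rate as a fraction (0.27 means 27%)."""
    return (end_value / start_value) ** (1 / years) - 1


# Spending figures quoted in the article, in billions of US dollars.
# 98.0 is an approximation of "nearly $98 billion".
spend_2019 = 37.5
spend_2023 = 98.0

rate = cagr(spend_2019, spend_2023, years=2023 - 2019)
print(f"Implied 2019-2023 CAGR: {rate:.1%}")
```

Under these assumptions the implied rate comes out at roughly 27% a year, in the same range as the figure IDC cites.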

AI applications, particularly machine learning and deep learning, require vast amounts of data that are sent to CPUs or GPUs for processing and analysis in near real time. Much of that data must then be stored for possible future use. AI applications and technology will be among the top factors affecting infrastructure decisions over the next several years. For storage, that means enterprises must understand the data they're processing, find ways to get their storage media closer to compute technology and enhance AI data storage performance to match that of the processors.

What follows are six questions we answered in various SearchStorage articles related to how AI workloads are changing enterprise storage infrastructure.

1. What should be considered when planning AI data storage workloads?

Putting together storage for AI applications is no easy matter. There are several issues to consider and details to get right. Consultant and tech writer Robert Sheldon has a list of eight such factors:

  • Understand your various workloads' storage requirements.
  • Know your capacity and scalability requirements.
  • Find out how long you'll need to hold onto the data and how it will be accessed.
  • Factor in the throughput and I/O rates you'll need.
  • Consider location -- data near compute will minimize latency.
  • Assess the best type of storage to use -- block, file or object.
  • Use intelligent and software-defined storage to enhance performance.
  • Ensure all systems involved are tightly integrated.

2. What challenges does AI data storage bring?

There are two distinct challenges related to building storage for AI applications, according to IT industry veteran Chris Evans. The first is that it won't always be clear at the outset of an AI or machine learning project what data will be useful. As a result, projects need long-term archive storage where data can be retained and accessed when needed to support the learning effort.

The other major storage challenge lies in ensuring sufficient high-performance storage is available for the active data that needs processing. Vendors are combining fast storage with AI and machine learning capabilities to meet this need. Evans outlines the challenges of building a storage platform that balances the storage needs of AI workloads. Packaged storage products tuned for AI are part of that discussion and can be attractive because they offer a specific level of performance.
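The split between archive and high-performance storage can be pictured as a simple placement rule. The sketch below is purely illustrative and not Evans' method: the `Dataset` class, the `choose_tier` function, the tier names and the 30-day window are all hypothetical, and real platforms use far richer policies.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta


@dataclass
class Dataset:
    name: str
    last_accessed: datetime


# Hypothetical threshold: data untouched for more than 30 days goes to archive.
ACTIVE_WINDOW = timedelta(days=30)


def choose_tier(dataset: Dataset, now: datetime) -> str:
    """Return "performance" for recently used data, "archive" otherwise."""
    if now - dataset.last_accessed <= ACTIVE_WINDOW:
        return "performance"
    return "archive"


now = datetime(2020, 1, 31)
datasets = [
    Dataset("training-batch-current", datetime(2020, 1, 28)),
    Dataset("sensor-logs-2018", datetime(2018, 6, 1)),
]

# Archived data stays retrievable; it simply lives on cheaper media.
placement = {d.name: choose_tier(d, now) for d in datasets}
print(placement)
```

The point of the rule is that nothing is discarded: data whose value is not yet known is kept on the archive tier and can be promoted when a project needs it.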

3. What challenges does object storage pose?

On paper, the high-node-count storage clusters of object storage systems should be ideal for the demands of the large, unstructured data workloads generated by AI and machine learning applications. In addition, most AI and machine learning frameworks communicate with storage via the Amazon S3 protocol, as do most object storage systems. In practice, however, other factors can interfere with object storage's effectiveness when it comes to AI data storage.

Metadata, in particular, can be a problem, overwhelming dedicated controllers and negatively affecting the performance of SSDs and HDDs. Contributor George Crump examines metadata and other issues related to cluster communications, internode networking and the protocol emulation required for IoT devices that don't support S3 natively and use NFS instead.

4. What role will flash play in AI data storage workloads?

To get the information they need, AI and machine learning applications process large amounts of data. These applications usually rely on a cluster of compute nodes where at least some of the nodes use more expensive GPUs to deliver the performance of up to 100 CPUs. The storage infrastructure must ensure data is continuously provided to these GPUs so they're always in use. These systems must be able to store and process the millions, and even billions, of data files typically generated by sensors or IoT devices.
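To make the "keep the GPUs busy" requirement concrete, the aggregate read rate storage must sustain can be estimated from the number of GPUs, how fast each one consumes samples and the average sample size. This is a back-of-the-envelope sketch, not a sizing method from the article: the function name and every input figure below are hypothetical.

```python
def required_read_gb_per_s(num_gpus: int,
                           samples_per_s_per_gpu: float,
                           avg_sample_bytes: float) -> float:
    """Aggregate read rate (gigabytes per second) needed so no GPU waits for data."""
    bytes_per_s = num_gpus * samples_per_s_per_gpu * avg_sample_bytes
    return bytes_per_s / 1e9


# Hypothetical cluster: 8 GPUs, each consuming 2,000 samples/s of 150 KB each.
needed = required_read_gb_per_s(8, 2_000, 150_000)

# Hypothetical storage system that sustains 2.0 GB/s of reads.
delivered = 2.0

# If storage delivers less than the GPUs can consume, the shortfall caps utilization.
utilization_ceiling = min(1.0, delivered / needed)

print(f"Required: {needed:.1f} GB/s, GPU utilization ceiling: {utilization_ceiling:.0%}")
```

With these made-up inputs, storage would cap the GPUs below full utilization, which is the situation the storage infrastructure is meant to prevent.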

As a result, AI applications have high storage-capacity demands that can easily start in the terabyte range and scale into hundreds of petabytes. Crump looks at what the demands of AI workloads on storage mean for the storage media used. Because of these high demands and the relatively high cost of flash, he expects AI storage to rely less on flash and more on a combination of RAM, hard disks and tape.

5. How does NVMe help meet AI data storage's needs?

NVMe provides the high bandwidth and low latency that AI and machine learning applications need to maximize performance. It's a natural fit with the high-performance, scale-out storage and GPU-based compute that AI platforms use, and it will help eliminate I/O bottlenecks and provide scalability.

IT journalist John Edwards tracks how NVMe is replacing traditional storage in AI environments and why that makes sense in this performance-driven world.

6. Where does storage class memory fit in?

As a new tier in the memory/storage hierarchy, storage class memory (SCM) sits between SSDs and dynamic RAM (DRAM). These devices connect directly to the server memory and, like DRAM, are byte-addressable. But, like NAND-based devices, SCM devices are persistent and they support block-level access. That position lets applications access large data sets through system memory without forcing enterprises to pay the high price of DRAM.
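The difference between block-level and byte-addressable access can be illustrated with Python's standard library. This is only an analogy under stated assumptions: an ordinary temporary file stands in for the persistent device, the 4,096-byte block size is assumed, and on real SCM hardware the mapping would typically come from a persistent-memory-aware filesystem or library, which is not shown here.

```python
import mmap
import os
import tempfile

BLOCK_SIZE = 4096  # assumed block size, for illustration only

# An ordinary temp file stands in for the persistent device.
fd, path = tempfile.mkstemp()
os.write(fd, bytes(BLOCK_SIZE * 4))  # four zero-filled blocks
os.close(fd)

# Block-style access: read a whole block, change one byte, write the block back.
with open(path, "r+b") as f:
    f.seek(BLOCK_SIZE)                     # start of block 1
    block = bytearray(f.read(BLOCK_SIZE))
    block[10] = 0xAB
    f.seek(BLOCK_SIZE)
    f.write(block)

# Byte-addressable style: map the file into memory and store a single byte.
with open(path, "r+b") as f:
    with mmap.mmap(f.fileno(), 0) as mem:
        mem[2 * BLOCK_SIZE + 10] = 0xCD    # one-byte store, no block buffer
        mem.flush()                        # push the change to the backing file
        first = mem[BLOCK_SIZE + 10]       # byte written via the block path
        second = mem[2 * BLOCK_SIZE + 10]  # byte written via the mapping

os.remove(path)
print(hex(first), hex(second))
```

In the block path, changing one byte means moving 4,096 bytes in each direction. In the mapped path, the application reads and writes individual bytes as if they were ordinary memory, which is the access model that byte-addressable SCM offers.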

AI applications are among the use cases where emerging SCM technology makes sense, according to Sheldon. SCM devices provide the low latency, high durability and optimal performance AI workloads require.
