Computational storage is an information technology (IT) architecture in which data is processed at the storage device level to reduce the amount of data that has to move between the storage plane and the compute plane. Moving less data facilitates real-time data analysis and improves performance by reducing input/output (I/O) bottlenecks.
In many respects, a computational storage device may look just like any other solid state drive (SSD). Some products have a large number of NAND flash memory devices that store the data, a controller that manages writing the data to the flash devices, and random access memory (RAM) that provides a read/write buffer. What is unique about computational storage devices is the inclusion of one or more multi-core processors. These processors can perform many functions, from indexing data as it enters the storage device, to searching the contents for specific entries, to supporting sophisticated artificial intelligence (AI) programs.
Computational storage products and services are starting to appear on the market, but the ability to integrate them is still in the early stages of development. However, with the growing need to store and analyze data in real time, the market is expected to grow very quickly. As of this writing, computational storage can be implemented by using one of two key products currently being defined by the Storage Networking Industry Association (SNIA) Computational Storage Technical Working Group (TWG):
- Computational Storage Drive (CSD): a device that provides compute services in the storage system and supports persistent data storage -- including NAND flash or other non-volatile memory.
- Computational Storage Processor (CSP): a device that provides compute services in the storage system, but cannot store data on any form of persistent storage.
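The difference between the two device types can be sketched as a small Python model. This is only an illustration of the definitions above, not an SNIA interface: the class names, the `run`, `write`, `read` and `run_on_stored` methods, and the use of a dictionary as a stand-in for non-volatile memory are all hypothetical.

```python
from typing import Callable, Dict, Iterable, List


class ComputationalStorageProcessor:
    """CSP (hypothetical model): offers compute services, holds no persistent data."""

    def run(self, function: Callable[[bytes], object], data: Iterable[bytes]) -> List[object]:
        # The data is supplied by the caller; nothing is kept afterwards.
        return [function(item) for item in data]


class ComputationalStorageDrive(ComputationalStorageProcessor):
    """CSD (hypothetical model): the same compute services plus persistent storage.

    Inheriting from the CSP class is a modeling shortcut, not a claim about
    how real devices are built.
    """

    def __init__(self) -> None:
        # A dict stands in for NAND flash or other non-volatile memory.
        self._blocks: Dict[int, bytes] = {}

    def write(self, block_address: int, data: bytes) -> None:
        self._blocks[block_address] = data

    def read(self, block_address: int) -> bytes:
        return self._blocks[block_address]

    def run_on_stored(self, function: Callable[[bytes], object]) -> List[object]:
        # Compute runs against data the drive already holds, in address order.
        return [function(self._blocks[addr]) for addr in sorted(self._blocks)]


csp = ComputationalStorageProcessor()
csp_result = csp.run(len, [b"abc", b"de"])

csd = ComputationalStorageDrive()
csd.write(0, b"hello")
csd.write(1, b"hi")
csd_result = csd.run_on_stored(len)
```

In this model the CSP can only operate on data handed to it, while the CSD can run the same function directly against the blocks it stores.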
The ability to provide compute services at the device level was not truly available until SSDs were widely adopted, because traditional storage devices such as hard disk drives (HDDs) and tape drives cannot process data locally the way a computational storage device can.
Why computational storage is important
Traditionally, there has always been a mismatch between storage capacity, measured in terabytes (TB), and the amount of memory, measured in gigabytes (GB) of DRAM, that the central processing unit (CPU) uses to analyze data. This mismatch requires that stored data be moved in phases from large storage devices, such as SSDs, into the much smaller memory for analysis, and this phased approach prevents the analysis from being real-time. With computational storage products, the host platform can deliver application performance and results from the storage system without requiring all data to be exported from the storage devices into memory for analysis.
Computational storage is important in modern architectures because of the continued growth of raw data collected from sensors and actuators in the Internet of Things (IoT). By providing compute services at the storage device level, the analysis of the raw data can be completed in place, and the amount of data moved to memory is reduced to a manageable, easy-to-process subset.
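The effect of in-place analysis on data movement can be illustrated with a small simulation. The following Python sketch is not a real device interface: `SimulatedDrive`, its `read_all` and `filter_in_place` methods, the 64-byte record size, and the alarm-flag record layout are all assumptions made for illustration. It only counts how many bytes would cross from storage to host memory in each approach.

```python
from dataclasses import dataclass
from typing import Callable, List, Tuple

RECORD_SIZE = 64  # bytes per record (illustrative assumption)


@dataclass
class SimulatedDrive:
    """Hypothetical stand-in for a storage device holding fixed-size records."""
    records: List[bytes]

    def read_all(self) -> Tuple[List[bytes], int]:
        # Conventional path: every record is moved to host memory.
        moved = sum(len(r) for r in self.records)
        return list(self.records), moved

    def filter_in_place(self, predicate: Callable[[bytes], bool]) -> Tuple[List[bytes], int]:
        # Computational storage path: the predicate runs "on the device",
        # so only matching records are moved to host memory.
        matches = [r for r in self.records if predicate(r)]
        moved = sum(len(r) for r in matches)
        return matches, moved


def make_sensor_records(count: int) -> List[bytes]:
    # Assumed layout: 4-byte reading id, 1-byte alarm flag, zero padding.
    records = []
    for i in range(count):
        alarm = 1 if i % 100 == 0 else 0
        payload = i.to_bytes(4, "big") + bytes([alarm])
        records.append(payload.ljust(RECORD_SIZE, b"\x00"))
    return records


def is_alarm(record: bytes) -> bool:
    return record[4] == 1


drive = SimulatedDrive(make_sensor_records(10_000))

# Host-side analysis: move everything, then filter in host memory.
host_rows, host_moved = drive.read_all()
host_matches = [r for r in host_rows if is_alarm(r)]

# Device-side analysis: filter first, move only the matching subset.
device_matches, device_moved = drive.filter_in_place(is_alarm)

print(f"host-side filter moved   {host_moved} bytes")
print(f"device-side filter moved {device_moved} bytes")
```

Both paths return the same 100 alarm records, but under these assumptions the host-side path moves 640,000 bytes while the device-side path moves 6,400. The ratio depends entirely on how selective the filter is; a predicate that matches most records would save little.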
The ability to scale storage without having to be concerned about latency is an important consideration for hyperscale data centers, from those managing the stored data of companies that use public cloud services, to dedicated data centers for social media platforms storing images and updates, to data centers that handle data streaming for media and entertainment. Additional use cases for computational storage include edge computing and Industrial IoT environments, in which too much latency can literally cause an accident and the need to conserve space and power is even more important.