Choosing storage for streaming large files in big data sets
Digital video has long presented a challenge to the organizations that use it. It can be quite difficult to store a large collection of digital video, but what is an organization to do when business mandates also require streaming large video files, a common occurrence in today's big data environments?
A primary factor affecting the ability to stream big data is the storage subsystem. The mechanism storing the data has to be able to read it quickly enough to facilitate streaming. As such, high-performance storage must be considered essential.
One important factor to consider as you plan for video storage is that big data is getting even bigger. Just a few short years ago, 1080p video was considered a cutting-edge feature. Today, some cell phones can record 1080p video.
The January 2013 Consumer Electronics Show showcased 4K video, which has roughly four times the pixel count of 1080p video and a higher color depth. The result is a massive increase in video file storage requirements, in terms of both capacity and I/O speed.
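To put the jump from 1080p to 4K in numbers, the arithmetic can be sketched as follows. This is a rough illustration, not a sizing tool: it assumes the consumer 4K frame size of 3840x2160, and the bit depths and frame rate are illustrative assumptions. It computes the uncompressed data rate, which real streams reduce heavily through compression.

```python
def pixels(width: int, height: int) -> int:
    """Number of pixels in a single frame."""
    return width * height


def uncompressed_mb_per_s(width: int, height: int,
                          bits_per_pixel: int, fps: int) -> float:
    """Raw (uncompressed) video data rate in megabytes per second."""
    bits_per_second = width * height * bits_per_pixel * fps
    return bits_per_second / 8 / 1_000_000


full_hd = pixels(1920, 1080)   # 2,073,600 pixels
uhd_4k = pixels(3840, 2160)    # 8,294,400 pixels
pixel_ratio = uhd_4k / full_hd  # 4.0

# Assumed for illustration: 24 bits per pixel for 1080p, 30 bits per pixel
# for 4K (a higher color depth), both at 30 frames per second.
rate_1080p = uncompressed_mb_per_s(1920, 1080, 24, 30)
rate_4k = uncompressed_mb_per_s(3840, 2160, 30, 30)

print(f"4K has {pixel_ratio:.1f}x the pixels of 1080p")
print(f"Uncompressed: {rate_1080p:.1f} MB/s vs. {rate_4k:.1f} MB/s")
```

Because the higher color depth multiplies on top of the pixel count, the raw data rate grows by more than a factor of four under these assumptions.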
Organizations wishing to stream large video files, especially high-definition video, need to consider a number of factors when designing a storage architecture. Among these are cost and whether all the video data will be accessed at about the same rate or if newer video will be viewed more frequently than older video.
Solid-state drives (SSDs) are a good choice for organizations that need to stream video data because they deliver random read/write performance that is far superior to that of mechanical hard drives. But solid-state storage is expensive: the cost per gigabyte for SSDs is far higher than for mechanical hard drives. Conversely, the cost per I/O operation is usually lower for SSDs than for mechanical drives.
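The two cost metrics pull in opposite directions, which a small comparison makes easy to see. The sketch below is illustrative only: the `Drive` class is a hypothetical helper, and the prices, capacities and IOPS figures are made-up placeholders, not vendor data. Substitute quotes and benchmark numbers from your own environment.

```python
from dataclasses import dataclass


@dataclass
class Drive:
    name: str
    price_usd: float
    capacity_gb: float
    random_read_iops: float

    def cost_per_gb(self) -> float:
        return self.price_usd / self.capacity_gb

    def cost_per_iops(self) -> float:
        return self.price_usd / self.random_read_iops


# Hypothetical figures, chosen only to illustrate the two metrics.
hdd = Drive("hypothetical HDD", price_usd=100.0,
            capacity_gb=2000.0, random_read_iops=150.0)
ssd = Drive("hypothetical SSD", price_usd=400.0,
            capacity_gb=400.0, random_read_iops=40000.0)

for drive in (hdd, ssd):
    print(f"{drive.name}: ${drive.cost_per_gb():.3f}/GB, "
          f"${drive.cost_per_iops():.4f}/IOPS")
```

Which metric matters more depends on whether the workload is limited by capacity or by the number of simultaneous reads.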
Given the high cost per gigabyte and the relatively low capacity of SSDs, organizations that need to stream video data may benefit from a multi-tiered storage architecture. This kind of architecture allows your most recent video to reside on solid-state storage, where it can be easily streamed, while older video can be moved to traditional storage.
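An age-based tiering rule of this kind can be sketched in a few lines. The 90-day window and the tier names below are assumptions made for illustration; the right cutoff depends on how quickly viewership of your own content drops off.

```python
from datetime import date, timedelta

# Assumed cutoff: video newer than this stays on the solid-state tier.
HOT_WINDOW = timedelta(days=90)


def choose_tier(uploaded: date, today: date,
                hot_window: timedelta = HOT_WINDOW) -> str:
    """Return 'ssd' for recent video and 'hdd' for older video."""
    return "ssd" if today - uploaded <= hot_window else "hdd"


today = date(2024, 2, 1)
print(choose_tier(date(2024, 1, 1), today))  # recent upload
print(choose_tier(date(2023, 1, 1), today))  # older upload
```

A scheduled job could apply a rule like this periodically and move files whose tier has changed.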
Drilling into solid-state for big data streaming
Traditional storage is capable of streaming video data, but because mechanical drives are slower than SSDs, it's probably not the best choice for environments in which a large number of viewers will stream a video at the same time. It may, however, be perfectly adequate for streaming older videos to a limited number of people.
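Whether a given device is "fast enough" comes down to the number of concurrent viewers multiplied by the stream bitrate. The following is a back-of-the-envelope sketch: the function names, the bitrate, the device throughput and the 75% headroom factor are all assumptions for illustration, not measured values.

```python
def required_mb_per_s(viewers: int, bitrate_mbit_per_s: float) -> float:
    """Aggregate read throughput (MB/s) needed to feed all viewers."""
    return viewers * bitrate_mbit_per_s / 8


def max_viewers(device_mb_per_s: float, bitrate_mbit_per_s: float,
                headroom: float = 0.75) -> int:
    """Viewers a device can serve while using only `headroom` of its
    sustained read throughput."""
    per_viewer_mb_per_s = bitrate_mbit_per_s / 8
    return int((device_mb_per_s * headroom) // per_viewer_mb_per_s)


# Assumed example: 8 Mbit/s streams, device sustaining 160 MB/s of reads.
print(required_mb_per_s(200, 8.0))
print(max_viewers(160.0, 8.0))
```

This simple model treats throughput as constant. On mechanical drives, many simultaneous streams force the heads to seek between files, so effective throughput typically falls as the viewer count rises, which is one reason the estimate is more optimistic for hard drives than for SSDs.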
If you decide to use solid-state storage to stream large video files, you need to consider the type of SSD best suited to the task at hand. SSDs fall into two main categories: enterprise-class SSDs that use single-level cell (SLC) memory, and consumer-class SSDs that use multi-level cell (MLC) memory.
Enterprise-class SSDs are far more expensive and offer a lower overall capacity than their consumer-class counterparts. The justification for using enterprise-class SSDs is that they deliver better performance and durability. In the case of video streaming, however, consumer-class SSDs might sometimes prove to be a viable option.
Enterprise-class SSDs are designed to last at least 10 times longer than consumer-class SSDs. However, NAND flash cells wear out as a result of write operations; read operations have little or no impact on longevity. For video that is written once and then only read, the durability gap therefore matters much less.
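Because wear is driven by writes, a rough endurance estimate shows why a mostly-read video library is gentle on flash. The formula below is a common simplification (capacity times rated program/erase cycles, divided by the volume written after write amplification), and every number in the example is a hypothetical placeholder rather than a specification for any real drive.

```python
def estimated_lifetime_years(capacity_gb: float, pe_cycles: int,
                             daily_writes_gb: float,
                             write_amplification: float = 1.0) -> float:
    """Very rough lifetime estimate for a flash drive, in years.

    Ignores over-provisioning, wear-leveling efficiency and
    non-wear failure modes.
    """
    if daily_writes_gb <= 0:
        return float("inf")  # a read-only workload adds no write wear
    total_writable_gb = capacity_gb * pe_cycles / write_amplification
    return total_writable_gb / daily_writes_gb / 365


# Hypothetical drive: 500 GB, 3,000 rated P/E cycles, write amplification 2.
archive = estimated_lifetime_years(500, 3000, daily_writes_gb=50,
                                   write_amplification=2.0)
churn = estimated_lifetime_years(500, 3000, daily_writes_gb=2000,
                                 write_amplification=2.0)

print(f"Write-once video library: ~{archive:.0f} years")
print(f"Heavily rewritten data:   ~{churn:.1f} years")
```

Under these assumed figures, the same drive that wears out quickly under constant rewriting would outlast its useful service life when video is ingested once and then streamed.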
Enterprise-class SSDs also deliver better overall performance. Write performance is usually at least twice that of a consumer-class SSD, but read performance tends to be only slightly better (although some makes and models of SSDs perform better than others). This is especially true for linear, non-fragmented data.
Consumer-class SSDs can accommodate far more data than enterprise-class SSDs, and at a lower cost. When you consider the volume of data that needs to be stored, and the possibility that video data may be written once and never altered, you can see why consumer-class SSDs might be a viable choice in some instances.
For organizations that need the performance of SSD, but for whom MLC drives aren't an option, the best approach might be to use a hybrid storage array with an SSD tier. Such a device monitors data access and places hot data in the SSD tier, while cold (less frequently accessed) data is placed on higher-capacity but slower mechanical storage.
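The hot/cold placement idea can be illustrated with a simple greedy policy: rank videos by recent access count and fill the SSD tier until it is full. This is only a conceptual sketch. Real hybrid arrays use their own proprietary logic and often track data at a finer granularity than whole files; the function name, tuple layout and numbers below are assumptions.

```python
def place_videos(videos: list[tuple[str, float, int]],
                 ssd_capacity_gb: float) -> dict[str, str]:
    """Assign each video to 'ssd' or 'hdd'.

    videos: (name, size_gb, recent_access_count) tuples.
    The most frequently accessed videos are placed on SSD first,
    as long as they still fit in the remaining SSD capacity.
    """
    placement = {}
    free_gb = ssd_capacity_gb
    for name, size_gb, _ in sorted(videos, key=lambda v: v[2], reverse=True):
        if size_gb <= free_gb:
            placement[name] = "ssd"
            free_gb -= size_gb
        else:
            placement[name] = "hdd"
    return placement


# Hypothetical library: (name, size in GB, accesses in the last week).
library = [("launch_keynote", 40, 500),
           ("archive_2009", 50, 20),
           ("product_demo", 30, 300)]

print(place_videos(library, ssd_capacity_gb=80))
```

Rerunning the placement as access counts change is what lets formerly hot video drift down to the mechanical tier over time.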
Organizations that need to stream high-definition video data must carefully consider their storage architecture. Solid-state storage is a particularly good option for ensuring strong I/O performance, but it comes at a cost. Fortunately, there are some ways to curb those costs, such as using hybrid storage arrays or lower-grade SSDs.
About the author
Brien Posey is a Microsoft MVP with two decades of IT experience. Before becoming a freelance technical writer, Brien worked as a CIO for a national chain of hospitals and healthcare facilities. He has also served as a network administrator for some of the nation's largest insurance companies and for the Department of Defense at Fort Knox.