Essential Guide

This Essential Guide is a collection of articles, videos and other content selected by our editors to give you a comprehensive view of this topic.

Choosing storage for streaming large files in big data sets

Big data in motion requires storage that can handle streaming and analytics efficiently. Learn how to avoid setbacks and build an effective infrastructure with this guide.

Storing hundreds of terabytes and even petabytes of data is no longer uncommon for organizations across a wide range of industries, but there's a big difference between big data sets composed of stagnant archived files and big data sets containing business-critical streaming media files. These are commonly referred to as big data at rest and big data in motion, respectively.

When determining what type of storage should house a big data set, it's important to take into account how the files will be used. While object storage and scale-out network-attached storage (NAS) are two of the most popular storage options for big data environments, media and entertainment companies that rely on data in motion shouldn’t overlook higher-performance storage, such as solid-state drives (SSDs).

SearchStorage constructed this guide to help storage pros craft an architecture to house big data sets containing large streaming files. From the links provided throughout this guide, learn about the storage challenges you might encounter with big data and how to work around them, as well as the best way to perform analytics. You'll be well on your way to building the best architecture for your big data in motion.

Overview

1. Foundations of storing big data sets

Big data -- large amounts of unstructured data -- can yield positive business results when parsed effectively. But to get that competitive advantage, IT pros need to know how the technology works and what roadblocks they should anticipate in getting there. The obvious obstacle to storing big data is finding a platform that can house such a large amount of information, but what happens when large files, such as video and audio, require enough performance to stream? The following links provide insight on ways to deal with these problems, such as using compression or caching, and give tips on how storage type can make a difference.
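To make the streaming performance question concrete, here is a minimal back-of-the-envelope sketch in Python. It is not drawn from the linked articles. The function names, the stream count, the bit rate and the 30% headroom figure are all illustrative assumptions. The sketch only estimates the aggregate read throughput a storage tier would need to sustain for a given number of concurrent streams.

```python
def required_throughput_mb_per_s(concurrent_streams: int,
                                 bitrate_mbit_per_s: float) -> float:
    """Aggregate read throughput in megabytes per second.

    Divides by 8 to convert megabits to megabytes.
    """
    return concurrent_streams * bitrate_mbit_per_s / 8


def can_sustain(storage_mb_per_s: float,
                concurrent_streams: int,
                bitrate_mbit_per_s: float,
                headroom: float = 0.3) -> bool:
    """Check whether a storage tier can serve the streams.

    `headroom` is an assumed fraction of throughput held back for
    other work (analytics, rebuilds); 0.3 is a placeholder, not a
    recommendation.
    """
    needed = required_throughput_mb_per_s(concurrent_streams,
                                          bitrate_mbit_per_s)
    return needed <= storage_mb_per_s * (1 - headroom)


# Hypothetical workload: 200 concurrent streams at 8 Mbit/s each.
print(required_throughput_mb_per_s(200, 8))  # 200.0
print(can_sustain(250, 200, 8))              # False
print(can_sustain(400, 200, 8))              # True
```

Under these assumed numbers, a tier rated at 250 MB/s falls short once headroom is reserved, which is the kind of gap that caching, compression or a faster storage tier is meant to close.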

Storage options

2. Exploring big data storage options

Obvious storage choices for many big data platforms include scale-out NAS, for its capacity, and object storage, which can help with the unstructured nature of big data sets. But when files need to be streamed, it can also be a good idea to look at high-performance storage, such as SSDs. The stories and podcasts below include examples of what a storage system should provide for big data, and when specific types of storage work best in big data environments.

Analytics

3. Choosing storage to accommodate big data analytics

Analytics is one of the most important parts of big data, but it can cause a lag in performance. In addition, the type of storage used can affect analytics efficiency. Storage technologies, such as in-memory data grids, and Hadoop itself continue to evolve to make big data analytics more effective. View the links below to learn more about how storage type affects analytics, and which tools can help you gain the most insight from the contents of big data sets.

Video

4. Big data storage: Experts weigh in

Still looking for more insights into big data and choosing the right storage for streaming files? Check out the videos below, in which expert contributors discuss the changing big data market and how storage plays an important role.