Industries that once operated in traditional paper-based models are being overwhelmed by their digital data stores. Scale-out NAS can provide high-performance application support.
Before now, you might not have considered your IT shop as one that needs a scale-out system. But scale-out systems -- which started out in the network-attached storage (NAS) space because some industries rely on very large files that require a lot of bandwidth to meet performance requirements -- have far-reaching implications for a wide variety of environments today.
For instance, several major industries that once operated in traditional paper- or microfilm-based modes are finding that their digital data stores are threatening to overwhelm them. These are attractive vertical markets for scale-out NAS vendors that can provide high-performance application support.
VERTICAL AFFINITY FOR SCALE-OUT NAS
If we look at the throughput vs. I/O axis model in the “Vertical affinity for scale-out NAS” graphic, these industries have many applications that require very high throughput. The parallel data services found in many scale-out NAS systems (and coming next year with pNFS) can deliver that throughput, exceeding the MB-per-second capabilities of traditional scale-up NAS systems.
As recently as five years ago, this chart would have looked very different. Many of the workloads in the upper right would have been crowded into the left-hand side of the chart. But advances in processor technology -- such as multicore processing and much faster chip sets -- and video, graphics and design software -- such as 3-D CAD, 4-D medical imaging and high-definition TV, to name just a few -- have created new types of workloads that demand a very different performance profile. They create huge files and multithreaded requests that a single- or dual-processor scale-up system wouldn’t be able to service in a timely manner, causing production to slow or the system to time out waiting for the request.
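To put rough numbers on that gap, here is a minimal back-of-the-envelope sketch in Python. It assumes a file striped evenly across nodes and throughput that scales linearly up to an efficiency factor. The function name, the 500 MB/sec per-node rate, the 100 GB file size and the 0.75 efficiency factor are all hypothetical values chosen for illustration, not measurements from any product.

```python
def transfer_seconds(file_gb: float, node_mb_per_s: float,
                     nodes: int = 1, efficiency: float = 1.0) -> float:
    """Estimate the seconds needed to stream one file from a NAS system.

    Assumptions (illustrative only): the file is striped evenly across
    `nodes`, each node sustains `node_mb_per_s`, and `efficiency` (0-1]
    discounts the aggregate rate for coordination overhead.
    """
    if nodes < 1:
        raise ValueError("nodes must be at least 1")
    if not 0 < efficiency <= 1:
        raise ValueError("efficiency must be in the range (0, 1]")
    aggregate_mb_per_s = node_mb_per_s * nodes * efficiency
    return file_gb * 1000 / aggregate_mb_per_s  # decimal units: 1 GB = 1,000 MB


if __name__ == "__main__":
    # Hypothetical workload: one 100 GB file, 500 MB/sec per controller or node.
    scale_up = transfer_seconds(100, 500)
    scale_out = transfer_seconds(100, 500, nodes=8, efficiency=0.75)
    print(f"single scale-up controller: {scale_up:.0f} sec")
    print(f"eight scale-out nodes:      {scale_out:.0f} sec")
```

Real systems rarely scale this cleanly, which is why the efficiency factor is there; the sketch only shows why adding nodes, rather than a bigger controller, changes the math for very large files.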
Let’s take a deeper dive into a few verticals to illustrate my point.
Financial services. These users, who are accustomed to managing extremely large volumes of transactional information, are also now heavy users of high-performance parallel file systems for efforts such as market-performance forecasting and business intelligence. These efforts aren’t just big in terms of file size; they’re also long running and compute-intensive, and they require a high level of data protection and immediate data availability. Data integration is a core task in financial services IT, so these users in particular look for scale-out architectures that remove data integration bottlenecks. For them, an ideal NAS solution is one that performs faster as the number of nodes increases.
Life sciences. Not surprisingly, organizations engaged in health-related scientific discovery are actively interested in parallel file system solutions offering high-bandwidth data transfer and massive scalability. These organizations typically collaborate intensively. For example, the IT team may need to find ways to enable sharing of very large gene-sequencing files or proteomic data across thousands of researchers. To be successful, these companies must accelerate their discovery processes; the faster they develop a new drug, the faster it can be tested and approved for use in real-world medical and scientific applications. One IT-centric way for such organizations to accelerate a drug discovery process is to use a high-performance parallel file system infrastructure that never requires disruptive forklift upgrades.
Manufacturing and design. High-tech manufacturers, aerospace companies, nano-electronics startups, CAD/CAM design firms and many others also need tremendous amounts of storage. And they’re all looking for ways to optimize data management. Users in these industries need faultless capacity expansion to handle digital growth and improve information sharing among engineering teams. Outages are economically damaging in these environments, so users in the manufacturing and design segment seek to deploy file-based storage that offers near-total reliability and easy capacity upgrades on the fly. They look for automation to assist with file-system administration, data movement, replication and migration/tiering.
Media and entertainment. The operating model of media and entertainment organizations has evolved dramatically. A company that once produced print magazines may now publish them in an “online-only” format. Not only does all editorial content need to be quickly available to readers and content generators, but all the advertising files do, too. Large video files are also exacerbating the data growth problems at digitally intensive media and entertainment companies.
Today’s media and entertainment organizations are generating and protecting terabytes or petabytes of file data. At some enterprises, much of the data is created at the “edge” -- in remote news bureaus or CGI design studios separated from main data centers. That operational structure brings problems related to data replication for backup and can even impede the disaster recovery (DR) capability of the infrastructure. Media and entertainment organizations are looking at high-performance scale-out NAS solutions to solve a variety of problems; for instance, to improve the performance of a virtual server infrastructure or to ensure that information is instantly and always available to content creators and consumers.
Oil and gas. Uncovering oil and gas reserves was once a guessing game. Today, it’s a precise, scientific endeavor that relies on digitized data. Three-dimensional visualization to spot possible resources has become an ever-present tool for the industry as fields decline and extraction operations become more complex. IT managers working in the oil and gas vertical market are challenged to find NAS infrastructures that can support the sharing and protection of the huge data sets resulting from oil reserve modeling/simulation work. Without an architecture that can maintain performance as data storage capacity grows, sustaining a competitive edge becomes more difficult, mainly because the “time-to-result” (the extraction of the resource) lengthens. Scale-out NAS is a good solution for oil and gas organizations dealing with enormous computational simulations that, in a very direct fashion, hold the key to their competitive success.
Traditional high-performance computing/academia and research. Astrophysicists, molecular biologists, chemists, nuclear physicists and even social scientists working in the public sector are heavy generators and consumers of data. For example, at the Large Hadron Collider run by CERN, the team in charge of IT was managing 70 PB of storage by mid-2010. Even far smaller research facilities (usually working in cost-constrained university settings or commercial labs) rely on high-performance grid computing and parallel file system architectures to support modeling and simulation efforts that could solve real-world problems and answer big questions. Their work requires low-latency network clusters that can handle extremely intensive performance and bandwidth demands.
These industries were the first real adopters of scale-out systems because they absolutely needed the throughput those systems provide. But a majority of shops should be able to realize the efficiency and operational savings that come with storing many petabytes of data within a single namespace. That’s why scale-out systems are finding a home in cloud infrastructures. Companies like Gluster (recently acquired by Red Hat), which offers a scale-out file system that runs on commodity hardware and can support block, file and object data, are drawing strong interest from cloud-based businesses and enterprises building private clouds. Enterprise Strategy Group forecasts that 80% of all external NAS systems revenue will come from scale-out system shipments by 2015, and both “big file data” and cloud will be at the core of that growth.
BIO: Terri McClure is a senior storage analyst at Enterprise Strategy Group, Milford, Mass.