Life was much easier in the 90s. We had block storage and file storage, and each had its place: block for highly transactional data and file for unstructured and departmental storage.
By the end of the 90s, network-attached storage (NAS) design improved performance enough to make it suitable for running Oracle databases. Administrators preferred easy-to-manage file storage to complex block storage with dedicated Fibre Channel SAN switches.
But with the turn of the century, new technologies for storage devices and architectures multiplied. Unified storage emerged, combining block and file storage. First-generation multi-node scale-out NAS also emerged, improving scalability but compromising small-file performance. When scale-out NAS design couldn't keep pace with Web-scale requirements, object storage was developed, adding global scale-out but relinquishing easy file access.
Instead of redesigning legacy scale-out NAS products, most of the industry continues to focus on object storage. Unfortunately, object-based storage doesn't provide the high-performance, enterprise-grade, POSIX-compliant file access that thousands of legacy applications require. It also fails to provide a performance level that can meet the requirements of many big data workloads like media and entertainment, life sciences and commercial HPC. Object storage companies are trying to address these issues by adding file gateway accelerators in front of object-based back-ends. But that approach adds another layer of complexity, which leaves the door open for a modernized enterprise-capable, scale-out NAS design that can meet enterprise performance requirements and scale as well as object storage.
Ideal scale-out NAS design principles
If we had a clean design sheet for a modern modular scalable NAS design, it would likely include the following attributes:
- Flash-first design. No other technology in the last 10 years has done more to challenge the paradigm of traditional storage designs than solid-state storage. When flash-based drives first appeared, they were an expensive luxury and used sparingly. Now flash is ubiquitous. Flash enables many new storage attributes including enhanced metadata, removal of battery-backed cache, deduplication performance and data-tiering that actually makes sense.
- Data-aware enhanced metadata. Metadata is no longer constrained by the need to conserve precious persistent memory or by the limited IOPS of spinning media. New storage devices should expand metadata approaches and give storage an extra boost of brains. For scale-out NAS, this enables the device to become data-aware and provides a rich set of real-time analytics at scale. Time- and performance-consuming file system tree walks, metadata scans and file system lookups are eliminated as metadata aggregates are updated and stored in real time.
- Massive scalability without compromising performance. Object storage vendors have said that you cannot use NAS for big data because it doesn't scale. Until now, they were right. Traditional scale-out NAS becomes bogged down at hundreds of millions of files, which steers it toward workloads of very large files and leaves high-performance NAS to NetApp and EMC. With flash-first design and enhanced metadata techniques, modern scale-out NAS should be able to scale to tens of billions of files (a >100X improvement) with uncompromised performance for both large and small files.
- Software-defined design. Since flash-first design eliminates special hardware requirements, scale-out NAS software should be portable and able to run on industry-standard servers. This allows the storage to be deployed on the latest hyper-scale architectures or even run on a virtual machine in a public cloud. This approach ensures that scale-out NAS can be deployed using the same hardware and economics as object storage, making the product extremely cost-effective.
- SaaS software delivery model. While modernizing NAS design, we should go to the extreme and deliver software as you would modern cloud applications. Software as a service builds an invisible and perpetual upgrade process into the product. It eliminates the tedious qualification effort driven by old architectures where quality had to be baked in with months of testing. While enterprise customers still require strict control and acceptance of new software releases, it's time to adopt a more agile development model for storage products.
- Open APIs. Open APIs are required to unlock the value of the advanced data services, analytics and metadata capabilities. Making features easily programmable and controllable using a modern RESTful approach will allow this next generation of scale-out NAS to be easily integrated into both legacy and future cloud-centric environments.
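To make the enhanced-metadata idea above more concrete, here is a minimal Python sketch of how rolled-up aggregates can replace a tree walk. It is an illustration only, not any vendor's implementation: the `DirNode` class, its method names and the sample paths are all hypothetical. Each directory keeps a running file count and byte total for its whole subtree, and every write pushes its delta up to the root, so a capacity query becomes a single lookup.

```python
from typing import Dict, Optional


class DirNode:
    """Hypothetical directory entry that carries aggregates for its subtree."""

    def __init__(self, name: str, parent: Optional["DirNode"] = None) -> None:
        self.name = name
        self.parent = parent
        self.children: Dict[str, "DirNode"] = {}
        self.files: Dict[str, int] = {}  # file name -> size in bytes
        self.total_files = 0             # files anywhere under this directory
        self.total_bytes = 0             # bytes anywhere under this directory

    def subdir(self, name: str) -> "DirNode":
        """Return the named child directory, creating it if needed."""
        if name not in self.children:
            self.children[name] = DirNode(name, parent=self)
        return self.children[name]

    def _propagate(self, d_files: int, d_bytes: int) -> None:
        """Apply a delta to this directory and every ancestor up to the root."""
        node: Optional["DirNode"] = self
        while node is not None:
            node.total_files += d_files
            node.total_bytes += d_bytes
            node = node.parent

    def write_file(self, name: str, size: int) -> None:
        """Create or overwrite a file and update aggregates at write time."""
        old = self.files.get(name)
        self.files[name] = size
        if old is None:
            self._propagate(1, size)
        else:
            self._propagate(0, size - old)

    def delete_file(self, name: str) -> None:
        """Remove a file and subtract it from the aggregates."""
        size = self.files.pop(name)
        self._propagate(-1, -size)


def walk_bytes(node: DirNode) -> int:
    """The slow alternative: recompute the total by walking the whole tree."""
    own = sum(node.files.values())
    return own + sum(walk_bytes(child) for child in node.children.values())


# Hypothetical sample data.
root = DirNode("/")
renders = root.subdir("projects").subdir("renders")
renders.write_file("frame0001.exr", 48_000_000)
renders.write_file("frame0002.exr", 52_000_000)
root.subdir("home").write_file("notes.txt", 2_000)
renders.write_file("frame0001.exr", 50_000_000)  # overwrite with a new size

# Constant-time answers, with no walk required.
print(root.total_files, root.total_bytes)
print(walk_bytes(root) == root.total_bytes)
```

The trade-off in this sketch is that each write costs one update per directory level, in exchange for queries that no longer depend on how many files sit beneath a directory. A real distributed file system would also have to deal with concurrency, persistence and batching of updates, none of which is modeled here.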
Emerging vendors with modern scale-out NAS design
While most new scale-out storage companies are building products based on grid-based object-storage design principles, a few are tackling the arguably harder problem of enterprise performance-grade scale-out NAS design.
Qumulo was founded by many of the original inventors of Isilon OneFS. The focus of Qumulo technology was to bring massive scalability at uncompromised performance. In addition to breakthrough scalability, Qumulo focuses on making data visible through real-time capacity and performance analytics that make management of petabytes of storage a breeze. Qumulo is focused on HPC and large-scale unstructured data workloads in media and entertainment, life sciences, higher education, oil and gas, and others.
InterModal Data was founded by former executives of NetApp, Sun and other legacy storage leaders. The company delivers scalable, flexible and efficient scale-out storage for enterprises through distributed system software on top of disaggregated hardware architecture. Its approach physically separates I/O nodes from capacity nodes and logically connects them over Ethernet, using RAM and flash caching on every node.
Scality started out focusing on object storage technology, but has now expanded to address the requirement for scale-out NAS that coexists seamlessly with object storage. The company delivers scalable, flexible and efficient scale-out storage for enterprises via a software-only approach tunable for both high-performance workloads and cost-effective archive storage.
New life for NAS
It's been well over a decade since major changes have appeared in the scale-out NAS product category, so it's refreshing to see these companies take a fundamentally new approach. If they can radically improve scale and performance while adding a rich set of analytical capabilities, many customers will be drawn to these products just as easy file access combined with enterprise performance attracted users in the 90s.