Products that support the parallel Network File System (pNFS) have been so slow to emerge that the performance- and scalability-boosting technology fell off the radar screens of many data storage industry observers. There are signs, however, that the long-delayed spec could become a viable option starting next year, as support expands with additional storage arrays and server operating systems.
EMC Corp. and NetApp Inc., two of the main contributors to the parallel Network File System standards effort, have pNFS-capable storage products, and another pioneering advocate, Panasas Inc., plans to add support early next year. IBM, which wrote significant portions of the Linux pNFS client and server, is working to layer the pNFS file layout on top of its General Parallel File System, or GPFS. Another major vendor, Hitachi Data Systems Corp., lists pNFS on its roadmap.
Because server operating systems, or clients, also need to support pNFS to facilitate adoption, attention has focused on the pNFS client in Red Hat Enterprise Linux (RHEL), the most popular commercially supported Linux distribution. Red Hat Inc. made available a technology preview for file-based storage last year, and expects to release a fully supported version next year with its upcoming RHEL 7.0 release.
In the meantime, IT pros eager to kick the pNFS tires can also try SUSE Linux Enterprise Server (SLES) Service Pack 2 (SP2), which was released in February. Other Linux options include Fedora, the Red Hat-backed community project edition of Linux, and CentOS, a derivation of RHEL, but those distributions lack the commercial product support that enterprises typically demand.
"The code is pretty stable in the [Linux] kernel," said Sorin Faibish, chief scientist for the Fast Data Group at EMC and a prominent pNFS evangelist. "Of course, there will always be bugs, and that's why the [Linux] distros are reluctant, because they want to wait for stability. But the code is working. We run performance tests. Everybody runs a lot of tests. We find bugs and fix them as fast as we can in the kernel. So, we are in much better shape than last year."
PNFS has played a critical role in the evolution of network-attached-storage architectures from NAS to clustered storage to parallel storage.
One of the main advantages a pNFS-based system has over traditional NFS is that the application servers, or clients, can gain simultaneous access in parallel over multiple data paths to storage servers or nodes. A metadata server out of the data path, logically and in some cases also physically separate from the storage servers, supplies the client with a map or layout of the location of the data. The client then can read and write data directly to the storage, boosting performance and scalability. "You achieve wire speed. If you're using 40-gig [Ethernet] or InfiniBand, you can achieve whatever speed the pipe gives you," Faibish said. "With NFS, you never achieve more than 70%."
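The flow described above, in which a metadata server hands the client a layout and the client then fetches data directly from several storage nodes at once, can be sketched in a few lines of Python. This is a toy in-memory model, not the pNFS protocol: the class names (`DataServer`, `MetadataServer`), the `store_striped` helper, the round-robin striping and the 4-byte stripe size are all assumptions made for illustration.

```python
from concurrent.futures import ThreadPoolExecutor
from dataclasses import dataclass, field

STRIPE_SIZE = 4  # bytes per stripe unit; kept tiny so the example is easy to trace


@dataclass
class DataServer:
    """Hypothetical storage node holding stripe units keyed by (path, index)."""
    name: str
    stripes: dict = field(default_factory=dict)

    def read(self, path, index):
        return self.stripes[(path, index)]


def store_striped(path, payload, servers):
    """Setup helper: place stripe units round-robin; return the unit count."""
    units = [payload[i:i + STRIPE_SIZE] for i in range(0, len(payload), STRIPE_SIZE)]
    for index, unit in enumerate(units):
        servers[index % len(servers)].stripes[(path, index)] = unit
    return len(units)


class MetadataServer:
    """Hypothetical metadata server: answers layout requests, holds no file data."""

    def __init__(self, data_servers, stripe_counts):
        self.data_servers = data_servers
        self.stripe_counts = stripe_counts  # path -> number of stripe units

    def get_layout(self, path):
        # The layout maps each stripe index to the data server that holds it.
        n = len(self.data_servers)
        return [(i, self.data_servers[i % n]) for i in range(self.stripe_counts[path])]


def parallel_read(mds, path):
    """Client side: one layout request, then direct reads from the data servers."""
    layout = mds.get_layout(path)
    with ThreadPoolExecutor(max_workers=len(mds.data_servers)) as pool:
        # pool.map returns results in layout order, so the units rejoin correctly.
        units = pool.map(lambda entry: entry[1].read(path, entry[0]), layout)
        return b"".join(units)


servers = [DataServer(f"ds{i}") for i in range(3)]
count = store_striped("/export/a", b"parallel NFS demo", servers)
mds = MetadataServer(servers, {"/export/a": count})
data = parallel_read(mds, "/export/a")
```

The point of the sketch is that the metadata server is consulted once and then stays out of the data path; every stripe read goes straight to the node that holds it.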
In traditional NFS, the application servers, or clients, access the file system through a single network point at the network-attached-storage head/metadata server for both data and metadata. The setup creates the potential for a bottleneck, particularly with clusters of application servers. Data-intensive applications that often make use of clusters include genomics, seismic processing, data mining, content and video distribution, and high-performance computing (HPC). Some users try multiple NFS servers as a workaround, but that scenario means extra work and additional storage management costs.
Although any IT shop potentially could benefit from pNFS, industry analysts expect early pNFS use cases to focus on HPC, big data analytics and cloud computing. Workloads with either lots of small files or extremely large files that run on compute clusters could see the greatest advantage from the simultaneous, parallel data access that pNFS-based systems provide, according to the Storage Networking Industry Association.
Arun Chandrasekaran, a research director at Stamford, Conn.-based Gartner Inc., said research and educational institutions will be the earliest testers because pNFS addresses their requirements for high IOPS and bandwidth. Such industries as media and entertainment, as well as the life sciences, likely will follow within a few years. Chandrasekaran said, however, that he doesn't expect general enterprise-type deployments until 2014 or 2015.
Chandrasekaran views the slow shift from scale-up to scale-out architectures and the movement toward parallel processing as fundamental and important trends, but it's unclear if pNFS is the technology that will best capitalize on those trends, he said. "It's still a work in progress," he added. "How big a deal is pNFS in the long run? It's dependent on support from vendors and the kind of features that are added."
With a pNFS-based system, the separation of the NFS control path for the metadata server from the storage data path facilitates the choice of protocol. The pNFS standard supports file, block and object storage, in contrast to traditional NFS, which supports only file storage. So far, however, vendor support has been spotty for pNFS block and object storage.
For instance, EMC supported pNFS for block protocol access two years ago in Celerra and last year in its VNX array, but has not disclosed plans for file and object protocol access. NetApp added pNFS support for file protocol access in September 2011 with its Data Ontap 8.1 operating system in cluster mode. Panasas is due to support the object protocol through a downloadable software update for server-side pNFS. In the meantime, the company claims its systems are "pNFS-ready."
"We're making sure that the pNFS client and our pNFS server are as good as our proprietary DirectFlow client and server," said Brent Welch, chief technology officer at Panasas. "pNFS shares the architecture of DirectFlow, but it's a new implementation and we're still making sure that the Linux client is production-ready against our server."
On the client side, Red Hat released a technology preview of its pNFS client for file layout last December with RHEL 6.2, but its Global Support Services organization doesn't support the feature. RHEL 6.3, released in June 2012, retained the technology preview. The company has yet to decide when it will fully support pNFS in RHEL 6 and whether it will add block and object layout types, according to Ric Wheeler, manager and architect of Red Hat's file system team.
Wheeler said RHEL 7.0, targeted for general availability in the second half of next year, will support the pNFS client for file, block and object layout types. He added that the company has no active plans to support pNFS server in RHEL and instead will explore server support with hardware partners, independent software vendors and its Red Hat Storage.
SLES 11 SP2, released in February, provided pNFS support for file layout but not the block or object protocols. Matthias Eckermann, a senior product manager at SUSE Linux Products GmbH in Nuremberg, Germany, said a future release may support block and object types. The timing will depend on the availability of code upstream in the Linux kernel, he noted.
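For readers testing one of these Linux clients, one way to see whether a mount actually ended up with a pNFS layout driver is to look at the `pnfs=` field that the kernel prints on the `nfsv4:` line of `/proc/self/mountstats`. The helper below is a minimal sketch that parses text in that format. The sample text, server name and mount points are made up for illustration, and the exact fields on the line may differ between kernel versions.

```python
import re

# Illustrative sample modeled on /proc/self/mountstats; "filer" and the mount
# points are hypothetical, and real output carries many more lines per mount.
SAMPLE = """\
device /dev/sda1 mounted on / with fstype ext4
device filer:/vol1 mounted on /mnt/pnfs with fstype nfs4 statvers=1.1
\tnfsv4:\tbm0=0xfdffbfff,bm1=0x40f9be3e,acl=0x3,sessions,pnfs=LAYOUT_NFSV4_1_FILES
device filer:/vol2 mounted on /mnt/plain with fstype nfs4 statvers=1.1
\tnfsv4:\tbm0=0xfdffbfff,bm1=0x40f9be3e,acl=0x3,pnfs=not configured
"""


def pnfs_layouts(mountstats_text):
    """Map each NFS mount point to the pNFS layout string reported for it."""
    result = {}
    mount_point = None
    for line in mountstats_text.splitlines():
        header = re.match(r"device \S+ mounted on (\S+) with fstype (\S+)", line)
        if header:
            # Track only NFS mounts; any other file system resets the context.
            is_nfs = header.group(2).startswith("nfs")
            mount_point = header.group(1) if is_nfs else None
            continue
        found = re.search(r"pnfs=([^,]+)", line)
        if mount_point and found:
            result[mount_point] = found.group(1).strip()
    return result


layouts = pnfs_layouts(SAMPLE)
```

On a live system the same function could be fed the contents of `/proc/self/mountstats`; a value of `not configured` would indicate that the mount is running without a pNFS layout driver.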
Support for pNFS in Windows Server remains uncertain. Siddhartha Roy, a principal group program manager at Microsoft, declined to comment on potential support in Windows products, saying only that the company is evaluating interest in industry adoption of pNFS. Microsoft provided funding to the University of Michigan's Center for Information Technology Integration (CITI) for the development of required aspects of an NFS v4.1 client, but the work did not include the optional pNFS extension, Roy said.
Storage vendors, sometimes with assistance from CITI, have had to pitch in and help with the open source pNFS client work because they won't make much headway without it. EMC, for one, saw the importance of a stable client last year during beta tests with Fedora 14. The performance was acceptable, but not as good as it was with EMC's proprietary Multi-Path File System (MPFS) software, which uses a proprietary File Mapping Protocol, or FMP, to map file requests to the corresponding blocks, Faibish said. "Customers won't move to something that is inferior, even if it was for free," he said.
Server and client enhancements have since improved performance to the point that customers should see little difference between MPFS and pNFS. At times, pNFS outperforms MPFS because the metadata is more efficiently managed, Faibish said. The pNFS-based system would also have the advantage of eliminating the need for proprietary client software, with its inherent installation, maintenance and management problems.
"Our intention is to replace our current products with pNFS," Faibish said. EMC, however, will preserve MPFS because customers who use Windows Server still want MPFS for acceleration, and Windows doesn't yet support pNFS, he said.
The degree to which vendors will eliminate their proprietary products and protocols and shift to standards-based pNFS remains to be seen. But there is no evidence of a mad rush to pNFS by vendors or end users.
Randy Kerns, a senior strategist at Evaluator Group in Boulder, Colo., said it might take three years for IT shops to develop a comfort level that pNFS is mature enough to use. "Adoption represents change. Change is done slowly," he said. "General IT is the most conservative of all, so there may be opportunity for pNFS, but the adoption will take a much longer time."