Gary Grider can tell you exactly the type of storage he needs for his supercomputing needs at Los Alamos National Laboratory in New Mexico.
Grider, a group leader in high-performance computing (HPC) at Los Alamos, has implemented a global parallel file system.
"If we left the scratch files on the system, they would fill it each day," says Grider. "Data is purged off of it frequently. We're more concerned about how fast it is, and how fast it is is related to how big it is, because the only way you get it to go fast is you parallelize it across lots of disks."
Grider's needs for a global parallel file system differ from those met by storage area networking, in which a fraction of the file system is physically available on each node and access is mediated by the network that controls all nodes. Grider needs a file system that can grow easily in both capacity and bandwidth.
"A disk drive will go at 50 to 100 megabytes per second, and we need bandwidth of tens to hundreds of gigabytes per second," says Grider. "We need to orchestrate a parallel copy from memory to the file system across thousands of disk drives. So we care more about how fast we can write to it than how big it is."
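The arithmetic behind that requirement can be sketched with the figures Grider quotes (a hypothetical back-of-the-envelope calculation that assumes bandwidth adds linearly across drives, ignoring aggregation overhead):

```python
# Rough sizing: how many drives must work in parallel to reach a
# target aggregate bandwidth, using the figures quoted above.

def drives_needed(target_gb_per_s, drive_mb_per_s):
    """Minimum drive count, assuming bandwidth scales linearly with no overhead."""
    target_mb_per_s = target_gb_per_s * 1000
    return -(-target_mb_per_s // drive_mb_per_s)  # ceiling division

# A 100 GB/s target with 50 MB/s drives needs 2,000 drives in parallel;
# even the low end (10 GB/s with 100 MB/s drives) needs 100.
print(drives_needed(100, 50))   # 2000
print(drives_needed(10, 100))   # 100
```

In practice the real count is higher still, since drives rarely sustain their peak rate, which is why Los Alamos orchestrates copies across thousands of spindles.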
Many global parallel file systems rely on the Unix/Linux Network File System (NFS) for batch computing, which has a huge appetite for IOPS. But the NFS protocol has high overhead, which limits its use with I/O-intensive applications. For that reason, many HPC centers like Los Alamos have rejected it in favor of the Lustre clustered file system, which Sun Microsystems acquired in September of 2007.
"We care more about how fast we can write to a file system than how big it is."
— Gary Grider, group leader in HPC, Los Alamos National Laboratory
File systems such as Lustre provide parallel I/O and allow single files to span multiple disks. As capacity is added, bandwidth scales with it, allowing simultaneous access by hundreds to thousands of compute clients.
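Spanning a single file across disks comes down to a stripe mapping. The sketch below shows a minimal round-robin scheme, mapping a byte offset in a file to a disk and an offset on that disk; this is an illustrative layout, not Lustre's actual on-disk format:

```python
# Round-robin striping: map a byte offset within a file to
# (disk index, byte offset on that disk). Hypothetical scheme for
# illustration only, not Lustre's real layout.

def stripe_location(offset, stripe_size, num_disks):
    stripe_index = offset // stripe_size       # which stripe unit holds this byte
    disk = stripe_index % num_disks            # round-robin placement across disks
    local_stripe = stripe_index // num_disks   # nth stripe unit stored on that disk
    return disk, local_stripe * stripe_size + offset % stripe_size

# With 1 MiB stripe units over 4 disks, byte 5 MiB of the file
# lands on disk 1 at local offset 1 MiB:
print(stripe_location(5 * 2**20, 2**20, 4))  # (1, 1048576)
```

Because consecutive stripe units land on different disks, a large sequential read or write touches every disk at once, which is why adding disks raises both capacity and bandwidth.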
A typical cluster consists of multiple x86-based storage nodes connected to compute clients over 10 Gigabit Ethernet, InfiniBand or a proprietary interconnect such as Myrinet or Quadrics. The storage nodes are aggregated under the parallel file system.
Parallel file systems are set to change with the advent of parallel NFS (pNFS), which allows parallel, direct access between clients and storage devices. pNFS, which is expected to be approved by the IETF, draws on Panasas' DirectFlow parallel storage protocol. Among the vendors supporting pNFS are Sun, IBM and NetApp.
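The core idea of pNFS is the split between the metadata path and the data path: the client asks a metadata server where a file's pieces live, then reads those pieces directly from the data servers in parallel. The sketch below illustrates that flow with stand-in classes; all names are illustrative, not the actual NFSv4.1 wire protocol:

```python
# Conceptual sketch of a pNFS-style read: fetch a layout from the
# metadata server, then read stripes directly from the data servers
# in parallel. Stand-in classes only; not the real protocol.

from concurrent.futures import ThreadPoolExecutor
from dataclasses import dataclass

class DataServer:
    """Stand-in for a storage node holding part of the file's bytes."""
    def __init__(self, data):
        self.data = data
    def read(self, offset, length):
        return self.data[offset:offset + length]

@dataclass
class Stripe:
    server: DataServer   # which data server holds this piece
    offset: int
    length: int

class MetadataServer:
    """Stand-in: knows where the data lives but never serves file data."""
    def __init__(self, layouts):
        self.layouts = layouts
    def get_layout(self, filename):   # roughly what pNFS calls LAYOUTGET
        return self.layouts[filename]

def pnfs_read(mds, filename):
    layout = mds.get_layout(filename)          # one metadata round trip
    with ThreadPoolExecutor() as pool:         # data path: parallel and direct
        chunks = pool.map(lambda s: s.server.read(s.offset, s.length), layout)
    return b"".join(chunks)

# Two data servers each hold half of a small file:
ds = [DataServer(b"parallel"), DataServer(b"parallel")]
layout = [Stripe(ds[0], 0, 4), Stripe(ds[1], 4, 4)]
mds = MetadataServer({"/scratch/f": layout})
print(pnfs_read(mds, "/scratch/f"))  # b'parallel'
```

Keeping the metadata server out of the data path is what lets bandwidth scale with the number of data servers rather than bottlenecking on a single NFS head.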
About the author: Deni Connor is founder of Storage Strategies Now, a storage industry analyst firm in Austin, TX.