To achieve this dynamic sharing of all reads and writes across all server and storage nodes, a CFS must exert sophisticated control over which node has rights and access to a given piece of data at any given time. The technology within a CFS that plays traffic cop is called a distributed lock manager (DLM). A well-architected DLM enables a CFS to scale to dozens or hundreds of nodes while maintaining performance and total coherency of the data at all times. Today, most CFS have been architected with enterprise deployments in mind. As such, they place high emphasis on dynamic, transparent failover among server nodes and immediate recoverability of nodes without any data loss. It's no coincidence that CFS originated in the realm of high availability for databases (see "How to choose a file system," previous page).
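The traffic cop role can be illustrated with a minimal sketch. It assumes a simplified model in which many nodes may read a resource at once but a writer needs it exclusively. The class name, method names, lock modes and file path are all hypothetical, and no vendor's DLM is implied to work this way.

```python
from collections import defaultdict


class LockManager:
    """Toy lock manager: tracks which nodes hold each resource, and in what mode."""

    def __init__(self):
        # resource name -> {node name: mode}; mode is "shared" or "exclusive"
        self._holders = defaultdict(dict)

    def acquire(self, node, resource, mode):
        """Grant the lock if it is compatible with the other holders."""
        if mode not in ("shared", "exclusive"):
            raise ValueError(f"unknown lock mode: {mode}")
        others = {n: m for n, m in self._holders[resource].items() if n != node}
        if mode == "exclusive" and others:
            return False  # a writer needs the resource to itself
        if "exclusive" in others.values():
            return False  # another node is writing, so readers must wait
        self._holders[resource][node] = mode
        return True

    def release(self, node, resource):
        """Drop whatever lock the node holds on the resource (no-op if none)."""
        self._holders[resource].pop(node, None)

    def release_all(self, node):
        """Failover helper: free every lock held by a node that left the cluster."""
        for holders in self._holders.values():
            holders.pop(node, None)


dlm = LockManager()
print(dlm.acquire("node-a", "/data/orders.db", "shared"))     # True
print(dlm.acquire("node-b", "/data/orders.db", "shared"))     # True
print(dlm.acquire("node-c", "/data/orders.db", "exclusive"))  # False
dlm.release_all("node-a")                 # node-a fails; its locks are freed
dlm.release("node-b", "/data/orders.db")
print(dlm.acquire("node-c", "/data/orders.db", "exclusive"))  # True
```

A real DLM distributes this state across the cluster and must survive the loss of the node holding it; the sketch keeps everything in one process purely to show the grant-or-refuse decision.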
Over the past few years, CFS have taken up residence alongside SAN deployments. While a SAN enables the pooling of storage resources, it does nothing to change how servers access that networked data: each file system still owns a piece of the SAN. A CFS in conjunction with a SAN, by contrast, pools application I/O and storage so that any server can access any of the data within the networked storage environment. That said, a CFS has difficulty scaling to large node counts in highly parallel computing environments, such as large HPC deployments with thousands of servers. Aside from the lack of networked storage architectures in such deployments, the DLM architecture of a CFS can face serious performance issues at such large scales.
Notable CFS products for enterprise workloads in NAS and SAN clustering include Advanced Digital Information Corp.'s (ADIC) StorNext, IBM's SAN File System, PolyServe Inc.'s Matrix Server, Red Hat Inc.'s Sistina and Silicon Graphics Inc.'s Clustered XFS (CXFS).
DFS and PFS are receiving a lot of attention across a range of applications. The terms distributed and parallel are interchangeable: some vendors call their product distributed while others go with the "parallel" moniker, but the two approaches are architecturally and functionally analogous.
A DFS/PFS enables thousands of servers to sustain parallel I/O into a file system, directory or single file with minimal coordination required between those servers. All DFS/PFS are two-layer file-system architectures with clients and servers. On the client layer, the DFS/PFS creates a namespace that spans all of the machines and presents them as a single file system. Because it establishes "one big file system," the client layer enables any client to make requests into the cluster, which are then executed by the server layer.
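As a rough illustration of the two layers, the sketch below shows a client layer that presents one path space and decides which server executes each request. The hash-based placement policy is an assumption made for the example, as are the class, method and server names; actual products use their own placement and meta data schemes.

```python
import hashlib


class Namespace:
    """Toy client layer: one path space, each file owned by exactly one server."""

    def __init__(self, servers):
        if not servers:
            raise ValueError("at least one server is required")
        self.servers = list(servers)

    def owner(self, path):
        """Pick the owning server from a stable hash of the path (assumed policy)."""
        digest = hashlib.sha256(path.encode("utf-8")).digest()
        index = int.from_bytes(digest[:8], "big") % len(self.servers)
        return self.servers[index]

    def route(self, paths):
        """Group a batch of client requests by the server that will execute them."""
        plan = {server: [] for server in self.servers}
        for path in paths:
            plan[self.owner(path)].append(path)
        return plan


ns = Namespace(["server-1", "server-2", "server-3"])
requests = ["/video/a.mpg", "/video/b.mpg", "/logs/web.log", "/home/x.txt"]
for server, paths in ns.route(requests).items():
    print(server, paths)
```

Every client computes the same owner for a given path, so clients need no coordination with one another to find data; that is the property the single namespace is meant to provide.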
The server layer of a DFS/PFS is responsible for all I/O operations, and can serve more than 2,500 clients in large implementations (see "Verify vendor performance claims," previous page). From a data storage perspective, the server layer of a DFS/PFS is functionally identical to the storage layer, and its servers are sometimes referred to simply as the system's storage nodes. This is because every DFS/PFS is architected so that each individual physical server maintains ownership of its own storage resources. In a DFS or PFS, storage isn't directly shared by other servers, as it is in a CFS. Because of that difference, a DFS/PFS doesn't need to use SAN networks for storage.
A DFS/PFS uses various internode daemons, meta data and data control mechanisms to ensure that stored content is accessed by only a single client at any given time, which preserves data coherency. While some approaches use a centralized lock manager and meta data server to achieve this traffic cop control, others use non-hierarchical or segmented lock management approaches to achieve extremely high scalability and parallelized I/O. The result is a file-system architecture optimized for huge throughput across many machines.
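The difference between centralized and segmented lock management can be sketched as follows. This is a simplified, single-process illustration: choosing a segment by a CRC-32 checksum of the resource name is an assumption made for the example, and all class and method names are hypothetical.

```python
import zlib


class Segment:
    """One lock segment: grants each resource to at most one client at a time."""

    def __init__(self):
        self.owner_of = {}  # resource -> client currently holding it

    def acquire(self, client, resource):
        # setdefault records the client only if nobody holds the resource yet
        return self.owner_of.setdefault(resource, client) == client

    def release(self, client, resource):
        # only the current holder may release
        if self.owner_of.get(resource) == client:
            del self.owner_of[resource]


class SegmentedLockManager:
    """Spreads lock traffic over several segments, chosen by a name checksum."""

    def __init__(self, segment_count):
        if segment_count < 1:
            raise ValueError("segment_count must be at least 1")
        self.segments = [Segment() for _ in range(segment_count)]

    def _segment(self, resource):
        index = zlib.crc32(resource.encode("utf-8")) % len(self.segments)
        return self.segments[index]

    def acquire(self, client, resource):
        return self._segment(resource).acquire(client, resource)

    def release(self, client, resource):
        self._segment(resource).release(client, resource)


locks = SegmentedLockManager(segment_count=4)
print(locks.acquire("client-1", "/scratch/run42.dat"))  # True
print(locks.acquire("client-2", "/scratch/run42.dat"))  # False
locks.release("client-1", "/scratch/run42.dat")
print(locks.acquire("client-2", "/scratch/run42.dat"))  # True
```

With `segment_count=1` this collapses to a centralized lock manager that sees every request. With more segments, requests for unrelated files land on different segments and can be handled independently, which is the basis of the scalability claim.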
Because of this architecture, DFS/PFS made an initial beachhead in HPC cluster applications. They are increasingly being deployed in enterprises for data-intensive apps such as digital content delivery and scalable NAS (see "Where CFS and DFS/PFS fit best," above right). A shortcoming of most DFS/PFS implementations has been an inability to handle the random I/O common in workloads such as databases. This relative weakness results from how these architectures handle traffic cop duties among large numbers of server I/O nodes.
Notable DFS or PFS vendors and products include Exanet Inc., IBM's General Parallel File System (GPFS), Ibrix Inc., Isilon Systems Inc. and Lustre, the open-source initiative.
Cluster vs. distributed/parallel
The key question faced by any user is: When should I care about a cluster file system vs. a distributed or parallel file system?
CFS are well designed for comparatively small node-count deployments in enterprise environments. Because of their integration with SAN architectures and their extreme focus on high availability and immediate recovery, CFS are well suited to clustered databases and the consolidation of file or application servers into a networked storage pool. Additionally, enterprise CFS have been increasingly coupled with NAS protocols (NFS, CIFS) to create scalable NAS environments or, more accurately, "scalable NAS on SAN." Users should also consider CFS when high availability and recoverability for critical apps are absolutely essential.
DFS and PFS continue to dominate HPC environments. This is because they don't require SAN integration and are optimized for highly parallelized I/O to single clients. Additionally, this category of file-system technology is increasingly prevalent at the heart of scalable NAS offerings. These offerings typically leverage the DFS/PFS architecture to create an integrated plug-and-play cluster where every node behaves as an atomic member, contributing new CPU and storage to the overall system as it's added. Some DFS-based NAS offerings can achieve theoretical file-system sizes of more than 100TB and are demonstrating impressive real-world aggregate throughput. When client performance is critical or highly parallel I/O is a must, users should look to DFS/PFS solutions.
Both CFS and DFS/PFS will play important roles in the data center. CFS will undoubtedly continue to play a major role in enterprise virtualization initiatives because a CFS enables fine-grained scalability and the sharing of multiple applications across all server and storage resources. Likewise, DFS/PFS will build traction as an engine for scalable NAS due to the easy atomic management capabilities these distributed file systems offer.
What's next?
Looking ahead 24 to 36 months, expect to see even more file-system innovations. Specifically, major vendors will integrate CFS technologies into their server and storage virtualization product families. Without easily deployed, transparent, shared file systems, all of the great utility computing dreams of major vendors will remain just dreams.
Accelerated by fast node-to-node interconnects, DFS/PFS will continue to boost performance across a range of I/O types, eventually challenging CFS for the lucrative clustered database market. This could have significant implications for the DBMS market as a whole when plug-and-play scalable databases become much easier to deploy.
Within three years, namespace aggregation technologies will become integrated features of servers, workstation file systems and enterprise storage switching platforms. In short, unified namespaces will become ubiquitous, an essential part of an enterprise file-system deployment. Both CFS and DFS/PFS will also continue to extend their support for wide-area geographies. Data centers will begin to aggressively deploy both types of technologies to support branch offices for better collaboration and consolidation activities for file and block data.
The once lowly file system has finally blossomed into a state of increasingly diverse innovation. For storage professionals, the complexity has increased, but the good news is that file systems are finally starting to make their jobs easier.