This article can also be found in the Premium Editorial Download "Storage magazine: Survey says storage salaries are climbing."
Advanced file-system approaches
Some of the more advanced challenges in the data center require file-system technologies that are even more complex than those discussed thus far. The following sections provide a more in-depth look at some of the cutting-edge developments in cluster file systems (CFS) and distributed file systems/parallel file systems (PFS).
Verify vendor performance claims
File-system vendors typically rely on two basic metrics to describe and compare their products' performance: operations per second (Ops/sec) and throughput. Ops/sec is an I/O measurement reflecting how many read and write operations a file system can handle when run against a given benchmarking suite. When vendors cite Ops/sec, they need to disclose the benchmark against which the measurement was achieved; otherwise, the number is meaningless.
Throughput is typically measured in megabytes or gigabytes per second (MB/sec or GB/sec), indicating the total amount of data the file system (or the entire computing system based on that file system) can move in a given time period. For users evaluating file-system software, it's important to determine the hardware used for throughput benchmarking to ensure a fair comparison with their own data center's hardware. Because file-system benchmarks (Iometer, IOzone) and related NAS protocol benchmarks (NetBench, SPECsfs) are necessarily broad and general in nature, vendors have room to configure tests in whatever way makes their product look better than competing products.
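To make the two metrics concrete, the following Python sketch derives Ops/sec and MB/sec from a single timed run. It is a minimal illustration, not a substitute for Iometer, IOzone or SPECsfs: the function names, the 4KB block size and the decimal definition of a megabyte are all assumptions made for this example, and the read phase will mostly be served from the operating system's cache.

```python
import os
import tempfile
import time

# Assumption: decimal megabytes (10**6 bytes). Some tools report 2**20 instead.
BYTES_PER_MB = 1_000_000


def summarize(op_count, bytes_moved, seconds):
    """Return (ops_per_sec, mb_per_sec) for one measured run."""
    if seconds <= 0:
        raise ValueError("elapsed time must be positive")
    return op_count / seconds, bytes_moved / BYTES_PER_MB / seconds


def run_micro_benchmark(block_size=4096, blocks=256):
    """Write, then read back, `blocks` blocks of `block_size` bytes.

    Returns (op_count, bytes_moved, elapsed_seconds). Hypothetical helper:
    it exercises whatever file system backs the temp directory.
    """
    payload = b"\0" * block_size
    ops = 0
    moved = 0
    start = time.perf_counter()
    with tempfile.TemporaryFile() as f:
        for _ in range(blocks):
            f.write(payload)
            ops += 1
            moved += block_size
        f.flush()
        os.fsync(f.fileno())  # push the writes out of the process and OS buffers
        f.seek(0)
        while True:
            chunk = f.read(block_size)
            if not chunk:
                break
            ops += 1
            moved += len(chunk)
    elapsed = time.perf_counter() - start
    return ops, moved, elapsed


if __name__ == "__main__":
    ops, moved, elapsed = run_micro_benchmark()
    ops_per_sec, mb_per_sec = summarize(ops, moved, elapsed)
    print(f"{ops_per_sec:.0f} Ops/sec, {mb_per_sec:.1f} MB/sec")
```

Changing the block size, the read/write mix or the underlying hardware changes both numbers considerably, which is why a vendor figure quoted without its benchmark and test configuration says little.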
To achieve this dynamic sharing of all reads and writes across all server and storage nodes, a CFS must possess sophisticated controls over which node has rights and access to a given piece of data at any given time. The technology within a CFS that plays traffic cop is called a distributed lock manager (DLM). A well-architected DLM enables a CFS to scale to dozens or hundreds of nodes while ensuring performance and total coherency of the data at all times. Today, most CFS have been architected with enterprise deployments in mind. As such, they place high emphasis on dynamic, transparent failover among server nodes and immediate recoverability of nodes without any data loss. It's no coincidence that CFS originated in the realm of high availability for databases (see "How to choose a file system," previous page).
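The traffic-cop role of a DLM can be illustrated with a toy lock table. The Python sketch below is an assumption-laden simplification and not any vendor's design: it runs in a single process, the class and method names are invented for this example, and it omits the request queuing, lock mastering across nodes and recovery protocols that a real DLM needs. It shows only the core rule that many nodes may share read access to a resource while write access is exclusive.

```python
from collections import defaultdict

SHARED = "shared"        # many nodes may read concurrently
EXCLUSIVE = "exclusive"  # one node may write; no other holders allowed


class ToyLockManager:
    """Single-process stand-in for a DLM's lock table (hypothetical)."""

    def __init__(self):
        # resource name -> {node name: lock mode currently held}
        self._holders = defaultdict(dict)

    def acquire(self, node, resource, mode):
        """Grant the lock if compatible with other holders; return True/False."""
        if mode not in (SHARED, EXCLUSIVE):
            raise ValueError(f"unknown lock mode: {mode!r}")
        holders = self._holders[resource]
        others = [m for n, m in holders.items() if n != node]
        if mode == SHARED:
            compatible = all(m == SHARED for m in others)
        else:
            compatible = not others
        if compatible:
            holders[node] = mode
        return compatible

    def release(self, node, resource):
        """Drop one node's lock on one resource, if it holds one."""
        self._holders[resource].pop(node, None)

    def release_all(self, node):
        """Drop every lock a node holds, for example after that node fails."""
        for holders in self._holders.values():
            holders.pop(node, None)


if __name__ == "__main__":
    dlm = ToyLockManager()
    print(dlm.acquire("node-a", "/data/orders.db", SHARED))     # True
    print(dlm.acquire("node-b", "/data/orders.db", SHARED))     # True
    print(dlm.acquire("node-c", "/data/orders.db", EXCLUSIVE))  # False
    dlm.release_all("node-a")
    dlm.release("node-b", "/data/orders.db")
    print(dlm.acquire("node-c", "/data/orders.db", EXCLUSIVE))  # True
```

The `release_all` call hints at why failover matters in the design: when a node dies, its locks must be reclaimed before surviving nodes can safely touch the same data.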
Over the past few years, CFS have taken up residence alongside SAN deployments. While a SAN enables the pooling of storage resources, it does nothing to change how servers access that networked data: each file system still owns a piece of the SAN. A CFS used in conjunction with a SAN, however, pools both application I/O and storage so that any server can access any of the data within the networked storage environment. Where a CFS has difficulty is in scaling to large node counts in highly parallel computing environments, such as large HPC deployments with thousands of servers. Aside from the lack of networked storage architectures in such deployments, the DLM architecture of a CFS can face serious performance issues at such large scales.
Notable companies leveraging CFS for enterprise workloads in NAS and SAN clustering include Advanced Digital Information Corp. (ADIC) with StorNext, IBM with SAN File System, PolyServe Inc. with Matrix Server, Red Hat Inc. with Sistina and Silicon Graphics Inc. with Clustered XFS (CXFS).
This was first published in November 2005