For all the sweat, dollars and frustration that storage professionals have invested in implementing storage area networks (SANs), it's still not clear that SANs are delivering the full payback for these efforts.
SAN File System Architecture
Clients participating in SAN File System retrieve data in a two-part process. Clients equipped with the virtual file system request data from the metadata server, which provides a file's physical location, and issues locks. With that information, clients retrieve data from SAN-attached disks directly over the Fibre Channel SAN.
Sure, compared to direct-attached storage (DAS), SANs are easy to scale, and they deliver consolidated management and backup, without sacrificing performance. But at the same time, says Brian Truskowski, IBM general manager for storage software, "whenever I talk to customers, they report that they're not really getting the maximum benefit" from their SAN.
Helping storage administrators get more from their SANs was the stated intent of IBM Corp. when it announced its Storage Tank product last month, after five years of research and development. When it's released in December, Storage Tank--now sporting the more corporate-sounding moniker SAN File System (SAN FS)--will provide users with features such as a global namespace, storage pooling and policy-based management, file-based snapshot and easier data migration. But while all these features sound great, the real question is: Is enterprise IT ready for SAN FS--and is IBM ready to adapt SAN FS to meet the needs of enterprise IT?
From an architectural standpoint, SAN FS is what is sometimes referred to as a clustered file system, distributed file system, or shared SAN file system. It's much like a dozen or so existing products on the market, including Advanced Digital Information Corp.'s StorNext (formerly CentraVision, or CVFS), EMC's HighRoad, Silicon Graphics Inc.'s CXFS, Sistina's GFS, PolyServe's Matrix Server, Sun Microsystems' QFS, Veritas' Cluster File System and even IBM's own SANergy and GPFS.
The common thread running through SAN FS and many of the other clustered file systems on the market today is the basic architecture: an out-of-band metadata server and distributed lock manager, working in concert with installable file system code that resides on participating client machines. In the case of SAN FS, file requests from an application server travel first to the metadata server, which looks up file locations in its mapping table, issues locks and then sends that information back to the client. Armed with the block location and lock, the application server then retrieves the file from the storage directly over a Fibre Channel (FC) SAN.
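The two-part read path described above can be sketched in a few lines of Python. This is a minimal illustration, not IBM's implementation: the class and function names (`MetadataServer`, `SanDisk`, `read_file`), the tiny block size and the simple exclusive lock are all assumptions made for the example.

```python
from dataclasses import dataclass, field
from typing import Dict

BLOCK_SIZE = 4  # hypothetical, tiny block size so the example is easy to follow


@dataclass
class Extent:
    """Physical location of a file: a volume plus a run of blocks."""
    volume: str
    start_block: int
    block_count: int


@dataclass
class MetadataServer:
    """Out-of-band server: owns the mapping table and hands out locks."""
    mapping_table: Dict[str, Extent]
    locks: Dict[str, str] = field(default_factory=dict)  # path -> lock holder

    def open_file(self, client_id: str, path: str) -> Extent:
        extent = self.mapping_table[path]      # look up the file's location
        holder = self.locks.get(path)
        if holder is not None and holder != client_id:
            raise RuntimeError(f"{path} is locked by {holder}")
        self.locks[path] = client_id           # issue the lock (exclusive here)
        return extent

    def release(self, client_id: str, path: str) -> None:
        if self.locks.get(path) == client_id:
            del self.locks[path]


class SanDisk:
    """Stand-in for a SAN-attached volume that clients read directly."""

    def __init__(self, data: bytes) -> None:
        self.data = data

    def read_blocks(self, start: int, count: int) -> bytes:
        return self.data[start * BLOCK_SIZE:(start + count) * BLOCK_SIZE]


def read_file(client_id: str, path: str, mds: MetadataServer,
              san: Dict[str, SanDisk]) -> bytes:
    # Part 1 (control path): ask the metadata server for location and lock.
    extent = mds.open_file(client_id, path)
    try:
        # Part 2 (data path): fetch the blocks straight from the SAN volume;
        # the metadata server never touches the file data itself.
        return san[extent.volume].read_blocks(extent.start_block,
                                              extent.block_count)
    finally:
        mds.release(client_id, path)


if __name__ == "__main__":
    san = {"vol1": SanDisk(b"aaaabbbbccccdddd")}
    mds = MetadataServer({"/db/log": Extent("vol1", 1, 2)})
    print(read_file("app1", "/db/log", mds, san))  # b'bbbbcccc'
```

The point of the split is that only small location-and-lock messages pass through the metadata server, while the bulk data moves over the SAN. A real system would also need shared read locks, lock leases and failover, none of which are modeled here.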
IBM's SAN FS roadmap
IBM's grand plan is to make SAN FS a heterogeneous distributed file system that supports multiple Unix and Windows clients and storage arrays. In this first release, however, operating system support includes IBM's own AIX, AIX HACMP, Windows 2000 and Windows 2000 Advanced Server operating systems. As far as storage arrays are concerned, SAN FS supports IBM's Enterprise Storage Server (Shark), as well as storage virtualized behind the SAN Volume Controller (SVC), the block-based virtualization appliance that IBM announced last spring.
In 2004, IBM will extend operating system support to Linux, Solaris, Solaris clusters and Windows 2000 clusters. Last spring, IBM also released source code to SAN FS so that third parties could write agents for additional operating systems. Next year, storage array support will move beyond IBM storage with the help of SVC, which was recently enhanced to support arrays from Hitachi Data Systems (Thunder 9200) and Hewlett-Packard (MA 8000, EMA 12000 and EMA 16000), according to Bruce Hillsberg, IBM director of storage software strategy.
SAN FS is built upon four major storage technologies. They are:
- Global namespace. All application servers running SAN FS share the same files, which are accessed and named in a single, common way. The main benefit of a global namespace is that the environment typically requires less capacity, because, as Truskowski puts it: "The data is the data." In other words, multiple applications can share a single instance of a file, rather than relying on copies. That in turn can reduce data movement, simplifying application workflow, and improving performance.
- Storage pools. SAN FS also incorporates the notion of storage pools, or the ability to create groups, or classes, of storage. For example, under SAN FS, an administrator can specify a high-end class of storage that provides mirrored failover, a midtier class for reference data, and a low-end tier composed of low-cost JBOD for storing temporary files. Storage pools enable another major SAN FS feature: policy-based file space provisioning. Working from a central SAN FS administrative console, administrators can set policies across all application servers in the SAN FS environment. Those policies can be keyed to "a variety of variables," says IBM's Hillsberg, including file extension, application server, user name and access time. A nice extension of policy-based file management is the ability to establish soft or hard quotas across your storage pools, he says.
- File-based snapshot. Then, there's the SAN FS feature called File-based FlashCopy Image, a snapshot implementation. But unlike existing snapshot products on the market, which are block-based, FlashCopy Image doesn't just snap an entire volume, but can snap all the way down to a specific file or directory. "This brings the concept of point-in-time copy to a much more granular level," says Hillsberg, with numerous implications for data protection and backup.
- Data migration. SAN FS introduces the notion of Automated Volume Drain for Volume Migration. The translation: easier data migration. When it comes time to migrate data to a different storage platform, an administrator can specify which volumes to remove from a pool. SAN FS performs the data movement and keeps track of where a file resides at any given time. In other words, an application server requesting a file can still retrieve the data, even if it is in the process of being migrated.
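To make the storage pool and policy idea from the list above concrete, here is a rough Python sketch of rule-based placement with a hard quota. It is an illustration under stated assumptions, not SAN FS' actual policy language: the `Rule`, `Pool` and `place` names, the first-match-wins ordering and the pool names are all invented for the example, and soft quotas (which would warn rather than refuse) are left out.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List


@dataclass
class FileInfo:
    path: str
    user: str
    app_server: str
    size_bytes: int


@dataclass
class Pool:
    name: str
    hard_quota_bytes: int
    used_bytes: int = 0


@dataclass
class Rule:
    description: str
    matches: Callable[[FileInfo], bool]  # predicate on the file's attributes
    pool: str                            # pool chosen when the rule matches


def place(file: FileInfo, rules: List[Rule], pools: Dict[str, Pool],
          default_pool: str) -> str:
    """Pick a pool for a new file: first matching rule wins, else the default."""
    target = next((r.pool for r in rules if r.matches(file)), default_pool)
    pool = pools[target]
    # Hard quota: refuse the allocation outright once the pool would overflow.
    if pool.used_bytes + file.size_bytes > pool.hard_quota_bytes:
        raise RuntimeError(f"hard quota exceeded on pool {target!r}")
    pool.used_bytes += file.size_bytes
    return target


# Hypothetical three-tier setup mirroring the example in the text.
POOLS = {
    "mirrored": Pool("mirrored", hard_quota_bytes=10**12),
    "midtier": Pool("midtier", hard_quota_bytes=10**12),
    "jbod": Pool("jbod", hard_quota_bytes=10**11),
}
RULES = [
    Rule("temporary files go to cheap JBOD",
         lambda f: f.path.endswith(".tmp"), "jbod"),
    Rule("files from the OLTP server get mirrored storage",
         lambda f: f.app_server == "oltp1", "mirrored"),
]

if __name__ == "__main__":
    new_file = FileInfo("/scratch/sort.tmp", "bob", "web1", 4096)
    print(place(new_file, RULES, POOLS, default_pool="midtier"))  # jbod
```

Because the rules are evaluated centrally, changing where a class of files lands means editing one rule rather than reconfiguring every application server.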
That's in contrast to how clustered file systems have fared thus far. To date, most users of these file systems have been heavy consumers of storage working in, say, the oil and gas industry, or in video production. What these industries have in common is their need to handle extremely large numbers of large files, as well as an inherent need to share them between users and application servers.
It's not that IBM will ignore the core technical and creative market--initial customer references include Johns Hopkins University and CERN's particle physics research lab--but IBM's Truskowski says SAN FS was also designed to provide "good performance for business applications."
Targeting business users
Thus far, business applications, such as databases, have not mixed very successfully with shared file systems. That's because in business--where files are not nearly as large as in the aforementioned technical and creative markets--the latency introduced by the distributed lock manager overrides the performance of the SAN and the convenience of the global namespace.
Truskowski admits that the SAN FS metadata server introduces latency--"that's a fact of life," he says. But according to former IBM chief scientist and SAN FS designer Randal Burns, now assistant professor at Johns Hopkins University, the way IBM architected its out-of-band clustered metadata server minimizes the latency issues that have plagued other clustered file system implementations.
Another problem of clustered file systems that IBM claims to have licked is the number of clients SAN FS can support. Professor Burns says that back in 1997, when he and other Storage Tank designers stood at the whiteboard, they originally envisioned a system that would support 10,000 clients, 1,000 servers and one petabyte (PB) of data. Contrast that to SGI's CXFS, which today limits the number of clients participating in its file system environment to 64.
Certainly, CERN expects to connect more than a handful of nodes with SAN FS. The particle physics research facility is exploring SAN FS to manage the data generated by its Large Hadron Collider (LHC), which is scheduled to be turned on in 2007. The collider, which will run for at least a decade, is expected to generate a jaw-dropping 10PB of data per year, largely in the form of image files approximately 1MB in size. These files document CERN researchers' attempts to recreate, on a small scale, what the universe may have looked like when it was first created. They will reside around the world on machines in the CERN grid, where researchers will be able to sift through them "looking for odd events," says Francois Grey, CERN openlab development officer. "People talk about finding a needle in a haystack," he says. "This will be more like finding a needle in 10 million haystacks."
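A quick back-of-the-envelope calculation shows what those figures imply for a file system. The sketch below assumes decimal units (1PB = 10^15 bytes, 1MB = 10^6 bytes), files of exactly 1MB and a run of exactly 10 years; the article gives only approximate values, so the results are order-of-magnitude estimates.

```python
PB = 10**15  # bytes, decimal petabyte (assumption)
MB = 10**6   # bytes, decimal megabyte (assumption)

data_per_year = 10 * PB  # "10PB of data per year"
file_size = 1 * MB       # "approximately 1MB in size"
years = 10               # "at least a decade"

files_per_year = data_per_year // file_size
total_files = files_per_year * years
total_data_pb = data_per_year * years // PB

print(f"{files_per_year:,} files per year")     # 10,000,000,000 files per year
print(f"{total_files:,} files over the run")    # 100,000,000,000 files over the run
print(f"{total_data_pb} PB over the run")       # 100 PB over the run
```

On these assumptions the file system would have to track on the order of 10 billion new files a year, which is as much a metadata problem as a raw capacity problem.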
According to Grey, CERN openlab is always researching technologies that might be useful in its lab, including in the area of data management, and deems IBM's SAN FS as "the most promising" technology it has seen so far.
It stands to reason that if SAN FS proves good enough for CERN to manage hundreds of petabytes of files, it should be more than adequate for managing the files of your average enterprise, most of which today turn to Network Appliance Inc. (NetApp) and EMC for large-scale file storage needs.
Adding cost and complexity
But there are people out there who contend that SAN FS--and distributed file systems in general--add unnecessary cost and complexity to the business of storing and serving files. Whereas NAS systems are renowned for their reliability and ease of installation, distributed file systems are perceived as hard to implement, hard to keep running and--for all but the biggest companies--hard to pay for.
Randy Kerns, a senior analyst at the Evaluator Group, says one company he works with considered deploying a SAN file system, but dropped the project. The company had 80 or so servers that it was hoping to hook into the system. They calculated that after scheduling the necessary downtime to install the client-side code, installation alone would take about a year.
To be fair, one of SAN FS' central design points is its ability to coexist on an application server with another file system. IBM also provides data migration tools that can move data out of one file system and into SAN FS, on a per application basis, says Truskowski. "It's not an all-or-nothing proposition--you don't have to commit to do everything at once," he says, adding: "that wouldn't work for our enterprise customers."
But once the clustered file system is installed, are your troubles over? Probably not, at least in SAN FS' initial release. SAN file systems have the reputation, deserved or otherwise, of being finicky. "I won't kid you, this stuff is hard to do," says Paul Rutherford, vice president of technology at ADIC. "Especially when you start to add heterogeneous clients."
At SGI--which recently added support for clients other than its own IRIX--Gabriel Broner, senior vice president and general manager of the storage and software group, admits that "we had some rough times" ironing out compatibility issues, although he is now confident of CXFS' stability in a heterogeneous environment.
But Steve Kenniston, a technology analyst from the Enterprise Storage Group, argues that IBM is fully aware of these issues, and as "a trusted brand," is "at more of a risk if it doesn't work than if it were just a startup." In other words, IBM has a big incentive to ensure that everything works as advertised. And overall, Kenniston says, he was "pretty impressed" with the way IBM "took a look at all the major issues" including file-locking, multipathing, cache and timeouts.
And no one can accuse IBM of biting off more than it can chew in its first release. With release 1.0 limiting its support to IBM and Windows environments, initial SAN FS solutions will most likely be sold largely as bundled solutions, probably as part of a push to get customers to consolidate on Shark, Kenniston suggests. And make no mistake--this stuff won't come cheap, with many installations coming in at "a million dollars and north of that," says Kenniston.
Assume SAN FS installs without a hitch and runs perfectly. The question then becomes: Should you consider SAN FS, especially if you're not in the "lunatic market" as ADIC's Rutherford affectionately terms the technical computing markets that have traditionally deployed distributed file systems?
The one problem that SAN FS--like all distributed file systems--really does fix is the eternal "I love my first filer, I hate my tenth" problem. That cliche, usually applied to Network Appliance filers, arose because contemporary 32-bit file systems (and the NAS systems that are built on top of them) tend to top out around 2TB, explains IBM's Truskowski. And while that may have been a lot of capacity five years ago, these days, it's a drop in the bucket.
Sixty-four-bit file system technology, combined with a global namespace, can free users from these limitations. SGI's CXFS, for example, has a theoretical capacity limit of 18 exabytes--that's 18 million terabytes--although realistically, CXFS customers top out at capacities of hundreds of terabytes, says SGI's Broner.
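The 2TB and 18-exabyte figures fall out of simple arithmetic on address width. The sketch below is one common way to arrive at them, not a statement of how any particular product is built: it assumes that a 32-bit file system addresses 512-byte sectors with a 32-bit block number, and that a 64-bit file system uses 64-bit byte offsets.

```python
SECTOR_BYTES = 512  # assumption: 32-bit block numbers address 512-byte sectors

limit_32bit = 2**32 * SECTOR_BYTES  # bytes addressable with 32-bit block numbers
limit_64bit = 2**64                 # bytes addressable with 64-bit byte offsets

TIB = 2**40   # binary terabyte
TB = 10**12   # decimal terabyte
EB = 10**18   # decimal exabyte

print(limit_32bit // TIB, "TiB")                        # 2 TiB
print(round(limit_64bit / EB, 1), "EB")                 # 18.4 EB
print(round(limit_64bit / TB / 10**6, 1), "million TB") # 18.4 million TB
```

The jump from 32 to 64 bits multiplies the addressable space by roughly four billion, which is why the practical ceiling shifts from the file system format to hardware, budget and manageability.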
Is NAS a better alternative?
In the world of storage, as elsewhere, there are many different ways to skin a cat. And if it's file sharing that's your trouble, there are a lot of people out there who will argue that NAS is still your best bet.
That may be especially true these days, with several new technologies on tap that address the limitations of traditional NAS systems. Startups such as NuView Inc., Houston, TX, and Z-Force, Santa Clara, CA, for example, both offer software that can aggregate a NAS environment into one coherent view. And then there are NAS startups such as Isilon Systems Inc., Seattle, WA, and Spinnaker Networks, Philadelphia, PA, which run their proprietary NAS boxes on top of 64-bit clustered file systems. In Spinnaker's case, users can cluster 512 of its NAS boxes together in a single 11PB file system.
IBM's Truskowski, however, is dismissive of NAS for demanding environments. "The performance is lousy," he says. Assuming Gigabit Ethernet as your connection, you're moving data at half the speed of today's 2Gb/s FC networks. "There's definitely a place for NAS," he says, "but trying to solve these sorts of problems with it is like sending out a corporal to do a general's job."
But for some industry observers, that's all bluff and bluster designed to support a technology that has a questionable role to play in the average enterprise data center. "There's no question that if Almaden [the IBM research lab where Storage Tank was developed] had come up with the idea for Storage Tank today rather than five years ago, it never would have been built," says one industry observer who asked not to be named. But as it stands, "there are people whose entire careers depend on it," he says. "They're going to shine this pig up and bring it out to the party." For this source, who is a veteran of the distributed file system world, the cost and complexity that SAN FS brings to an environment far outweigh its purported benefits.
ESG's Kenniston, meanwhile, has a far rosier view of SAN FS' future--but that future is much farther out than IBM might have you believe. "If you look at the future, it's all about Linux-racked servers, morphed with some sort of [Veritas blade management software] Jareva, running Oracle 9i RAC, on top of a clustered file system. That's the future." But he adds, "That's another two to four years away."
When it comes to blade computing, Joaquin Ruiz, vice president of marketing at Sistina Software Inc., Minneapolis, MN, agrees with Kenniston that clustered file systems are critical to the evolution of blade computing. "In the '90s, the main reason for using clustered file systems was high availability, not performance," he says. But with the advent of blades, "you no longer have just one or two monolithic servers connected in the SAN--you have dozens," and the clustered file system can act as "the control point" that lets you manage it all.
However, Sistina, like its competitor PolyServe Inc. in Beaverton, OR, eschews SAN FS' asymmetric out-of-band metadata architecture, instead choosing to distribute metadata symmetrically on all the nodes in the cluster. That approach, says Michael Callahan, PolyServe founder and CTO, is similar to the one used in "classical VMS clusters," from now-defunct Digital Equipment Corporation, and is much better suited to the high-performance transactional applications that many companies envision running on their blade servers. If other clustered file systems that rely on an out-of-band asymmetrical architecture are any indication, he says, it is questionable whether IBM's SAN FS will be able to deliver the performance and high availability required by blade applications such as parallel databases.
iSCSI and Ethernet are two other factors that could figure into when and how clustered file systems get deployed. Says Paul Ross, director of storage network marketing at EMC: "Go out three years--Ethernet's at 10 Gig, and more people are running iSCSI than not. At that point, SAN file systems start to make a lot of sense because you don't have to deploy a separate infrastructure to run them. The SAN file system becomes a LAN file system."
But what about today? Is SAN FS the answer to storage managers' file storage needs? In large environments, quite possibly, says ESG's Kenniston. But he cautions, "In the IT world, there are the haves and the have-nots. This is for the haves."