IBM unveils Storage Tank

With SANs growing, should you be thinking of how to leverage them for files? SAN FS, IBM's new clustered file system, is one option and we examine it in detail. Is enterprise IT ready for SAN FS and will IBM be ready to meet the needs of enterprises?


For all the sweat, dollars and frustration that storage professionals have invested in implementing storage area networks (SANs), it's still not clear that SANs are delivering the full payback for these efforts.

[Figure: SAN File System architecture. Clients participating in SAN File System retrieve data in a two-part process: clients equipped with the virtual file system request data from the metadata server, which returns the file's physical location and issues locks; with that information, clients retrieve the data directly from SAN-attached disks over the Fibre Channel SAN.]

Sure, compared to direct-attached storage (DAS), SANs are easy to scale, and they deliver consolidated management and backup without sacrificing performance. But at the same time, says Brian Truskowski, IBM general manager for storage software, "whenever I talk to customers, they report that they're not really getting the maximum benefit" from their SAN.

Helping storage administrators get more from their SANs was the stated intent of IBM Corp. when it announced its Storage Tank product last month, after five years of research and development. When it's released in December, Storage Tank--now sporting the more corporate-sounding moniker SAN File System (SAN FS)--will provide users with features such as a global namespace, storage pooling and policy-based management, file-based snapshot and easier data migration. But while all these features sound great, the real question is: Is enterprise IT ready for SAN FS--and is IBM ready to adapt SAN FS to meet the needs of enterprise IT?

From an architectural standpoint, SAN FS is what is sometimes referred to as a clustered file system, distributed file system, or shared SAN file system. It's much like a dozen or so existing products on the market, including Advanced Digital Information Corp.'s StorNext (formerly CentraVision, or CVFS), EMC's HighRoad, Silicon Graphics Inc.'s CXFS, Sistina's GFS, PolyServe's Matrix Server, Sun Microsystems' QFS, Veritas' Cluster File System and even IBM's own SANergy and GPFS.

The common thread running through SAN FS and many of the other clustered file systems on the market today is the basic architecture: an out-of-band metadata server and distributed lock manager, working in concert with installable file system code that resides on participating client machines. In the case of SAN FS, file requests from an application server travel first to the metadata server, which looks up file locations in its mapping table, issues locks and then sends that information back to the client. Armed with the block location and lock, the application server then retrieves the file from the storage directly over a Fibre Channel (FC) SAN.
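
To make the division of labor concrete, here is a minimal sketch of that two-part read path. The Python below is purely illustrative: MetadataServer, FibreChannelDisk and client_read are names invented for this article, not IBM's actual client API.

```python
# Illustrative model of an out-of-band clustered file system read.
# All names are hypothetical; this is not IBM's SAN FS client code.

class MetadataServer:
    """Control path: owns the file-to-block mapping and the locks."""
    def __init__(self):
        self.mapping = {"/shared/report.dat": [(0, 4096), (8192, 4096)]}
        self.locks = set()

    def open_file(self, path):
        # Look up the file's extents and issue a lock to the client.
        if path not in self.mapping:
            raise FileNotFoundError(path)
        self.locks.add(path)
        return self.mapping[path]      # list of (offset, length) extents

class FibreChannelDisk:
    """Data path: stand-in for a SAN-attached LUN on the FC fabric."""
    def read(self, offset, length):
        return b"\x00" * length        # placeholder for real block I/O

def client_read(path, mds, disk):
    # Step 1: ask the metadata server for block locations and a lock.
    extents = mds.open_file(path)
    # Step 2: pull the blocks straight off the SAN; the metadata
    # server never touches the bulk data.
    return b"".join(disk.read(off, ln) for off, ln in extents)

data = client_read("/shared/report.dat", MetadataServer(), FibreChannelDisk())
print(len(data), "bytes read over the (simulated) data path")
```

The design point to notice is that the metadata server sits only on the control path; once the client holds the extents and the lock, file data flows disk-to-client over Fibre Channel.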

SAN file system at a glance
Who: IBM Corp.
What: SAN File System, previously known as Storage Tank.
When: Conceived in 1998 at IBM's Almaden Research Lab; announced in October 2003; generally available in December 2003.
Where: Initially in IBM environments, including AIX, AIX HACMP and Windows 2000 clients, on IBM Enterprise Storage Server (Shark) storage and behind IBM's SAN Volume Controller.
Why: To bring SAN users the benefits of a global namespace, storage pools, file-based snapshot and improved data migration.
How much: Starting list price of $90,000 includes SAN File System software; two IBM eServer xSeries 345 Metadata Server engines; one eServer xSeries 305 server for the master console; IBM Director and Tivoli BonusPack for SAN management under 64 ports; and one rack-mounted display, keyboard and mouse combination with required power cords. Additional Metadata Server engines are priced at $16,000.

IBM's SAN FS roadmap
IBM's grand plan is to make SAN FS a heterogeneous distributed file system that supports multiple Unix and Windows clients and storage arrays. In this first release, however, operating system support includes IBM's own AIX, AIX HACMP, Windows 2000 and Windows 2000 Advanced Server operating systems. As far as storage arrays are concerned, SAN FS supports IBM's Enterprise Storage Server (Shark), as well as storage virtualized behind the SAN Volume Controller (SVC), the block-based virtualization appliance that IBM announced last spring.

In 2004, IBM will extend operating system support to Linux, Solaris, Solaris clusters and Windows 2000 clusters. Last spring, IBM also released source code for SAN FS so that third parties could write agents for additional operating systems. Next year, storage array support will move beyond IBM storage with the help of SVC, which was recently enhanced to support arrays from Hitachi Data Systems (Thunder 9200) and Hewlett-Packard (MA 8000, EMA 12000 and EMA 16000), according to Bruce Hillsberg, IBM director of storage software strategy.

SAN FS is built upon four major storage technologies. They are:

  • Global namespace. All application servers running SAN FS share the same files, which are accessed and named in a single, common way. The main benefit of a global namespace is that the environment typically requires less capacity because, as Truskowski puts it: "The data is the data." In other words, multiple applications can share a single instance of a file rather than relying on copies. That in turn can reduce data movement, simplify application workflow and improve performance.
  • Storage pools. SAN FS also incorporates the notion of storage pools, or the ability to create groups, or classes, of storage. For example, under SAN FS, an administrator can specify a high-end class of storage that provides mirrored failover, a midtier class for reference data, and a low-end tier comprised of low-cost JBOD for storing temporary files. Storage pools also enable another major SAN FS feature: policy-based file space provisioning. Working from a central SAN FS administrative console, administrators can set policies across all application servers in the SAN FS environment according to "a variety of variables," says IBM's Hillsberg, including file extensions, application server, user name and access time. A nice extension of policy-based file management is the ability to establish soft or hard quotas across your storage pools, he says. (A rough sketch of how such placement policies might look follows this list.)
  • File-based snapshot. SAN FS also includes File-based FlashCopy Image, a snapshot implementation. But unlike existing snapshot products on the market, which are block-based, FlashCopy Image doesn't just snap an entire volume; it can snap all the way down to a specific file or directory. "This brings the concept of point-in-time copy to a much more granular level," says Hillsberg, with numerous implications for data protection and backup.
  • Data migration. SAN FS introduces the notion of Automated Volume Drain for Volume Migration. The translation: easier data migration. When it comes time to migrate data to a different storage platform, an administrator can specify which volumes to remove from a pool. SAN FS performs the data movement and keeps track of where a file resides at any given time. In other words, an application server requesting a file can still retrieve the data, even if it is in the process of being migrated.
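
As promised above, here is a minimal sketch of how policy-based placement across storage pools might work. It's a toy model in Python: the pool names, rule syntax and file attributes are invented for illustration and are not SAN FS' actual policy language.

```python
# Toy model of policy-based file placement across storage pools.
# Pool names, rules and attributes are invented for illustration;
# this is not SAN FS' actual policy syntax.

POOLS = {"gold": [], "reference": [], "scratch": []}  # mirrored / midtier / JBOD

POLICIES = [
    # (predicate over file attributes, destination pool)
    (lambda f: f["name"].endswith(".tmp"),   "scratch"),    # temp files to JBOD
    (lambda f: f["days_since_access"] > 90,  "reference"),  # cold data to midtier
    (lambda f: f["owner"] == "db2",          "gold"),       # database files mirrored
]

def place(file_attrs, default="gold"):
    """Return the pool a new file lands in, per the first matching rule."""
    for predicate, pool in POLICIES:
        if predicate(file_attrs):
            return pool
    return default

new_file = {"name": "trace.tmp", "owner": "web01", "days_since_access": 0}
POOLS[place(new_file)].append(new_file["name"])  # lands in the "scratch" pool
```

The useful property this models is that the placement rules live in one place--the administrative console--rather than being re-implemented on every application server.
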
It's Big Blue's hope that SAN FS will attract a significant number of users to its clustered file system, both in the technical and high-performance computing markets, as well as in commercial markets running transactional applications.

That's in contrast to how clustered file systems have fared thus far. To date, most users of these file systems have been heavy consumers of storage working in, say, the oil and gas industry, or in video production. What these industries have in common is their need to handle extremely large numbers of large files, as well as an inherent need to share them between users and application servers.

It's not that IBM will ignore the core technical and creative market--initial customer references include Johns Hopkins University and CERN's particle physics research lab--but IBM's Truskowski says SAN FS was also designed to provide "good performance for business applications."

What's wrong with file storage?
1. It's difficult to implement consistent policies because each application server has a separate file system.
2. Expanding or contracting a file system typically results in application downtime.
3. Copying and moving data between servers to satisfy a workflow is a complex, error-prone process.
4. Separate file systems for each application server result in poor capacity utilization.

Targeting business users
Thus far, business applications, such as databases, have not mixed very successfully with shared file systems. That's because in business--where files are not nearly as large as in the aforementioned technical and creative markets--the latency introduced by the distributed lock manager outweighs the performance of the SAN and the convenience of the global namespace.

Truskowski admits that SAN FS' metadata server introduces latency--"that's a fact of life," he says. But according to former IBM chief scientist and SAN FS designer Randal Burns, now an assistant professor at Johns Hopkins University, the way IBM architected its out-of-band clustered metadata server minimizes the latency issues that have plagued other clustered file system implementations.

Another problem of clustered file systems that IBM claims to have licked is the number of clients SAN FS can support. Professor Burns says that back in 1997, when he and other Storage Tank designers stood at the whiteboard, they originally envisioned a system that would support 10,000 clients, 1,000 servers and one petabyte (PB) of data. Contrast that to SGI's CXFS, which today limits the number of clients participating in its file system environment to 64.

Certainly, CERN expects to connect more than a handful of nodes with SAN FS. The particle physics research facility is exploring SAN FS to manage the data generated by its Large Hadron Collider (LHC), which is scheduled to be turned on in 2007. The collider, which will run for at least a decade, is expected to generate a jaw-dropping 10PB of data per year, largely in the form of image files approximately 1MB in size. These files document CERN researchers' attempts to recreate, on a small scale, what the universe may have looked like when it was first created. They will reside around the world on machines in the CERN grid, where researchers will be able to sift through them "looking for odd events," says Francois Grey, CERN openlab development officer. "People talk about finding a needle in a haystack," he says. "This will be more like finding a needle in 10 million haystacks."
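
For a sense of scale, that data rate implies roughly ten billion files a year. A quick back-of-the-envelope check, assuming the 10PB-per-year and roughly 1MB-per-file figures above:

```python
# Rough scale of the LHC data set described above.
bytes_per_year = 10 * 10**15          # 10PB per year (decimal petabytes)
bytes_per_file = 1 * 10**6            # ~1MB per image file
print(f"{bytes_per_year / bytes_per_file:,.0f} files per year")  # 10,000,000,000
```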

According to Grey, CERN openlab is always researching technologies that might be useful in its lab, including in the area of data management, and deems IBM's SAN FS as "the most promising" technology it has seen so far.

It stands to reason that if SAN FS proves good enough for CERN to manage hundreds of petabytes of files, it should be more than adequate for managing the files of the average enterprise, most of which today turn to Network Appliance Inc. (NetApp) and EMC for their large-scale file storage needs.

The road to better backup
Could a clustered file system help improve a company's backup? To hear vendors tell it, yes, absolutely. That said, not a single vendor could provide a customer reference to support that idea.

In theory, a clustered file system enables server-less backup. According to Paul Ross, EMC's director of storage network marketing, a product like EMC's HighRoad/Celerra combo does this by letting the backup server read data directly off disk without impacting the application server.

Ross says that server-less backup is one of the main reasons customers deploy its HighRoad software outside of the high-performance computing and creative markets. "It's one [application] that we hadn't really thought of, that customers figured out on their own," Ross adds.

ADIC's Paul Rutherford, vice president of technology, says some of his company's customers in the financial industry are using StorNext file system to support centralized backup in a geographically distributed environment.

But according to W. Curtis Preston, founder of The Storage Group and author of Unix Backup and Recovery, if it's happening, he's not aware of it. "Very few people are using clustered/SAN file systems period, let alone to assist with server-free backup. I suppose it could be done, but I don't know any backup software designed to take advantage of it."

The one exception may be Tivoli Storage Manager, IBM's backup suite, which, according to IBM's Truskowski, will support SAN FS.

Adding cost and complexity
But there are people out there who contend that SAN FS--and distributed file systems in general--add unnecessary cost and complexity to the business of storing and serving files. Whereas NAS systems are renowned for their reliability and ease of installation, distributed file systems are perceived as hard to implement, hard to keep running and--for all but the biggest companies--hard to pay for.

Randy Kerns, a senior analyst at the Evaluator Group, says one company he works with considered deploying a SAN file system, but dropped the project. The company had 80 or so servers that it was hoping to hook into the system. It calculated that, after scheduling the necessary downtime to install the client-side code, installation alone would take about a year.

To be fair, one of SAN FS' central design points is its ability to coexist on an application server with another file system. IBM also provides data migration tools that can move data out of one file system and into SAN FS, on a per application basis, says Truskowski. "It's not an all-or-nothing proposition--you don't have to commit to do everything at once," he says, adding: "that wouldn't work for our enterprise customers."

But once the clustered file system is installed, are your troubles over? Probably not, at least in SAN FS' initial release. SAN file systems have the reputation, deserved or otherwise, of being finicky. "I won't kid you, this stuff is hard to do," says ADIC's Rutherford. "Especially when you start to add heterogeneous clients."

At SGI--which recently added support for non-IRIX clients--Gabriel Broner, senior vice president and general manager of the storage and software group, admits that "we had some rough times" ironing out compatibility issues, although he is now confident of CXFS' stability in a heterogeneous environment.

But Steve Kenniston, a technology analyst from the Enterprise Storage Group, argues that IBM is fully aware of these issues, and as "a trusted brand," is "at more of a risk if it doesn't work than if it were just a startup." In other words, IBM has a big incentive to ensure that everything works as advertised. And overall, Kenniston says, he was "pretty impressed" with the way IBM "took a look at all the major issues" including file-locking, multipathing, cache and timeouts.

And no one can accuse IBM of biting off more than it can chew in its first release. With release 1.0 limiting its support to IBM and Windows environments, initial SAN FS solutions will most likely be sold largely as bundled solutions, probably as part of a push to get customers to consolidate on Shark, Kenniston suggests. And make no mistake--this stuff won't come cheap, with many installations coming in at "a million dollars and north of that," says Kenniston.

Assume SAN FS installs without a hitch and runs perfectly. The question then becomes: Should you consider SAN FS, especially if you're not in the "lunatic market" as ADIC's Rutherford affectionately terms the technical computing markets that have traditionally deployed distributed file systems?

The one problem that SAN FS--and all distributed file systems--really do fix is the eternal "I love my first filer, I hate my tenth" problem. That cliche, usually applied to Network Appliance filers, arose because contemporary 32-bit file systems (and the NAS systems that are built on top of them) tend to top out around 2TB, explains IBM's Truskowski. And while that may have been a lot of capacity five years ago, these days, it's a drop in the bucket.

Sixty-four-bit file system technology, combined with a global namespace, can free users from these limitations. SGI's CXFS, for example, has a theoretical capacity limit of 18 exabytes--that's 18 million terabytes--although realistically, CXFS customers top out at capacities of hundreds of terabytes, says SGI's Broner.
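
That 18-exabyte ceiling is simply the arithmetic of a 64-bit address space. As a quick back-of-the-envelope check (using decimal units; as Broner notes, real deployments sit far below the theoretical limit):

```python
# Quick check on the 64-bit capacity figure quoted above.
max_bytes = 2**64                                # a full 64-bit address space
print(max_bytes / 10**18, "EB")                  # ~18.4 exabytes (decimal)
print(max_bytes / 10**12 / 10**6, "million TB")  # ~18.4 million terabytes
```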

Another way to manage a lot of files
With the rights to license over 70 million images, and more coming every day, Corbis in Seattle, WA, is the poster child for users with large shared file requirements.

And while the company did consider clustered file system technology, reports Alex Taylor, manager of systems engineering, it ultimately decided against the approach, because while "it would have helped with the administration, it wouldn't have helped with the engineering." He adds: "There's a lot of thinking that goes into a SAN."

After grappling with about 25TB of SAN storage, this winter Corbis decided to change course and purchased 10TB of storage from NAS startup Isilon Systems, onto which it gradually moved its image data. The group subsequently purchased another 13TB, for a total of 23TB.

For administrators, the change has been dramatic. Whereas it used to take three to four weeks to provision new SAN storage, provisioning Isilon storage takes only two days--from the moment Taylor places a call to Isilon to the moment the storage is available.

As a result, Corbis has been able to reclaim its SAN "for what it was designed to do: SQL databases."

Is NAS a better alternative?
In the world of storage, as elsewhere, there are many ways to skin a cat. And if it's file sharing that's your trouble, there are a lot of people out there who will argue that NAS is still your best bet.

That may be especially true these days, with several new technologies on tap that address the limitations of traditional NAS systems. Startups NuView Inc., Houston, TX, and Z-Force, Santa Clara, CA, for example, both offer software that can aggregate a NAS environment into one coherent view. And then there are NAS startups such as Isilon Systems Inc., Seattle, WA, and Spinnaker Networks, Pittsburgh, PA, which run their proprietary NAS boxes on top of 64-bit clustered file systems. In Spinnaker's case, users can cluster 512 of its NAS boxes together in a single 11PB file system.

IBM's Truskowski, however, is dismissive of NAS for demanding environments. "The performance is lousy," he says. Assuming Gigabit Ethernet as your connection, you're moving data at half the speed of today's 2Gb/s FC networks. "There's definitely a place for NAS," he says, "but trying to solve these sorts of problems with it is like sending out a corporal to do a general's job."

But for some industry observers, that's all bluff and bluster designed to support a technology that has a questionable role to play in the average enterprise data center. "There's no question that if Almaden [the IBM research lab where Storage Tank was developed] had come up with the idea for Storage Tank today rather than five years ago, it never would have been built," says one industry observer who asked not to be named. But as it stands, "there are people whose entire careers depend on it," he says. "They're going to shine this pig up and bring it out to the party." For this source, who is a veteran of the distributed file system world, the cost and complexity that SAN FS brings to an environment far outweigh its purported benefits.

ESG's Kenniston, meanwhile, has a far rosier view of SAN FS' future--but that future is much farther out than IBM may have you believe. "If you look at the future, it's all about Linux-racked servers, morphed with some sort of Jareva [the blade management software acquired by Veritas], running Oracle 9i RAC on top of a clustered file system. That's the future." But he adds, "That's another two to four years away."

Joaquin Ruiz, vice president of marketing at Sistina Software Inc., Minneapolis, MN, agrees with Kenniston that clustered file systems are critical to the evolution of blade computing. "In the '90s, the main reason for using clustered file systems was high availability, not performance," he says. But with the advent of blades, "you no longer have just one or two monolithic servers connected in the SAN--you have dozens," and the clustered file system can act as "the control point" that lets you manage it all.

However, Sistina, like its competitor PolyServe Inc. in Beaverton, OR, eschews SAN FS' asymmetric out-of-band metadata architecture, instead choosing to distribute metadata symmetrically across all the nodes in the cluster. That approach, says Michael Callahan, PolyServe founder and CTO, is similar to the one used in "classical VMS clusters" from the now-defunct Digital Equipment Corporation, and is much better suited to the high-performance transactional applications that many companies envision running on their blade servers. If other clustered file systems that rely on an out-of-band asymmetric architecture are any indication, he says, it is questionable whether IBM's SAN FS will be able to meet the performance and high-availability requirements of blade applications such as parallel databases.

iSCSI and Ethernet are two other factors that could figure into when and how clustered file systems get deployed. Says EMC's Ross: "Go out three years--Ethernet's at 10 Gig, and more people are running iSCSI than not. At that point, SAN file systems start to make a lot of sense because you don't have to deploy a separate infrastructure to run them. The SAN file system becomes a LAN file system."

But what about today? Is SAN FS the answer to storage managers' file storage needs? In large environments, quite possibly, says ESG's Kenniston. But he cautions, "In the IT world, there are the haves and the have-nots. This is for the haves."

This was first published in November 2003
