Feature

The benefits of clustered storage


This article can also be found in the Premium Editorial Download "Storage magazine: iSCSI: Ready for prime time?."


What's next?
Industry observers predict that more vendors will develop cluster or grid-based storage systems similar to the Google File System (GFS) developed by the search giant for its own massive storage needs. As described in a 2003 research paper called "The Google File System" by Sanjay Ghemawat, Howard Gobioff and Shun-Tak Leung, GFS is a "scalable distributed file system for large distributed data-intensive applications. It provides fault tolerance while running on inexpensive commodity hardware, and it delivers high aggregate performance to a large number of clients."

Storage industry consultant Robin Harris praises GFS for its reliability, performance on large sequential reads, features such as automatic load balancing and storage pooling, and its low cost. But its shortcomings as a general-purpose storage platform include its "performance on small reads and writes, which it wasn't designed for and isn't good enough for general data center workloads," notes Harris.

Google has since built Bigtable, a distributed storage system for managing petabytes of structured data. It "provides very good performance for small reads and writes," says Jeff Dean, a Google Fellow in the Systems and Infrastructure Group.

Analysts and other observers differ about whether these mega-storage projects could serve as a foundation for commercial systems. If Google were starting today, "we'd probably still build our own because I'm not aware of any system that scales to the sizes that we need at reasonable price/performance ratios," says Dean. Building its own systems, he says, gives Google "more flexibility because we can control the underlying storage system that sits underneath our applications."


In January, IBM announced plans to acquire XIV, which claims its grid-based architecture creates an unlimited number of snapshots in a very short time by replicating data among Intel-based servers running a custom version of Linux and linked by redundant Gigabit Ethernet switches. Because each node has its own processors, memory and disk, according to the company, CPU power and the memory available for cache operations increase as storage capacity rises. IBM says that by distributing each logical volume across the grid as multiple 1MB stripes, the architecture provides consistent load balancing even as the size of volumes or drive types on the grid changes.
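
To make the striping idea concrete, here is a minimal Python sketch of how a logical volume might be cut into 1MB stripes and spread across grid nodes. Only the 1MB stripe size comes from the description above. The hash-based placement rule, the function names and the node names are assumptions made for illustration; neither IBM nor XIV describes the actual placement algorithm here.

```python
import hashlib
from collections import Counter

STRIPE_SIZE = 1024 * 1024  # 1MB stripes, the figure cited in the article


def stripe_count(volume_bytes):
    """Number of 1MB stripes needed to hold a volume, rounded up."""
    return (volume_bytes + STRIPE_SIZE - 1) // STRIPE_SIZE


def node_for_stripe(volume_id, stripe_index, nodes):
    """Pick a node for one stripe by hashing its identity.

    Hypothetical placement rule: hashing gives a pseudo-random but
    repeatable spread, so no single node holds a whole volume.
    """
    key = f"{volume_id}:{stripe_index}".encode()
    digest = hashlib.sha256(key).digest()
    return nodes[int.from_bytes(digest[:8], "big") % len(nodes)]


def placement(volume_id, volume_bytes, nodes):
    """Count how many stripes of a volume land on each node."""
    return Counter(
        node_for_stripe(volume_id, i, nodes)
        for i in range(stripe_count(volume_bytes))
    )


if __name__ == "__main__":
    grid = [f"node-{n}" for n in range(6)]      # hypothetical six-node grid
    ten_gb = 10 * 1024 * STRIPE_SIZE            # a 10GB logical volume
    for node, stripes in sorted(placement("vol-a", ten_gb, grid).items()):
        print(node, stripes)
```

Because every stripe is placed independently, each node ends up with roughly the same share of any volume regardless of the volume's size, which is the load-balancing property the vendor claims. A production system would also keep a second copy of each stripe on a different node and would move as few stripes as possible when nodes join or leave; this sketch does neither.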

The technology will be aimed at users running Web 2.0 applications and storing digital media. But speaking in a conference call sponsored by the Wikibon consulting community, storage consultant Josh Krischer pointed out that the system doesn't support mainframe connectivity and constitutes "another level of storage between the high end and the top of the midrange" in IBM's current storage offerings. Rather than being optimized for Web 2.0 storage, said Krischer, "this is general-purpose storage" that IBM will bring to the market at aggressive price points because of its use of industry-standard hardware and open-source software.

Architectures such as GFS "will be more of what you see in the future," says John Matze, one of the architects of the iSCSI protocol and VP of business development at IP SAN vendor Hifn Inc. As network bandwidth becomes less expensive and storage nodes become more intelligent, he predicts the rise of more cluster or grid-like storage environments in which individual nodes have the intelligence to recover from inevitable failures.

This was first published in April 2008
