Five rules to help you build a storage strategy


This article can also be found in the Premium Editorial Download "Storage magazine: The benefits of storage on demand."


Where does disk performance live?


There are quite a few metrics for disk drive performance, and just as many potential bottlenecks. Remember, though, no single metric tells the whole story, and many quoted metrics can be downright deceptive.

Spindle speed: The most often mentioned measure of performance among storage architects is spindle speed. Most folks realize that 15,000 rpm is going to be better than 10,000, and this is the single greatest factor in drive throughput. The downside is that it's hard to pack the same high storage density onto a drive platter spinning at a high speed.
Seek time: The time a drive takes to locate a given piece of information is its seek time. Seek time is related to spindle speed, head actuator speed and the performance of the controller on the drive. A drive with low seek time will generally have lower latency between receiving and responding to a request, but the metrics can be misleading.
Transfer rate: The sustained rate at which data can be read from, or written to, a disk is critical for some applications. This is a measured metric rather than a theoretical one, but it depends on many factors, from spindle speed to the performance of the connected computer.
Interface speed: Most enterprise drives are equipped with Fibre Channel connectors capable of delivering 200MB/s, but no drive can sustain this level of performance. Most drives are equipped with an interface two or three times faster than their native transfer rate.
Cache size: A large on-disk cache of RAM can greatly improve performance for some workloads. Most drives today have 8MB of cache, which seemed like a lot on a 36GB drive, but is dwarfed by a 300GB unit!

4. Storage bandwidth doubles every 36 months

I/O is stuck at the same 25% annual growth as pre-MR (magnetoresistive head) disk. And most disks can't keep up with their rated speeds in sustained throughput tests.
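The 36-month figure in the heading and the 25% annual growth rate are two ways of stating roughly the same thing. Here is a minimal sketch of the compounding arithmetic; the function name `doubling_time_years` is an illustrative choice, not something from the article.

```python
import math


def doubling_time_years(annual_growth: float) -> float:
    """Years for a quantity to double at a compound annual growth rate.

    annual_growth is a fraction, so 0.25 means 25% per year.
    """
    return math.log(2) / math.log(1 + annual_growth)


years = doubling_time_years(0.25)
print(f"{years:.2f} years, or about {years * 12:.0f} months")
```

At 25% a year the doubling time comes out slightly over three years, which is close enough to the rule's 36 months for a rule of thumb.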

If we divide the size of an average disk by the theoretical speed of its interface, we come up with the amount of time it would take to read the entire contents of a disk. Let's call this flush time. This metric highlights a serious problem for enterprise storage. Whether you look at interface speeds or actual transfer rates (see "Where does disk performance live?"), disk performance is lagging behind capacity improvements.
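The flush time calculation is simple enough to sketch in a few lines. The version below assumes decimal units (1GB = 1,000MB) and uses the 200MB/s Fibre Channel interface speed and the drive sizes mentioned in this article; the function name `flush_time_seconds` is hypothetical.

```python
def flush_time_seconds(capacity_gb: float, rate_mb_per_s: float) -> float:
    """Time to read an entire disk: capacity divided by transfer rate.

    Assumes decimal units (1 GB = 1000 MB).
    """
    return capacity_gb * 1000 / rate_mb_per_s


INTERFACE_MB_PER_S = 200  # 2Gb Fibre Channel interface speed quoted in the article

for capacity_gb in (36, 73, 300):
    minutes = flush_time_seconds(capacity_gb, INTERFACE_MB_PER_S) / 60
    print(f"{capacity_gb:>3} GB at {INTERFACE_MB_PER_S} MB/s: {minutes:.1f} minutes")
```

Because the interface speed is held constant here, flush time grows in direct proportion to capacity. Substituting a measured sustained transfer rate for the interface speed gives a less flattering, and more realistic, number.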

Enterprise storage makers attempt to make up for the performance gap in two ways. First, they use RAID to spread data across many spindles, increasing the performance. Second, they refuse to use the latest and largest drives in performance-sensitive applications.

RAID is really just a bandage over the performance gap. Add up the space on all the disks and divide it by their total I/O capability, and the flush time remains the same as a single disk. The only way to make a really high-performance system is to use disks that can deliver the goods in terms of performance. Right now, that means using only 15,000 rpm and 73GB or smaller drives.
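To see why striping doesn't change the ratio, here is a rough sketch under idealized assumptions: identical disks, throughput that scales linearly with the number of spindles, and no parity or controller overhead. The 73GB capacity comes from the paragraph above; the 60MB/s per-disk rate is an assumed figure for illustration only.

```python
def array_flush_time_seconds(n_disks: int, capacity_gb: float,
                             rate_mb_per_s: float) -> float:
    """Flush time for an idealized array of identical disks.

    Assumes perfect linear scaling: total capacity and total throughput
    both grow with n_disks, so n_disks cancels out of the ratio.
    """
    total_capacity_mb = n_disks * capacity_gb * 1000  # decimal units
    total_rate_mb_per_s = n_disks * rate_mb_per_s
    return total_capacity_mb / total_rate_mb_per_s


# 73 GB drives at an assumed 60 MB/s sustained rate each
for n_disks in (1, 8, 64):
    minutes = array_flush_time_seconds(n_disks, 73, 60) / 60
    print(f"{n_disks:>2} disks: {minutes:.1f} minutes to flush")
```

Every array size reports the same flush time. Adding spindles raises total throughput, but it raises the amount of data behind that throughput by exactly the same factor.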

5. Enterprise drives are I/O bound, consumer drives aren't

In 1994, enterprise storage used a 20MB/s SCSI bus, while consumer PCs lagged with 16MB/s IDE interfaces. Enterprise drives now use quick 200MB/s (2Gb) Fibre Channel (FC) interfaces, while consumer PCs are just starting to adopt 150MB/s Serial ATA. So, although they use different interfaces, consumer drive interfaces have remained about 80% as fast as those found in the enterprise. Actual throughput speed has followed the same pattern.
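The "about 80%" figure can be checked directly from the interface speeds quoted above. This is a small sketch; the function name is illustrative, and the 2004 label for the "now" figures is taken from the article's publication date.

```python
def consumer_to_enterprise_ratio(consumer_mb_per_s: float,
                                 enterprise_mb_per_s: float) -> float:
    """Consumer interface speed as a fraction of enterprise interface speed."""
    return consumer_mb_per_s / enterprise_mb_per_s


# (consumer MB/s, enterprise MB/s), as quoted in the article
interfaces = {
    "1994 (IDE vs. SCSI)": (16, 20),
    "2004 (Serial ATA vs. 2Gb FC)": (150, 200),
}

for label, (consumer, enterprise) in interfaces.items():
    ratio = consumer_to_enterprise_ratio(consumer, enterprise)
    print(f"{label}: consumer interface is {ratio:.0%} as fast")
```

The two ratios work out to 80% and 75%, so the gap between the two classes of interface has stayed roughly constant over the decade.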

When you plot flush time vs. disk size over time, you see the problem for enterprise storage. In order to ensure adequate performance, enterprise storage users haven't adopted larger disks like the consumer market has.

Consumer disks aren't limited by I/O performance. My home PC takes more than two hours to flush the entire contents of its disk. By contrast, a typical 36GB 15,000 rpm FC drive in an enterprise array would be able to sustain 60MB/s or more, for a flush time of just 10 minutes.
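The 10-minute figure follows from the numbers in the paragraph above, assuming decimal units (1GB = 1,000MB). The sketch below reproduces it and compares it with the two-hour lower bound quoted for the home PC. The PC's disk size and transfer rate aren't given, so only its stated flush time is used, and the names are illustrative.

```python
def flush_time_minutes(capacity_gb: float, sustained_mb_per_s: float) -> float:
    """Minutes to read an entire disk at a sustained transfer rate.

    Assumes decimal units (1 GB = 1000 MB).
    """
    return capacity_gb * 1000 / sustained_mb_per_s / 60


# 36 GB, 15,000 rpm FC drive sustaining 60 MB/s, per the article
enterprise_minutes = flush_time_minutes(36, 60)

# The article gives only "more than two hours" for the home PC,
# so this is a lower bound rather than a measured value.
home_pc_minutes_at_least = 2 * 60

print(f"Enterprise drive: {enterprise_minutes:.0f} minutes")
print(f"Home PC: at least {home_pc_minutes_at_least} minutes, "
      f"or {home_pc_minutes_at_least / enterprise_minutes:.0f}x longer")
```

By this measure the consumer drive carries at least 12 times as much data per unit of throughput as the enterprise drive.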

Ominous implications
Consumers are buying larger drives with ever-poorer performance, and reaping the benefits of a Moore's Law-like exponential increase in capacity. However, the enterprise is much more sensitive to performance and is lagging behind in realizing the drop in per-megabyte disk costs.

What's the long-term implication of this finding?

Enterprise storage prices won't fall as fast as the vast improvements in storage density would indicate, unless performance is sacrificed. Within a year or two, we could even see the recent cost reductions for enterprise storage stop. It will be difficult to make ever-larger drives perform better, even if their spindle speeds are increased. And the small disks used by enterprise storage will become more expensive as their similarity to consumer disks disappears.

But if we leverage the performance of a large number of huge disks across a large number of users, we can have the benefits of density without sacrificing performance. That kind of consolidation is much more than most storage architectures are delivering today. We need a new kind of massively parallel storage system with thousands of storage nodes serving thousands of users. And hope that all of those users don't want to be served at the same time!

Aspects of this new architecture are materializing today. Systems such as 3PAR's InServ and Panasas' ActiveScale spread data across more spindles than most current RAID implementations. More and more systems are starting to leverage large consumer-grade disks and interfaces.

Does the "I" in RAID stand for independent or inexpensive? Current enterprise storage clearly isn't inexpensive. If these new storage architectures take over, it seems that both terms might finally be appropriate.

This was first published in June 2004
