This article can also be found in the Premium Editorial Download "Storage magazine: The benefits of storage on demand."
[Sidebar: "Where does disk performance live?"]
4. Storage bandwidth doubles every 36 months
I/O performance is stuck at the same 25% annual growth rate as pre-MR (magnetoresistive head) disks, which compounds to a doubling roughly every three years. And most disks can't reach their rated speeds in sustained throughput tests.
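As a quick check on the heading, here is a minimal sketch that converts a compound annual growth rate into a doubling time. The function name `doubling_time_years` is made up for illustration; the only input taken from the article is the 25% figure.

```python
import math


def doubling_time_years(annual_growth: float) -> float:
    """Years for a quantity to double at a fixed compound annual growth rate."""
    if annual_growth <= 0:
        raise ValueError("annual_growth must be positive")
    return math.log(2) / math.log(1 + annual_growth)


# 25% per year is the growth rate quoted in the article.
years = doubling_time_years(0.25)
print(f"{years:.1f} years, or about {years * 12:.0f} months")
# prints "3.1 years, or about 37 months"
```

That lands close to the 36 months in the heading, so the two figures describe the same trend.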
If we divide the size of an average disk by the theoretical speed of its interface, we come up with the amount of time it would take to read the entire contents of a disk. Let's call this flush time. This metric highlights a serious problem for enterprise storage. Whether you look at interface speeds or actual transfer rates (see "Where does disk performance live?"), disk performance is lagging behind capacity improvements.
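The flush time calculation can be sketched in a few lines. This is only an illustration: the function name is hypothetical, units are assumed to be decimal (1GB = 1,000MB), and the example pairs a 36GB drive with a 200MB/s interface and a 60MB/s sustained rate, figures that appear elsewhere in this article but are combined here purely for demonstration.

```python
def flush_time_seconds(capacity_gb: float, throughput_mb_per_s: float) -> float:
    """Time to read a whole disk once, assuming decimal units (1 GB = 1000 MB)."""
    if throughput_mb_per_s <= 0:
        raise ValueError("throughput must be positive")
    return capacity_gb * 1000 / throughput_mb_per_s


# Same drive, two ways of measuring speed (illustrative pairing of figures).
at_interface_speed = flush_time_seconds(36, 200)  # theoretical interface rate
at_sustained_rate = flush_time_seconds(36, 60)    # sustained transfer rate

print(f"interface speed: {at_interface_speed / 60:.0f} min")  # prints 3 min
print(f"sustained rate:  {at_sustained_rate / 60:.0f} min")   # prints 10 min
```

Whichever speed goes in the denominator, the shape of the problem is the same: capacity grows faster than throughput, so the result keeps climbing.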
Enterprise storage makers attempt to make up for the performance gap in two ways. First, they use RAID to spread data across many spindles, which increases aggregate performance. Second, they refuse to use the latest and largest drives in performance-sensitive applications.
RAID is really just a bandage over the performance gap. Add up the space on all the disks and divide it by their total I/O capability, and the flush time remains the same as that of a single disk. The only way to make a really high-performance system is to use disks that can deliver the goods in terms of performance. Right now, that means using only 15,000 rpm drives of 73GB or smaller.
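The RAID arithmetic is easy to verify. The sketch below assumes ideal striping, where capacity and throughput both scale linearly with the number of disks and nothing else (controller, bus) becomes a bottleneck. The function name is hypothetical, and the 60MB/s per-disk rate is an assumed figure for a 73GB drive, not a measurement.

```python
def array_flush_time_seconds(
    disk_count: int, capacity_gb: float, throughput_mb_per_s: float
) -> float:
    """Flush time of an idealized array of identical disks (1 GB = 1000 MB)."""
    if disk_count < 1 or throughput_mb_per_s <= 0:
        raise ValueError("need at least one disk and a positive throughput")
    total_capacity_mb = disk_count * capacity_gb * 1000
    total_throughput = disk_count * throughput_mb_per_s
    return total_capacity_mb / total_throughput


# Assumed per-disk figures: 73GB capacity, 60MB/s sustained.
for disks in (1, 4, 16):
    minutes = array_flush_time_seconds(disks, 73, 60) / 60
    print(f"{disks:2d} disks: {minutes:.1f} min")
```

The disk count cancels out of the division, so every row reports the same flush time. Adding spindles raises total throughput, but only in step with total capacity.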
5. Enterprise drives are I/O bound, consumer drives aren't
In 1994, enterprise storage used a 20MB/s SCSI bus, while consumer PCs lagged with 16MB/s IDE interfaces. Enterprise drives now use quick 200MB/s (2Gb) Fibre Channel (FC) interfaces, while consumer PCs are just starting to adopt 150MB/s Serial ATA. So, although they use different interfaces, consumer drive interfaces have remained roughly 75% to 80% as fast as those found in the enterprise. Actual throughput speed has followed the same pattern.
When you plot flush time vs. disk size over time, you see the problem for enterprise storage. In order to ensure adequate performance, enterprise storage users haven't adopted larger disks like the consumer market has.
Consumer disks aren't limited by I/O performance. My home PC would take more than two hours to flush the entire contents of its disk. By contrast, a typical 36GB 15,000 rpm FC drive in an enterprise array would be able to sustain 60MB/s or more, for a flush time of just 10 minutes.
Consumers are buying larger drives with ever-poorer performance, and reaping the benefits of a Moore's Law-like exponential increase in capacity. However, the enterprise is much more sensitive to performance and is lagging behind in realizing the drop in per-megabyte disk costs.
What's the long-term implication of this finding?
Enterprise storage prices won't fall as fast as the vast improvements in storage density would indicate, unless performance is sacrificed. Within a year or two, we could even see the recent cost reductions for enterprise storage stop. It will be difficult to make ever-larger drives perform better, even if their spindle speeds are increased. And the small disks used by enterprise storage will become more expensive as their similarity to consumer disks disappears.
But if we leverage the performance of a large number of huge disks across a large number of users, we can have the benefits of density without sacrificing performance. That kind of consolidation is much more than most storage architectures are delivering today. We need a new kind of massively parallel storage system with thousands of storage nodes serving thousands of users. And hope that all of those users don't want to be served at the same time!
Aspects of this new architecture are materializing today. Systems such as 3PAR's InServ and Panasas' ActiveScale spread data across more spindles than most current RAID implementations. More and more systems are starting to leverage large consumer-grade disks and interfaces.
Does the "I" in RAID stand for independent or inexpensive? Current enterprise storage clearly isn't inexpensive. If these new storage architectures take over, it seems that both terms might finally be appropriate.
This was first published in June 2004