Five rules to help you build a storage strategy

Metcalfe's law explains how the Internet works. But what about storage? Here are five rules to help you build a storage strategy.

This article can also be found in the Premium Editorial Download: Storage magazine: The benefits of storage on demand.

In 1965, Gordon Moore, who later co-founded Intel Corp., observed that the density of transistors on integrated circuits was doubling approximately every year, a pace he revised in 1975 to every two years. In the 1980s, Robert Metcalfe of 3Com Corp. noted that the value of a network grows roughly as the square of the number of people using it.

Let's take a stab at defining a few axioms for storage.

1. Density increases, but unit price stays constant

Prices tend to remain constant from generation to generation, but functionality increases with time. The average cost of capacity drops as density increases. The price of a specific disk drive model or tape cartridge starts high, and then eventually drops. But the next generation comes in at the same high initial unit cost.

Enterprise disk drives usually start at about $1,300, and then drop below $400 as they are phased out. Tape media starts at $200 and falls to $50. Consumer disk drives usually start near $500 and fall off the shelves at $80 (before rebates).
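To make the first rule concrete, here is a minimal sketch of how the launch cost per gigabyte falls when the unit price stays flat. The $1,300 launch price comes from the figures above; the three capacities are hypothetical, chosen only to show a capacity doubling per generation, and the function name is illustrative.

```python
def launch_cost_per_gb(unit_price: float, capacities_gb: list[float]) -> list[float]:
    """Cost per GB at launch for each generation, assuming a constant unit price."""
    return [unit_price / capacity for capacity in capacities_gb]


LAUNCH_PRICE = 1300.0          # enterprise drive launch price quoted in the article
CAPACITIES_GB = [36, 72, 144]  # hypothetical generations, each double the last

for capacity, cost in zip(CAPACITIES_GB, launch_cost_per_gb(LAUNCH_PRICE, CAPACITIES_GB)):
    print(f"{capacity:>4}GB at ${LAUNCH_PRICE:,.0f}: ${cost:.2f}/GB")
```

Under these assumptions the buyer pays the same amount for each new generation, but every doubling of capacity halves the cost per gigabyte.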

2. Backup tapes are three times larger than disks

Enterprise tape and disk units maintain a capacity ratio of roughly three or four to one. In 1994, the DLT2000 had 15GB (native) capacity, and most enterprise arrays were just adopting 4GB disks. Today's 160GB to 300GB tapes map to 36GB to 140GB enterprise disks.

Backup tapes have to be somewhat larger than disks to stay in the storage game. If they fell behind, disk-based backup would take over.

3. Storage density doubles every 18 months

This axiom sounds like Moore's Law, but it isn't really the same thing. Most computer components benefit from Moore's exponential growth in transistor counts, but disk doesn't rely on transistor counts for increased density. Different science and engineering allow us to put more and more data on a small magnetic platter.

Disk capacity had been growing at a steady 25% annually. But the introduction of the magnetoresistive (MR) disk head in 1993 changed everything. Suddenly, magnetic disks were matching, and sometimes exceeding, transistor density growth. Since then, storage density has grown 60% annually, giving us the same 18-month doubling cycle as the semiconductor industry.
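The link between an annual growth rate and a doubling period is simple compound-growth arithmetic. The sketch below (the function name is illustrative, and smooth compound growth is an assumption) converts the two rates quoted above into doubling times.

```python
import math


def doubling_time_months(annual_growth: float) -> float:
    """Months needed to double at a compound annual growth rate (0.60 means 60%)."""
    if annual_growth <= 0:
        raise ValueError("annual_growth must be positive")
    return 12 * math.log(2) / math.log(1 + annual_growth)


for label, rate in [("pre-MR disk, 25% a year", 0.25), ("post-MR disk, 60% a year", 0.60)]:
    print(f"{label}: doubles every {doubling_time_months(rate):.1f} months")
```

A 60% annual rate doubles in just under 18 months, while 25% takes a little over 37 months, which is close to the 36-month figure used for bandwidth in the next rule.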

Figure: Enterprise and consumer flush time

Where does disk performance live?
There are quite a few metrics for disk drive performance, and just as many potential bottlenecks. Remember, though, no single metric tells the whole story, and many quoted metrics can be downright deceptive.

Spindle speed: The most often mentioned measure of performance among storage architects is spindle speed. Most folks realize that 15,000 rpm is going to be better than 10,000, and this is the single greatest factor in drive throughput. The downside is that it's hard to pack the same high storage density on a drive platter spinning at a high speed.
Seek time: The time a drive takes to locate a single piece of information is its seek time. Seek time is related to spindle speed, head actuator speed and the performance of the controller on the drive. A drive with low seek time will generally have lower latency between receiving and responding to a request, but the metrics can be misleading.
Transfer rate: The sustained rate at which data can be read from, or written to, a disk is critical for some applications. This is a measured metric rather than a theoretical one, but it depends on many factors, from spindle speed to the performance of the connected computer.
Interface speed: Most enterprise drives are equipped with Fibre Channel connectors capable of delivering 200MB/s, but none can sustain this level of performance. Most drives are equipped with an interface two or three times faster than their native transfer rate.
Cache size: A large on-disk cache of RAM can greatly improve performance for some workloads. Most drives today have 8MB of cache, which seemed like a lot on a 36GB drive, but is dwarfed by a 300GB unit!

4. Storage bandwidth doubles every 36 months

I/O bandwidth is stuck at the same 25% annual growth that disk capacity managed before MR heads, which works out to a doubling roughly every 36 months. And most disks can't reach their rated interface speeds in sustained throughput tests.

If we divide the size of an average disk by the theoretical speed of its interface, we come up with the amount of time it would take to read the entire contents of a disk. Let's call this flush time. This metric highlights a serious problem for enterprise storage. Whether you look at interface speeds or actual transfer rates (see "Where does disk performance live?"), disk performance is lagging behind capacity improvements.
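Flush time as defined above is a single division. The sketch below assumes decimal units (1GB = 1,000MB) and uses an illustrative function name. It applies the 200MB/s Fibre Channel interface speed to the 36GB and 300GB drive sizes mentioned in the sidebar, then repeats the 36GB case at a sustained 60MB/s.

```python
def flush_time_seconds(capacity_gb: float, throughput_mb_s: float) -> float:
    """Time to read an entire disk once, assuming 1GB = 1,000MB."""
    if throughput_mb_s <= 0:
        raise ValueError("throughput_mb_s must be positive")
    return capacity_gb * 1000 / throughput_mb_s


# Theoretical flush time at the 200MB/s interface speed.
for capacity_gb in (36, 300):
    minutes = flush_time_seconds(capacity_gb, 200) / 60
    print(f"{capacity_gb}GB at 200MB/s: {minutes:.1f} minutes")

# The same 36GB drive at a sustained 60MB/s rather than the interface speed.
print(f"36GB at 60MB/s: {flush_time_seconds(36, 60) / 60:.1f} minutes")
```

At the same interface speed, the 300GB drive takes more than eight times as long to flush as the 36GB drive, which is the capacity-versus-performance gap this rule describes.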

Enterprise storage makers attempt to make up for the performance gap in two ways. First, they use RAID to spread data across many spindles, increasing the performance. Second, they refuse to use the latest and largest drives in performance-sensitive applications.

RAID is really just a bandage over the performance gap. Add up the space on all the disks and divide it by their total I/O capability, and the flush time remains the same as that of a single disk. The only way to make a really high-performance system is to use disks that can deliver the goods in terms of performance. Right now, that means using only 15,000 rpm drives of 73GB or smaller.
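The RAID argument can be checked with the same arithmetic. This sketch assumes an idealized array of identical disks whose throughput scales linearly with the spindle count and ignores parity and controller overhead. The 73GB and 60MB/s figures and the disk counts are illustrative inputs, not measurements.

```python
def array_flush_time_seconds(n_disks: int, capacity_gb: float, throughput_mb_s: float) -> float:
    """Flush time of an idealized array: total capacity over total throughput."""
    if n_disks < 1:
        raise ValueError("n_disks must be at least 1")
    total_capacity_mb = n_disks * capacity_gb * 1000  # assumes 1GB = 1,000MB
    total_throughput_mb_s = n_disks * throughput_mb_s  # assumes perfect linear scaling
    return total_capacity_mb / total_throughput_mb_s


for n_disks in (1, 8, 64):
    minutes = array_flush_time_seconds(n_disks, 73, 60) / 60
    print(f"{n_disks:>2} disks: {minutes:.1f} minutes to flush")
```

The spindle count cancels out of the ratio, so adding disks raises total throughput without improving the time needed to read everything the array holds.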

5. Enterprise drives are I/O bound, consumer drives aren't

In 1994, enterprise storage used a 20MB/s SCSI bus, while consumer PCs lagged with 16MB/s IDE interfaces. Enterprise drives now use quick 200MB/s (2Gb) Fibre Channel (FC) interfaces, while consumer PCs are just starting to adopt 150MB/s Serial ATA. So, although they use different interfaces, consumer drive interfaces have remained about 80% as fast as those found in the enterprise. Actual throughput speed has followed the same pattern.
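As a quick check on the "about 80%" figure, this sketch divides the consumer interface speed by the enterprise one for both periods, using only the numbers quoted above. The dictionary layout and function name are illustrative.

```python
# Interface speeds in MB/s, as quoted in the article.
INTERFACES_MB_S = {
    "1994": {"enterprise": 20, "consumer": 16},    # SCSI vs. IDE
    "2004": {"enterprise": 200, "consumer": 150},  # 2Gb Fibre Channel vs. Serial ATA
}


def consumer_share(enterprise_mb_s: float, consumer_mb_s: float) -> float:
    """Consumer interface speed as a fraction of the enterprise interface speed."""
    return consumer_mb_s / enterprise_mb_s


for period, speeds in INTERFACES_MB_S.items():
    share = consumer_share(speeds["enterprise"], speeds["consumer"])
    print(f"{period}: consumer interface is {share:.0%} as fast as enterprise")
```

The ratio comes out at 80% for 1994 and 75% for the current generation, so the consumer interface has stayed within a few points of the same fraction across a tenfold rise in speed.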

When you plot flush time vs. disk size over time, you see the problem for enterprise storage. In order to ensure adequate performance, enterprise storage users haven't adopted larger disks like the consumer market has.

Consumer disks aren't limited by I/O performance. My home PC takes more than two hours to flush the entire contents of its disk. By contrast, a typical 36GB 15,000 rpm FC drive in an enterprise array would be able to sustain 60MB/s or more, for a flush time of just 10 minutes.

Ominous implications
Consumers are buying larger drives with ever-poorer performance, and reaping the benefits of a Moore's Law-like exponential increase in capacity. However, the enterprise is much more sensitive to performance and is lagging behind in realizing the drop in per-megabyte disk costs.

What's the long-term implication of this finding?

Enterprise storage prices won't fall as fast as the vast improvements in storage density would indicate, unless performance is sacrificed. Within a year or two, we could even see the recent cost reductions for enterprise storage stop. It will be difficult to make ever-larger drives perform better, even if their spindle speeds are increased. And the small disks used by enterprise storage will become more expensive as their similarity to consumer disks disappears.

But if we leverage the performance of a large number of huge disks across a large number of users, we can have the benefits of density without sacrificing performance. That kind of consolidation is much more than most storage architectures are delivering today. We need a new kind of massively parallel storage system with thousands of storage nodes serving thousands of users. And hope that all of those users don't want to be served at the same time!

Aspects of this new architecture are materializing today. Systems such as 3PAR's InServ and Panasas' ActiveScale spread data across more spindles than most current RAID implementations. More and more systems are starting to leverage large consumer-grade disks and interfaces.

Does the "I" in RAID stand for independent or inexpensive? Current enterprise storage clearly isn't inexpensive. If these new storage architectures take over, it seems that both terms might finally be appropriate.

This was first published in June 2004