Innovation isn't something one usually associates with the storage industry. While there have been significant improvements over the years in read/write head design, methods for detecting signal changes and even techniques for aligning bits on media, the fact is that disk drives are still just spinning rust -- albeit with constantly increasing capacity and steadily decreasing price per gigabyte (GB).
Fact: A 20 terabyte (TB) array costs about $5,200 in piece parts. By the time it reaches the consumer, even with minimal value-added software, you are looking at somewhere in the neighborhood of $30,000 to $60,000. With a lot of "value-added" software and a maintenance agreement, you are confronting a $450,000 price tag.
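To put those numbers side by side, here is a rough back-of-the-envelope sketch in Python. The dollar figures are the ones quoted above; the decimal conversion (1 TB = 1,000 GB) and the function names are assumptions made only for illustration.

```python
# Back-of-the-envelope markup math for the 20 TB array described above.
# Assumption: decimal units, 1 TB = 1,000 GB.
CAPACITY_GB = 20 * 1000
PARTS_COST = 5_200  # approximate cost of piece parts, in dollars


def markup(price, parts_cost=PARTS_COST):
    """Return how many times the parts cost a given price represents."""
    return price / parts_cost


def cost_per_gb(price, capacity_gb=CAPACITY_GB):
    """Return the price per gigabyte of raw capacity."""
    return price / capacity_gb


price_points = {
    "piece parts": 5_200,
    "minimal software (low)": 30_000,
    "minimal software (high)": 60_000,
    "full software + maintenance": 450_000,
}

for label, price in price_points.items():
    print(
        f"{label:<28} ${price:>8,}  "
        f"{markup(price):5.1f}x parts  ${cost_per_gb(price):6.2f}/GB"
    )
```

Under these assumptions, the parts come to about $0.26 per GB, while the fully loaded configuration works out to roughly 87 times the parts cost, or about $22.50 per GB.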
Why? Simple. Value-added software and services are about the only way that storage hardware vendors can hope to continue to make any money from their wares. Former EMC CEO Michael Ruettgers said it in 2001, and it is still true today. With disk becoming a commodity, software and services are the only way to sustain or improve margins. Thus, the 125% increase in the cost of an array year over year is less about storage hardware innovation than it is about slick marketing and technology lock-in strategies.
Some vendors claim that they are simply delivering to market what the consumer wants: a one-stop-shop product with a one-throat-to-choke warranty. It has lately become fashionable to describe products as "solutions" -- the last storage you will ever need to buy. Vendors are fond of describing their wares as "information lifecycle management in a box" or "tiered storage in a box" or "SAN in a box." Buying their "solution" spares the consumer the hassles of integrating multiple hardware and software products from multiple sources.
Are all of the software functions inside these "in-a-box" products "best of breed" or even a proper fit for all the data produced by all applications in your environment? Vendors to whom I put this question are unanimous in their response. The question is irrelevant: Customers don't care about application performance anymore -- or even storage performance for that matter. They just want to be "loved" -- cared for by the vendor the way that mommy cared for them when they were young.
Performance measures are for geeks, one vendor insists. Customers are too stupid to run a protocol analyzer to know what versions of firmware, interconnect software or interfaces are installed. They still don't get the reasons why RAID-5 is ineffectual with large capacity drives (because of long rebuild times in the event of a drive failure, for example). Instead, they hold onto old ideas about not paying for the same disk turf twice.
There are many examples to back up this vendor view. One CIO in Detroit told me this year that he doesn't want to manage technology, just a few vendors. For this reason, he only buys the "Cadillac brands" -- Cisco, EMC, etc. It gives him more time on the golf course as he closes in on retirement.
Look at the capacity allocation efficiency (a measure of how effectively storage is being doled out to applications and end users), and you might see the fellow's point. Buying from a "love-brand" vendor is usually the only way to get significant levels of capacity management from storage. If everything comes from one vendor, then the management utility that ships with the gear is usually (and there are exceptions to this rule) good enough to automate capacity allocation functions. Diversify platforms and the task of capacity management becomes more complicated.
When it comes to capacity utilization efficiency, however, this rule of thumb simply doesn't apply. Utilization efficiency has to do with placing the right data on the right spindle at the right time based on the value of the data itself and the cost of the storage device. We do an abysmally poor job of this, whether we buy homogeneous hardware or not.
Less than 41% of any given disk is probably put to an effective purpose (e.g., reading and writing of useful data), according to longitudinal data collected by Sun Microsystems' (formerly StorageTek's) storage assessment group across thousands of accounts. The balance of the disk is filled with junk or redundant data, or empty space that is reserved but not yet allocated.
If we had the tools (and we don't) to dig deeper into the contents of the 41% of disk that is hosting useful data, we would find that a much smaller percentage of that data is actually positioned on the right spindles based on data value and storage costs: as low as 10% to 17% by some estimates, depending on the operating system controlling the storage.
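Multiplying those two estimates together gives a sense of how little raw capacity ends up both useful and well placed. The sketch below is only illustrative arithmetic: it treats 41% as an upper bound on useful data, applies the 10% to 17% placement range to it, and uses a 20 TB array as a hypothetical example size. The function names are invented for this example.

```python
# Rough estimate of how much raw disk is both useful AND well placed.
# Assumptions: 41% is treated as the useful share (the column says "less
# than 41%", so results are upper bounds); 10%-17% of that useful data is
# on the right spindles.
USEFUL_FRACTION = 0.41
WELL_PLACED_RANGE = (0.10, 0.17)
RAW_TB = 20  # hypothetical array size used for illustration


def well_placed_share(useful, placed):
    """Fraction of raw capacity holding useful, correctly placed data."""
    return useful * placed


def well_placed_tb(raw_tb, useful, placed):
    """Terabytes of raw capacity holding useful, correctly placed data."""
    return raw_tb * well_placed_share(useful, placed)


for placed in WELL_PLACED_RANGE:
    share = well_placed_share(USEFUL_FRACTION, placed)
    tb = well_placed_tb(RAW_TB, USEFUL_FRACTION, placed)
    print(f"placement {placed:.0%}: {share:.1%} of raw capacity, "
          f"about {tb:.2f} TB of a {RAW_TB} TB array")
```

On these figures, only about 4% to 7% of raw capacity, or roughly 0.8 TB to 1.4 TB of a 20 TB array, is holding useful data on the right spindles.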
The latter factor, cost, is hidden in the one-stop-shop products. Vendors like to state that their product is delivering storage at $X per gigabyte (substitute pennies or dollars for X). This may contribute to the "love" they are showing consumers by hiding the effective costs of their product, including the costs of locking out less expensive and more innovative approaches.
Case in point: The CIO for a New England pharmaceutical developer asked us about a product he was preparing to buy to hold clinical trial data for the FDA-mandated five to seven years. He didn't want to spend money on heating and air-conditioning for a bunch of heat-generating spindles, or to pay the electric bill for the additional infrastructure. Moreover, additional personnel to manage the new infrastructure were not in his five-year budget. He feared that the "solution" being offered by his love mark vendor did not meet these criteria.
We suggested an alternative: Bell Micro Hammer Z arrays, running Zetera Corp.'s IP storage technology, and Caringo's CAStor software to index data as it was being written to disk. The former would enable him to capture close to the commodity price of disk and to put the drives to sleep when not in use. The latter would enable him to implement a simple standards-based scheme of content addressing and to put nonrepudiability controls on the stored data itself, regardless of the arrays where he was going to write the data. This integration would be relatively hassle free and completely standards-based -- and would keep his options open for the next 10 years so he could capitalize on "best-of-breed" technology and falling storage costs.
Bottom line -- many users may lack the technical acumen to cobble together complex infrastructure and may find the one-stop-shop appealing. However, simply preferring brands with "love marks" to all alternatives is stupid. You don't need to be proficient in Iometer to make intelligent storage decisions. Sometimes, you just need to consider the relative costs of lock-ins and open platforms to make a smart buy.
About the author: Jon William Toigo is a Managing Partner for Toigo Productions. Jon has over 20 years of experience in IT and storage.