Spend less on storage

Serial ATA disks can save you big bucks, but there's a bigger story here. By using RAID and a SAN, you can overcome many of serial ATA's inherent reliability and performance limitations. It's time to rethink many of your assumptions about storage costs.

The advancements in disk drive technology have been as influential to the progress of business computing as microprocessors. Without the capacity and performance advances we've come to take for granted, it would be impossible to run the majority of today's applications and operating systems.

Enter SAS
There's another new serial disk drive interface being developed called serial-attached SCSI (SAS). The original idea behind SAS was to create another small I/O network that supported SCSI commands rather than ATA commands. The initial problem with SAS was that Fibre Channel already provided SCSI compatibility for both external and internal interfaces, and there was really no need for another disk drive interface.

However, the SAS backers saw the advantages of piggybacking on the work already done on serial ATA (SATA) and worked with the SATA organization to create a specification that incorporates SATA connectivity into SAS controllers. The idea is that a single controller could connect to both SATA and SAS disk drives. From my perspective, I'm not sure how commonly this will be done. SATA drives will provide most of the performance and reliability needed, and applications that justify high-performance storage can probably also justify dedicated storage subsystems tuned to the application.

At this point, it's hard to say if SAS and SATA controllers will coexist in the market, or if one will eat the other's opportunity. The primary advantage of SAS will be the ability to connect low-cost SATA disk drives. If SAS has problems with SATA connectivity, it probably won't survive.

It's almost ridiculous to propose that large-scale storage could be accomplished without disk drives providing the fundamental storage function. There are other potential technologies that could be used in place of disk drives, such as nonvolatile memory and optical recording devices, but disk drives offer the best combination of reliability, performance, cost and operating requirements.

The last company that attempted to create a new storage technology was TeraStor, based in Milpitas, CA. It failed to deliver competitive technology after burning through $80 million of venture funding several years ago.

There's nearly universal interoperability among disk drives, which means customers can choose products largely based on price. The competitive pressures and the imperative to develop new, inexpensive drive technologies were chronicled in Clayton Christensen's groundbreaking book, The Innovator's Dilemma, which theorizes how new, disruptive technologies create lower-cost price trajectories for replacement products. In his book, Christensen uses the disk drive industry to show how the rapid churn rate of disruptive technologies put the disk drive industry in the low-margin dungeon where it is today.

Continued improvements in existing disk drive technologies that boost performance and reliability are permitting organizations to start migrating their storage plants from expensive server-class drives to less expensive desktop drives without much risk. Along with the changes in drive technologies, there are two key enterprise storage technologies facilitating this migration: RAID and storage area networks (SANs). A third technology, serial ATA (SATA), is a new disk drive interface that, together with RAID and SANs, will allow IT organizations to achieve more of their strategic storage goals at far less cost.

Drive classes
Historically, disk drives gravitated into three distinct product classes: desktop, server and laptop. Desktop drives were designed for mass-production with slower performance and moderate duty usage. Server drives were built for the rigors of transaction processing and round-the-clock duty cycles. Finally, laptop drives were made to fit the physical demands of portable computers. From a business perspective, desktop drives have the highest sales volumes and lowest margins, while server drives represent the lowest sales volumes and the highest margins. Server drives typically sell for three to four times as much as similar capacity desktop drives.

While the three product classes were distinct, technologies invented for one class of drive are often integrated into the other classes. For instance, a volume manufacturing process developed for desktop drives can be applied to the production processes of server drives to reduce costs. Conversely, a performance technology developed for server drives can be adopted by desktop product families. Over time, the lower-performing desktop drives inherit the high-performance capabilities of server drives, as the server drives rise to new performance levels.

Competitive pressures between drive manufacturers for the high-volume desktop market nearly guarantee that new performance technologies developed for server drives make their way into desktop drives within two to three years. This year, we're seeing desktop drives reach the 10,000 rpm plateau--a performance specification that had previously only been available for server-class drives.

Generally, desktop drives sold today compare favorably with the performance of server drives sold three years ago. Meanwhile, the performance demands of many applications haven't increased nearly as much. For example, the I/O performance requirements for Microsoft Exchange servers haven't increased much, if at all, in the same time frame. To use Clayton Christensen's terminology, server-class drives overshoot the requirements for most applications. The performance of today's desktop drives is more than adequate for most server applications.

ATA vs. SCSI disk performance

Another way to view reliability
There's a certain healthy paranoia regarding disk drive reliability. After all, most IT professionals have, at some point, had serious problems with the loss of a disk drive. But the real requirement is for data availability--not necessarily drive reliability. If other technologies and techniques can be used to achieve data availability, then the reliability of disk drives isn't quite as critical.

One of the fundamental technologies increasing data availability is RAID. Since most businesses can't afford interruptions because of an individual disk drive failure, IT professionals have widely adopted RAID with disk subsystems that support hot-swappable disk drives and hot sparing. The important point is that data availability is provided at a higher level than the disk drive by products such as volume management software and RAID controllers. Just as space allocation is provided as a high-level function by file systems, access to data is provided at a higher level by software and/or controller technology.
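To see why availability can live above the drive, consider a minimal Python sketch of the parity idea behind RAID: if one drive in a stripe fails, its contents can be recomputed from the surviving drives plus the parity block. This is a toy illustration of the XOR arithmetic only, not any vendor's RAID implementation, and the data values are made up.

```python
from functools import reduce

def parity(blocks):
    """XOR equal-length data blocks together into a single parity block."""
    return bytes(reduce(lambda a, b: a ^ b, column) for column in zip(*blocks))

def reconstruct(surviving_blocks, parity_block):
    """Rebuild the block from a failed drive using the survivors plus parity."""
    return parity(surviving_blocks + [parity_block])

# Three data drives plus one parity drive (toy 4-byte blocks).
d0, d1, d2 = b"\x10\x20\x30\x40", b"\xaa\xbb\xcc\xdd", b"\x01\x02\x03\x04"
p = parity([d0, d1, d2])

# Simulate losing drive 1: its data comes back from d0, d2 and the parity block.
assert reconstruct([d0, d2], p) == d1
```

The arithmetic happens in the RAID controller or volume manager, which is why the redundancy wrapped around a drive matters more than the individual drive's reliability.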

Even so, reliability is a primary requirement for disk drives and disk drive vendors are accustomed to competing on reliability metrics. The market punishes vendors who sell drives of substandard quality by avoiding their products. While it's true that server drives are designed for more demanding usage than desktop and laptop drives, it's wrong to say that desktop drives are manufactured to poor quality standards. Most of the components inside disk drives are identical across server and desktop product families today. There's no reason to believe that desktop drives can't be used in enterprise-quality disk subsystems with higher-level data availability features.

The interfaces used for server and desktop drives have been one of the ways disk drive manufacturers have distinguished the two classes of drives. Server drives are made with SCSI or Fibre Channel (FC) interfaces, whereas desktop drives today use the ATA interface. While there are some real cost differences between the interfaces, they are relatively small, and not nearly as great as the cost difference between two similar disk drives would indicate.

How SATA is being implemented
Serial ATA (SATA) is being implemented in three phases:

SATA 1.0: Complete, with products shipping today.

SATA II Phase 1 (also called current extensions): Complete, with products starting to ship in late 2003 or early 2004.

SATA II Phase 2 (also called future extensions): Specification expected to be completed in 2004, with products shipping the following year.

SATA 1.0 was intended primarily as a single-system implementation of the technology, a replacement for legacy ATA products. Its transfer rate is 150MB/sec.

SATA II Phase 1 addressed server requirements by expanding the connectivity to support backplane connections for disk subsystems, as well as supporting enclosure services for monitoring and managing the subsystem environment. In addition, SATA II added support for command queuing--a technique that enables more efficient disk operations in multitasking environments. Most of the points made in this article about SATA assume SATA II Phase 1.
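To illustrate what command queuing buys you, here's a hypothetical Python sketch of the scheduling idea: with several requests outstanding, the drive can service whichever one is closest to the head instead of taking them strictly in arrival order. Real command queuing is implemented in drive firmware; this greedy nearest-first ordering is only a stand-in for the concept.

```python
def reorder_queue(queued_lbas, head_position):
    """Greedy nearest-first ordering of queued logical block addresses (LBAs).

    A toy stand-in for what a command-queuing drive does internally:
    service whichever outstanding request is closest to the head,
    rather than strictly first-come, first-served.
    """
    pending = list(queued_lbas)
    ordered = []
    position = head_position
    while pending:
        nearest = min(pending, key=lambda lba: abs(lba - position))
        pending.remove(nearest)
        ordered.append(nearest)
        position = nearest
    return ordered

# FIFO order would seek back and forth; nearest-first groups nearby requests.
print(reorder_queue([5000, 120, 4980, 150, 9000], head_position=100))
# -> [120, 150, 4980, 5000, 9000]
```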

SATA II Phase 2 introduces dual porting and support for redundant controllers for higher availability. In addition, it increases the scalability of attached devices, as well as doubling the transfer rate to 300MB/sec.

One of SATA's advantages for enterprise storage is its planned support for dual porting in SATA II Phase 2. Dual porting is the foundation of multipathing at the disk drive level, which provides the mechanism for a subsystem to failover to a secondary connection with a disk drive, should the primary connection fail.
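The value of dual porting is easiest to see as a simple failover loop: try the primary path, and if it's down, retry on the secondary. The Python sketch below is purely illustrative--the port callables and the PathError exception are assumptions for the example, not a real driver or subsystem API.

```python
class PathError(Exception):
    """Raised when an I/O path to the drive is unavailable."""

def read_block(drive_ports, lba):
    """Try each port (path) to a dual-ported drive in order, failing over as needed.

    drive_ports is an ordered list of callables, each representing one
    physical connection to the same drive.
    """
    last_error = None
    for port in drive_ports:
        try:
            return port(lba)
        except PathError as err:
            last_error = err          # remember the failure, try the next path
    raise PathError(f"all paths to drive failed: {last_error}")

# Illustrative ports: the primary is down, the secondary answers.
def primary(lba):
    raise PathError("primary controller offline")

def secondary(lba):
    return b"\x00" * 512              # stands in for the sector's contents

data = read_block([primary, secondary], lba=42)
assert len(data) == 512
```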

There's no way the cost difference between similarly featured server and desktop drives can be justified based on the interface alone, although that may be the most distinguishing feature between two drives. Server drives are sold at much higher margins as part of large-capacity server systems or subsystems. The extra money paid for server SCSI drives isn't fully realized by the disk drive vendors, but is shared by the system and subsystem companies.

Network-attached storage (NAS) companies such as Network Appliance Inc. (NetApp), Sunnyvale, CA, have been selling NAS systems with desktop-class ATA drives inside. NetApp's NearStore is based almost completely on the aggressive value proposition of ATA drives to achieve a high capacity at a lower cost per gigabyte.

Although there aren't many external ATA-based SAN storage subsystems on the market, that situation is going to change in the next couple years with the introduction of SATA in enterprise-class storage subsystems.

The idea behind enterprise SATA-based subsystems is to use SCSI or FC external interfaces to connect to host systems and to use SATA drives and interfaces inside the subsystem. This is similar to existing SAN storage subsystems, where the external FC or iSCSI interface is independent of the drive technology used inside.
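The decoupling of the external interface from the internal drives can be sketched in a few lines of Python. This is a hypothetical model--the Drive and Subsystem classes and the trivial striping are illustrative only--but it shows why a host connecting over FC or iSCSI never needs to know what kind of drives sit behind the controller.

```python
class Drive:
    """Back-end drive: the subsystem doesn't care whether this is SATA, SCSI or FC."""
    def __init__(self, interface):
        self.interface = interface
        self.blocks = {}              # toy block store

    def read(self, lba):
        return self.blocks.get(lba, b"\x00" * 512)

class Subsystem:
    """Front end: hosts see a single block target (e.g., over FC or iSCSI);
    the drive technology behind it is an internal implementation detail."""
    def __init__(self, drives):
        self.drives = drives

    def read(self, lun, lba):
        # Trivial striping: spread logical blocks across the back-end drives.
        drive = self.drives[lba % len(self.drives)]
        return drive.read(lba)

array = Subsystem([Drive("SATA") for _ in range(8)])
sector = array.read(lun=0, lba=12345)   # the host-side view is just a block address
assert len(sector) == 512
```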

As it turns out, SANs are also contributing to the acceptability of ATA desktop drives for enterprise applications. For starters, SANs make it easy to set up disk mirroring between two subsystems, each with its own redundancy protection. This technique is commonly used to build higher availability I/O networks.

In previous articles for Storage, I explained how open-systems store-and-forward software would reduce the cost of remote business continuity implementations. By allowing storage subsystems to be located greater distances from each other, SANs increase data availability in response to a local disaster without any dependence on the disk drive interfaces. Data redundancy through remote mirroring or store-and-forward technologies can take advantage of any type of disk drive interface by using subsystems with multiple controllers. The concept of using cheaper desktop drives for remote or second-tier copies of data has broad appeal for many IT organizations that need low-cost disk subsystems to achieve their business-continuity goals.
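As a rough illustration of the store-and-forward idea, the Python sketch below acknowledges writes locally and queues them for a remote copy in the background. It's a toy model under obvious simplifications (an in-memory queue stands in for the WAN link and the remote subsystem), not a description of any shipping product.

```python
import queue
import threading

class MirroredVolume:
    """Toy model of remote mirroring: writes land on the local subsystem right
    away and are queued (store-and-forward) for a remote, possibly cheaper, copy."""

    def __init__(self):
        self.local = {}
        self.remote = {}
        self._pending = queue.Queue()
        threading.Thread(target=self._drain, daemon=True).start()

    def write(self, lba, data):
        self.local[lba] = data          # primary copy, acknowledged immediately
        self._pending.put((lba, data))  # forwarded to the remote copy later

    def _drain(self):
        while True:
            lba, data = self._pending.get()
            self.remote[lba] = data     # stands in for the transfer over the WAN
            self._pending.task_done()

vol = MirroredVolume()
vol.write(7, b"payroll")
vol._pending.join()                     # wait for the forwarder to catch up
assert vol.remote[7] == b"payroll"
```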

Another application targeted for SATA drives is disk-based backup and recovery. In general, user requirements for backup have little to do with the disk drive technology and more to do with the backup software and metadata systems responsible for restoring individual files and versions of files. Most proponents of disk-based backup overlook the fact that file systems on disk drives don't lend themselves to storing historical versions or snapshots of files--with the possible exception of the write anywhere file layout (WAFL) file system from NetApp. A backup system that can't store and restore multiple versions of data objects is uninteresting as a replacement for existing backup. The cost advantages of SATA drives aren't enough to outweigh the fundamental requirement for restoring data.
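The versioning requirement is really a metadata problem, as the toy Python catalog below suggests: every backup of a file is recorded as a separate version so earlier points in time can be restored. The class and its methods are made up for illustration; real backup software adds cataloging, retention and media management on top of this idea.

```python
import hashlib
import time

class BackupCatalog:
    """Toy metadata catalog: each backup of a file is kept as a distinct version,
    so earlier points in time can be restored--the requirement that matters more
    than the underlying disk technology."""

    def __init__(self):
        self._versions = {}   # path -> list of (timestamp, checksum, data)

    def back_up(self, path, data):
        entry = (time.time(), hashlib.sha256(data).hexdigest(), data)
        self._versions.setdefault(path, []).append(entry)

    def restore(self, path, version=-1):
        """Return the requested version's data (default: the most recent)."""
        return self._versions[path][version][2]

catalog = BackupCatalog()
catalog.back_up("/etc/fstab", b"old contents")
catalog.back_up("/etc/fstab", b"new contents")
assert catalog.restore("/etc/fstab", version=0) == b"old contents"
assert catalog.restore("/etc/fstab") == b"new contents"
```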

Disk drive price comparisons

Source: Disk drive prices collected from StreetPrices.com and Pricegrabber.com.

However, once the software issues have been dealt with, cheaper disk drives will dominate in disk-based backup.

It's not accurate to say disk drive interfaces don't make a difference, because they obviously do. The data availability and scalability features of a disk subsystem depend on the characteristics of the disk interface.

SATA is influencing the evolution of storage subsystems. SATA transforms the limited legacy direct-attached storage (DAS) ATA bus into a miniature, serial switched fabric. Because hot-swappable disk drives are a fundamental requirement for greater data availability, SATA is the latest low-cost alternative to enable hot swapping. Unlike SATA, the parallel ATA bus was never designed for hot swapping. In fact, it can be argued that SATA is a better interface than FC for hot swapping, given the cost of FC and the problems with inserting and removing devices on an FC loop.

Low-cost strategy
To start developing a strategy around lower-cost SATA drives, you'll need to become familiar with SATA-based subsystems. Look for enterprise storage features such as multiple SAN ports, multiple RAID levels and hot swapping. Even though the prices may seem ridiculously low, you should still try to get the best deal possible. After all, it wouldn't be a storage purchase without haggling over price. Finally, plan to use your SATA-based subsystems initially for applications with moderate requirements such as Microsoft Exchange. After you get some installations under your belt, you will probably feel like saving yourself big bucks every time you need a few more terabytes.

This was first published in October 2003
