Published: 11 Aug 2004
As a general rule, Serial ATA (SATA) disk drives don't have the same reliability ratings as SCSI or Fibre Channel drives. Whereas an enterprise-class SCSI drive has a mean time between failures (MTBF) of 1,200,000 hours, many SATA drives are rated at only 600,000 hours.
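To put those MTBF ratings in more familiar terms, they can be converted to an approximate annual failure rate (AFR). The sketch below is illustrative only: it assumes a constant failure rate (an exponential model) and drives that run around the clock, neither of which comes from the vendors' data sheets, and the function name is hypothetical.

```python
import math

HOURS_PER_YEAR = 8760  # 24 hours * 365 days; assumes 24x7 operation


def annual_failure_rate(mtbf_hours: float) -> float:
    """Approximate AFR for a given MTBF, assuming a constant failure rate."""
    return 1.0 - math.exp(-HOURS_PER_YEAR / mtbf_hours)


# MTBF figures quoted above
for label, mtbf in [("SCSI", 1_200_000), ("SATA", 600_000)]:
    print(f"{label}: {annual_failure_rate(mtbf):.2%}")
```

Under these assumptions the 1,200,000-hour rating works out to an AFR of roughly 0.73%, and the 600,000-hour rating to roughly 1.45%, so halving the MTBF approximately doubles the expected yearly failures.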
Some vendors, wary of selling unreliable hardware to customers who expect otherwise, have taken steps to eke better reliability out of low-cost drives. For example, earlier this summer Hitachi Data Systems Inc. (HDS) introduced a SATA Intermix option for its Thunder 9500 V family that allows customers to complement an existing Fibre Channel infrastructure with shelves of SATA drives. Here is a partial list of the features HDS has implemented to help ward off problems:
- Additional error correction and redundancy checks, in the form of an additional read-after-write mechanism, and a second eight-byte longitudinal redundancy check.
- "Idle sweeps" every 10 minutes, where the system checks the drive for any wear patterns and cleans off any debris. If the drive is idle for more than two hours, the head is unloaded.
- Use of the drive's Self-Monitoring, Analysis and Reporting Technology (SMART) feature to continuously monitor the drive for out-of-spec conditions. Errors are relayed "home" through HDS' Hi-Track phone-home capabilities. Furthermore, if a drive logs 15 or more soft errors in a span of eight hours, the array proactively spares the drive.
- Plenty of spares: the Thunder 9500 V has 15 per cabinet.
- Prioritized rebuilds. If a drive experiences a hard error, i.e., failure, Thunder can invoke its Fast Rich Copy feature, which increases read speed from eight bytes to 64 bytes.
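The proactive-sparing rule in the list above (15 or more soft errors within eight hours) amounts to a sliding-window threshold check. The following is a minimal sketch of that idea, not HDS's actual logic: the class and method names are hypothetical, and it assumes timestamps are given in seconds and arrive in non-decreasing order.

```python
from collections import deque


class SoftErrorMonitor:
    """Hypothetical sliding-window counter for one drive's soft errors."""

    def __init__(self, threshold: int = 15, window_seconds: int = 8 * 3600):
        self.threshold = threshold            # errors needed to trigger sparing
        self.window_seconds = window_seconds  # eight hours, per the article
        self._timestamps = deque()            # times of errors still in the window

    def record_error(self, now: float) -> bool:
        """Record a soft error at time `now`; return True if the drive should be spared."""
        self._timestamps.append(now)
        # Discard errors that are eight hours old or older.
        while self._timestamps and now - self._timestamps[0] >= self.window_seconds:
            self._timestamps.popleft()
        return len(self._timestamps) >= self.threshold


# Usage: one soft error every 10 minutes trips the threshold on the 15th error.
monitor = SoftErrorMonitor()
results = [monitor.record_error(i * 600) for i in range(15)]
print(results[-1])
```

A deque keeps the check cheap: each error is appended once and removed once, so the monitor never has to rescan the drive's full error history.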
Because of SATA's lesser reliability, vendors also claim to be taking extra time to test the drives. John Joseph, vice president of marketing at iSCSI array vendor EqualLogic, says that his firm does a four-to-five-week "burn in," during which time drives are tested to "exercise the mechanics and weed out the infant mortality." This has brought the annual failure rate (AFR) of EqualLogic's drives down to SCSI levels: under 1%. The flip side of extensive testing, Joseph says, is that "it does impact the cost structure."
Can failsafes and testing ever be enough for demanding enterprise customers? According to Gartner Inc., approximately 30% of enterprise applications can be adequately serviced by SATA drives. "The question is," says Engenio's Gardner, "which 30%?" Environments that value streaming performance rather than IOPS--for example, media and entertainment--may be pleased with SATA systems for primary storage. "But users running heavy IOPS are going to be unhappy and have higher failure rates," he says.
Indeed, recent announcements suggest that you can only take desktop-class SATA drives so far. EqualLogic, whose PeerStorage arrays came out of the chute with SATA, has announced that it will begin shipping arrays with Western Digital Raptor drives, 10,000 rpm SATA drives with reliability specs that compare favorably with those of SCSI drives. Then, next year, EqualLogic will also offer Serial-Attached SCSI (SAS) drives. Some vendors are also considering Seagate's desktop-class Fibre Channel drive.
Where does that leave you? Very confused, laments Engenio's Gardner. "You used to be able to say everything about a drive by referring to its interface." Fibre Channel and SCSI interfaces implied high reliability, while IDE/ATA implied the desktop. Now, "we're in a situation where instead of two tiers of storage, we may get a whole bunch of tiers, and that can get very confusing to customers."