MTBF "is basically a useless number" from a user's perspective, says Mike Chenery, vice president of advanced product engineering at Fujitsu Computer Products of America, a disk drive manufacturer. "What they really want to know is 'How many of my drives are going to fail?'"
Part of the problem with MTBF is that not everyone understands how it is calculated. An MTBF of 1 million hours, for example, does not mean that a drive will last 114 years (1 million hours divided by the 8,760 hours in a year), as some users may conclude. Rather, it's a statistical metric derived by testing a large number of drives for a number of days and determining the mean failure rate for the entire group, explains Aloke Guha, CTO at Copan Systems, a startup that makes a disk-based backup system built with low-cost ATA drives.
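To put the MTBF figure in the terms Chenery describes, the short Python sketch below turns an MTBF into an expected number of failures per year across a group of drives. It assumes a constant failure rate (the usual simplification behind an MTBF figure), drives running around the clock, and failed drives being replaced; the function names and the 1,000-drive fleet are illustrative, not taken from any vendor.

```python
import math

HOURS_PER_YEAR = 24 * 365  # 8,760 power-on hours for a drive running 24/7


def annual_failure_probability(mtbf_hours, hours_per_year=HOURS_PER_YEAR):
    """Chance that a single drive fails within one year.

    Assumes a constant failure rate (exponential lifetime model).
    """
    return 1.0 - math.exp(-hours_per_year / mtbf_hours)


def expected_failures(drive_count, mtbf_hours, hours_per_year=HOURS_PER_YEAR):
    """Expected failures per year in a fleet where failed drives are replaced."""
    return drive_count * hours_per_year / mtbf_hours


if __name__ == "__main__":
    mtbf = 1_000_000  # hours, the example figure from the article
    print(f"Per-drive chance of failure in a year: {annual_failure_probability(mtbf):.2%}")
    print(f"Expected failures per year in 1,000 drives: {expected_failures(1000, mtbf):.2f}")
```

Under these assumptions, a 1 million hour MTBF works out to a little under a 1% chance that any one drive fails in a year, or roughly nine failures a year in a population of 1,000 drives.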
The issue is further complicated because some drive vendors "tend to play games with the numbers," says Guha. For example, two drives may both have MTBFs of 500,000 hours, but one may be rated for a 24/7/365 duty cycle and the other for an 8-hour duty cycle. "Read the fine print," he warns.
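The effect of the duty cycle on an otherwise identical MTBF can be sketched in a few lines of Python. This is a rough illustration only: it assumes the 8-hour duty cycle means eight power-on hours per day, that failures accrue only during power-on hours at a constant rate, and that each drive is used exactly at its rated duty cycle. The function names are hypothetical.

```python
def rated_hours_per_year(hours_per_day):
    """Power-on hours per year that the vendor's rating assumes."""
    return hours_per_day * 365


def failures_per_1000_drives(mtbf_hours, hours_per_day):
    """Expected yearly failures per 1,000 drives run at the rated duty cycle.

    Assumes a constant failure rate over power-on hours.
    """
    return 1000 * rated_hours_per_year(hours_per_day) / mtbf_hours


if __name__ == "__main__":
    mtbf = 500_000  # hours, the same headline figure for both drives
    for label, hours_per_day in (("24/7 rating", 24), ("8-hour rating", 8)):
        rate = failures_per_1000_drives(mtbf, hours_per_day)
        print(f"{label}: {rated_hours_per_year(hours_per_day)} hours/year, "
              f"about {rate:.2f} failures per 1,000 drives per year")
```

The 8-hour drive reaches its 500,000-hour figure on a third of the yearly power-on hours. Run it around the clock and it accumulates hours three times as fast as its rating assumes, and nothing in the rating says the same MTBF still holds outside the rated duty cycle.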
Unlike MTBF, another metric called annualized failure rate (AFR) will give you a sense of how often
In this day and age of very large disk drives, it's good to be able to predict your drive failure rates, says Dick Benton, senior consultant with Glasshouse Technologies in Framingham, MA. That's because the larger the disk drives in a RAID set, the longer a rebuild takes, and the longer you are exposed to another drive failure and total data loss.
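Benton's point about rebuild exposure can be illustrated with a rough Python estimate of the chance that a second drive fails while a rebuild is still running. It assumes independent failures at a constant rate derived from the MTBF, which real RAID sets may not obey; the eight-drive set, the 500,000-hour MTBF and the rebuild times are made-up inputs, and the function name is hypothetical.

```python
import math


def second_failure_probability(surviving_drives, rebuild_hours, mtbf_hours):
    """Chance that at least one surviving drive fails during the rebuild window.

    Assumes independent drives with a constant failure rate of 1 / MTBF.
    """
    exposure = surviving_drives * rebuild_hours / mtbf_hours
    return 1.0 - math.exp(-exposure)


if __name__ == "__main__":
    mtbf = 500_000   # hours (illustrative)
    surviving = 7    # an 8-drive RAID set with one drive already failed
    for rebuild_hours in (10, 20, 40):
        p = second_failure_probability(surviving, rebuild_hours, mtbf)
        print(f"{rebuild_hours:>2}-hour rebuild: {p:.4%} chance of a second failure")
```

For rebuild windows that are short compared with the MTBF, the exposure grows almost in direct proportion to the rebuild time, so doubling the time it takes to rebuild a drive roughly doubles the risk.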
The problem with AFR is that vendors don't generally publish it. But Copan's Guha, for one, thinks that "the time may have come for users to demand 'What is the reported AFR?'"
This was first published in October 2004