Tracking down those missing bytes

This article can also be found in the Premium Editorial Download: Storage magazine: Hot storage technology for 2008.

A reader of Storage magazine recently wrote to say he had purchased an Imation Odyssey with 80GB and 160GB cartridges (removable hard disk drives) to back up one of his computers. He was surprised to find several gigabytes of capacity missing. He then spoke with a storage architect who had made a similar discovery on a larger scale: a new enterprise array that was expected to deliver 45TB provided only 41TB of storage space.

It's a common problem and one that prompts the question: Who took a byte, megabyte, gigabyte or terabyte out of my storage, and where did it go? Should I blame the disk vendors, or the server and software vendors?

There are two main reasons for the discrepancy between what's advertised and what you get. One has to do with how bytes are counted and rounded, and the other with how the storage is configured.

Disk drive manufacturers count bytes in base 10 (decimal), while memory chip, server and operating system vendors typically count bytes in base 2 (binary). This can lead to confusion when comparing a disk drive's base 10 gigabyte with a memory chip's base 2 gigabyte: 1,000,000,000 (10^9) bytes vs. 1,073,741,824 (2^30) bytes.
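To see how much of the gap the numbering base alone explains, here is a minimal Python sketch of the arithmetic. The function names are illustrative, not taken from any vendor tool, and the inputs are the capacities mentioned at the start of this article.

```python
DECIMAL_GB = 10**9   # gigabyte as printed on a drive label (base 10)
BINARY_GB = 2**30    # gigabyte as many operating systems count it (base 2, GiB)
DECIMAL_TB = 10**12  # terabyte, base 10
BINARY_TB = 2**40    # terabyte, base 2 (TiB)


def advertised_gb_to_binary(gb):
    """Convert an advertised base 10 GB figure to base 2 gigabytes."""
    return gb * DECIMAL_GB / BINARY_GB


def advertised_tb_to_binary(tb):
    """Convert an advertised base 10 TB figure to base 2 terabytes."""
    return tb * DECIMAL_TB / BINARY_TB


for size in (80, 160):
    print(f"{size}GB advertised = {advertised_gb_to_binary(size):.1f} binary GB")
print(f"45TB advertised = {advertised_tb_to_binary(45):.1f} binary TB")
```

Run as written, the 80GB and 160GB cartridges come out at roughly 74.5 and 149.0 binary gigabytes. The 45TB array comes out at about 40.9 binary terabytes, which is in the neighborhood of the architect's 41TB, although configuration overhead may also have played a part in that case.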

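Moving forward, a newer nomenclature standardized by the International Electrotechnical Commission (IEC) uses MiB, GiB and TiB (mebibyte, gibibyte and tebibyte) to denote 2^20, 2^30 and 2^40 bytes for base 2 numbering. For base 10 numbering, the International System of Units prefixes apply: MB, GB and TB denote a million, a billion and a trillion bytes, respectively.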

Here's a tip: Look at the total number of 512-byte sectors available on the disk drive or storage system as an indicator of the device's actual raw capacity. Some storage systems use 520-byte or 528-byte low-level sectors for data consistency, yet report usable sectors as 512 bytes; thus you may see a lower total raw capacity for the device. Most vendors document the capacity in bytes, sometimes in both base 2 and base 10, as well as the number of 512-byte sectors supported on their storage devices and storage systems (though it might be in the small print).
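The sector tip can be sketched in a few lines of Python. The sector count below is a hypothetical round figure chosen so the arithmetic is easy to follow, not a value from any vendor's spec sheet, and the helper names are made up for this example.

```python
USABLE_SECTOR_BYTES = 512  # sector size reported to the host


def raw_capacity_bytes(sector_count, sector_bytes=USABLE_SECTOR_BYTES):
    """Raw capacity is the number of sectors times the bytes per sector."""
    return sector_count * sector_bytes


def usable_fraction(physical_sector_bytes, usable_sector_bytes=USABLE_SECTOR_BYTES):
    """Share of each low-level sector that is reported as usable data."""
    return usable_sector_bytes / physical_sector_bytes


# Hypothetical device: 156,250,000 sectors of 512 bytes each.
sectors = 156_250_000
raw = raw_capacity_bytes(sectors)
print(f"{raw:,} bytes = {raw / 10**9:.1f}GB (base 10) = {raw / 2**30:.1f}GB (base 2)")

# Systems that format 520-byte or 528-byte sectors expose only 512 of them.
for physical in (520, 528):
    print(f"{physical}-byte sectors: {usable_fraction(physical):.1%} reported as usable")
```

Working from the sector count sidesteps the base 2 vs. base 10 question entirely, because the byte total can then be expressed in whichever unit the other party is using.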

Base 2 and base 10 numbering account for only part of the missing storage capacity compared to what you expect to see. Rounding up or down can mean the difference between a 146GB and 147GB disk drive, for example. A storage system's internal overhead space needs (snapshots, replication disk-based buffers, storage system software, cache memory de-stage or scram space), as well as RAID levels, spare drives, and operating system or file-system formatting will impact your actual storage space.
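As a rough illustration of how these configuration choices stack up, the following Python sketch estimates usable capacity from a handful of inputs. Every parameter here (drive count, spares, RAID group size, parity drives per group, and a single overhead fraction standing in for snapshots, system software and formatting) is an assumption made for the example; real storage systems account for overhead in their own ways, so treat this as a back-of-the-envelope model rather than a sizing tool.

```python
def estimate_usable_tib(drive_count, drive_gb, spares,
                        group_size, parity_per_group, overhead_fraction):
    """Estimate usable capacity in base 2 terabytes (TiB).

    drive_gb is the advertised base 10 capacity of each drive.
    overhead_fraction is an assumed lump sum for system and file-system
    overhead, expressed as a fraction between 0 and 1.
    """
    available = drive_count - spares            # spares hold no user data
    groups = available // group_size            # leftover drives form no group
    data_drives = groups * (group_size - parity_per_group)
    raw_bytes = data_drives * drive_gb * 10**9  # vendor figures are base 10
    usable_bytes = raw_bytes * (1 - overhead_fraction)
    return usable_bytes / 2**40


# Hypothetical shelf: sixteen 146GB drives, one spare, RAID 5 groups of
# five drives (four data plus one parity), and 5% assumed overhead.
advertised_tb = 16 * 146 / 1000
usable = estimate_usable_tib(16, 146, spares=1, group_size=5,
                             parity_per_group=1, overhead_fraction=0.05)
print(f"Advertised raw: {advertised_tb:.2f}TB (base 10)")
print(f"Estimated usable: {usable:.2f}TB (base 2)")
```

Changing one input at a time, such as the spare count or the RAID group size, shows which decision costs the most capacity in a given configuration.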

When adding storage capacity, be clear with vendors so that each party understands what the other is talking about: Is the discussion about raw or usable (formatted, RAID level, file-system and storage system overhead) capacity, and what unit of measure (base 2 or base 10) will be used?

--Greg Schulz

This was first published in December 2007
