I was talking with a friend the other day about the prospect of multi-terabyte hard drives and how painful it would be to lose that much data. My friend — being my friend of course — countered that it’s not the amount of data, but where it resides and what the data is that’s important.
For instance, he went on, the EEPROM on your desktop motherboard holds no more than 2MB of data. Yet without it, the bazillion hours of work stored on your desktop hard drive, while safe and sound, are useless to you: your computer won’t boot, so you can’t get to any of it.
After conceding the point, I rephrased the statement to emphasize the loss of multiple terabytes of data residing on a platter-based spinning medium, located in a computer or computer-like device providing data storage services to said computer, group of computers, or computer-like devices (whew!).
Without blinking an eye, he said he’d started a hard drive data recovery company. He built a clean room and had been perfecting his recovery skills on hard drives purchased on, of all places, eBay. As an aside, use a hammer and nail, or a Sawzall, to properly destroy all data on unwanted hard drives before you dispose of them.
A while back, I got a frantic call from a family member whose laptop hard drive had crashed. She was beside herself because on her hard drive were all the digital photos she’d ever taken. . .ALL of them. She’d meant to back up her stuff to a disk but never got around to it. She wanted to know whether there was anything I could do to help her.
That is when it hit me full force: I have brilliant and baleful friends.
My friend recovered almost all the data from her hard drive for me (at a very reasonable price), and now she has the first pictures of her child, some of her wedding photos and other very important moments in her life back, and on DVD this time. The whole saga got me thinking: Am I really protected from a hard drive crash? How about the executives I support? What would I do if the array at home, where I keep all of my photos, failed?
Seeing the look on my relative’s face when I presented her with all of her photos was priceless. But it got me thinking about all the other people out there in the SMB world with the 0.5-person IT shop who don’t even know these services exist, much less can afford the super-high cost of traditional data recovery. I don’t think today’s data protection schemes are going to be able to handle these super-sized drives when they inevitably make their way to the same SMB shops.
Do the math. A decent 100Mb pipe can push about 34GB an hour (this takes into account 25% lost to packet and transmission overhead), so a single terabyte takes roughly 30 hours to move. If you had three people with a terabyte drive each, you’d saturate a 100Mb uplink for days should they decide to back up to a device on the network. How are we going to back that up? The storage SaaS startups making their way to market aren’t going to be able to keep up either. Imagine backing up 400-700GB over your home Internet link where your upstream bandwidth is only 768Kbps.
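To sanity-check back-of-the-envelope numbers like these, here is a minimal Python sketch. The function names are my own, and it assumes decimal units (1Mb = 1,000,000 bits, 1GB = 1,000,000,000 bytes) and a flat 25% loss to overhead, so treat the results as rough estimates rather than measurements.

```python
def gb_per_hour(link_mbps, overhead=0.25):
    """Usable throughput of a link in gigabytes per hour.

    link_mbps is the nominal link speed in megabits per second;
    overhead is the assumed fraction lost to packet/transmission overhead.
    """
    usable_bytes_per_sec = link_mbps * 1_000_000 * (1 - overhead) / 8
    return usable_bytes_per_sec * 3600 / 1_000_000_000


def transfer_hours(data_gb, link_mbps, overhead=0.25):
    """Hours needed to push data_gb gigabytes through the link."""
    return data_gb / gb_per_hour(link_mbps, overhead)


if __name__ == "__main__":
    # A 100Mb LAN link, then a 768Kbps (0.768Mb) home upstream link.
    print(f"100Mb link: {gb_per_hour(100):.2f} GB/hour")
    print(f"1TB over 100Mb: {transfer_hours(1000, 100):.1f} hours")
    for size_gb in (400, 700):
        days = transfer_hours(size_gb, 0.768) / 24
        print(f"{size_gb}GB over 768Kbps: {days:.0f} days")
```

Under those assumptions a 100Mb link moves a little under 34GB an hour, a terabyte takes about 30 hours, and the 400-700GB home backup over a 768Kbps upstream link lands somewhere between two and four months of continuous uploading.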
I saw this coming a while back when I got my grubby hands on the Hitachi terabyte drive, and I have begun using a combination of VMware Player and VMware Workstation to mitigate my issues with capacious storage at home. I essentially virtualize the machine I want to use and deploy it on top of a generic OS install (in my case, Debian Linux), replete with a pretty icon instructing the user to launch the player as their “desktop.” I’ll eventually move up from Player to Workstation for all my machines (right now cost is limiting me to Player for most of them), then run snapshots and back up the snaps to the same location as the original VMDK using rsync.
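For the rsync step, something along these lines might work. This is a rough sketch, not my finished setup: the function names and the directory paths are hypothetical, and it assumes the VM is powered off (or otherwise quiesced) so the disk files are consistent while they are copied.

```python
import subprocess


def build_rsync_command(vm_dir, backup_dir, dry_run=True):
    """Build an rsync command that copies a VM directory to a backup location.

    vm_dir and backup_dir are hypothetical paths supplied by the caller.
    """
    # --archive preserves permissions, times and directory structure.
    # --inplace updates large files (such as a VMDK) in place at the
    # destination instead of writing a full temporary copy first.
    cmd = ["rsync", "--archive", "--inplace"]
    if dry_run:
        # --dry-run only reports what would be transferred.
        cmd.append("--dry-run")
    # A trailing slash on the source copies the directory's contents
    # rather than the directory itself.
    cmd += [vm_dir.rstrip("/") + "/", backup_dir]
    return cmd


def run_backup(vm_dir, backup_dir, dry_run=True):
    """Run the rsync command; raises CalledProcessError if rsync fails."""
    return subprocess.run(build_rsync_command(vm_dir, backup_dir, dry_run), check=True)


if __name__ == "__main__":
    # Example paths only; nothing is executed here, the command is just printed.
    print(" ".join(build_rsync_command("/vms/desktop", "/mnt/backup/desktop")))
```

Defaulting to a dry run is deliberate: it lets you see what rsync would move before you commit a multi-hundred-gigabyte transfer.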
It sounds like a lot of work, but try explaining to your wife that she’s lost all the projects she’s been working on and you don’t have a recent backup because her drive is too big to back up quickly. You’ll appreciate the effort that much more when you can say, “I’ve got you covered, hon!!”
Here’s the visual I use when I explain this concept.
1) Fold a piece of paper four times (or use a folded napkin)
1a) Imagine the paper (napkin) as your physical hard drive
2) Tear off two or three 1-inch pieces of that napkin. Put them on the table next to the napkin.
2a) Imagine those pieces as virtual hard drives or volumes.
3) Reorder those 1-inch pieces of the napkin. Easy, isn’t it?
4) Peel apart the layers of those 1-inch pieces. Now you have 4x as much stuff to manipulate, making it take a little longer to move things around the table, no?
4a) Imagine those layers as individual files.
Take this one step further. Blow a soft puff of air at the three 1-inch pieces before you peel them apart (this works best with the napkin as they are slightly “stuck” together). Think of that puff of air as a failure or some sort of issue with storage. Do the same when you’ve peeled apart the pieces.
Now you have a great way to envision the task of managing individual files (family photos) on a gargantuan hard drive (look how much napkin you have left!!). Multiply that out by a couple of napkins and you see why, all of a sudden, the problem of failed drives and how to protect against them becomes really hard in the TB-drive world. This can open eyes at the management level. It gives managers a real and appropriate understanding of why we as storage admins freak out at times when they refuse to allocate budget.
I started out talking about the advent of huge drives and what you are going to do to get the data back should they fail. I’ve developed my own solution to protect myself using some free and not-so-free tools from VMware, but I’m not sure it would scale well or be easily manageable. Perhaps a small challenge to the hardcore virtualizers out there is in order. . . .