This just in: Earth knocked off its axis due to weight of 295 exabytes of data! OK, maybe we're just wobbling on our axis a little bit, but that's a heckuva lot of data, and you're going to need an awful lot of disks, chips, tape, paper and anything else that might hold a petabyte here and there to accommodate it all.
That number -- 295 exabytes -- was reported in an article in Science Express, a journal published by the American Association for the Advancement of Science. The authors used some pretty complex computations to come up with that number, which they actually define as the amount of data we were able to store in 2007. Science Express looks like a pretty serious pub -- among the other articles in the same issue were "Tomography of Reaction-Diffusion Microemulsions Reveals Three-Dimensional Turing Patterns" and "Dynamic Control of Chiral Space in a Catalytic Asymmetric Reaction Using a Molecular Motor." These folks aren't fooling around . . . and the fact they didn't round the figure off to 300 exabytes adds a little edge of precision that impresses stat junkies like me.
Not only do we have to find a place to put all that data, but we're probably going to have to back it up and then stash away a copy or two for disaster recovery. So that 295 exabytes could turn into a few zettabytes of data. Can yottabytes of data be far behind?
More this just in: According to IDC, in the fourth quarter of 2010, "total disk storage systems capacity shipped reach 5,127 petabytes, growing 55.7% year over year." According to my seventh-grade math, that's approximately 290 exabytes short of what we need, but it's still a lot of disk.
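For anyone who wants to check that seventh-grade math, here is a minimal sketch of the subtraction. It assumes decimal units (1 exabyte = 1,000 petabytes), and the variable names are mine, not IDC's or the Science Express authors'.

```python
# Assumption: decimal (SI) units, so 1 exabyte = 1,000 petabytes.
PB_PER_EB = 1_000

stored_2007_eb = 295        # storage capacity figure reported in Science Express
shipped_q4_2010_pb = 5_127  # disk capacity shipped in Q4 2010, per IDC

# Convert the quarterly shipment to exabytes, then subtract.
shipped_q4_2010_eb = shipped_q4_2010_pb / PB_PER_EB
shortfall_eb = stored_2007_eb - shipped_q4_2010_eb

print(f"Shipped in Q4 2010: {shipped_q4_2010_eb:.3f} EB")
print(f"Shortfall versus 295 EB: {shortfall_eb:.3f} EB")
```

One quarter of shipments works out to a little over 5 exabytes, which is why the gap rounds to 290.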
Sooner or later, we're going to have to learn how to throw some of this stuff away. Once the attic and basement are crammed full, and little 0s and 1s are spilling out of the cupboards, we won't have any room for new data. What happens then? Your shop might not be in exabyte territory yet, but a surprising number of companies have crossed the petabyte threshold, and coping with capacity is an ongoing struggle even for shops with far more modest amounts of data to store.
The problem is that data housecleaning tools either don't exist or aren't up to the task at hand. Knowing what can be deep-sixed and what needs to be preserved means you need to know what you have in the first place. Few available products can give you much insight into the state of your data stores. A few years ago, it looked like data classification was poised to catch on -- if not as a product category then as an underpinning technology for a raft of storage management chores, like identifying data that belongs in the trash bin. Classification pretty much fizzled out, but maybe it can make a comeback now that we have renewed interest in automated storage tiering.
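To make "knowing what you have" a bit more concrete, here is a rough sketch of the simplest kind of inventory: walk a directory tree and total up file counts and bytes by how long ago each file was last modified. It is not a classification product or any vendor's method; the age buckets, the function names and the reliance on modification time alone are all assumptions made for illustration.

```python
import os
import time
from collections import defaultdict

# Hypothetical age buckets: (upper limit in days, label).
AGE_BUCKETS = [
    (30, "under 30 days"),
    (365, "30 days to 1 year"),
    (float("inf"), "over 1 year"),
]

def bucket_for(age_days):
    """Return the label of the first bucket whose limit exceeds the age."""
    for limit, label in AGE_BUCKETS:
        if age_days < limit:
            return label
    return AGE_BUCKETS[-1][1]

def summarize_by_age(root, now=None):
    """Total file counts and bytes under root, grouped by last-modified age."""
    now = time.time() if now is None else now
    totals = defaultdict(lambda: {"files": 0, "bytes": 0})
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            path = os.path.join(dirpath, name)
            try:
                info = os.stat(path)
            except OSError:
                continue  # skip files that vanish or can't be read
            age_days = (now - info.st_mtime) / 86400
            entry = totals[bucket_for(age_days)]
            entry["files"] += 1
            entry["bytes"] += info.st_size
    return dict(totals)
```

Calling `summarize_by_age("/some/share")` on a path of your choosing returns a dictionary keyed by bucket label. Age alone won't tell you what's safe to delete, but it's a start on seeing where the old stuff is piling up.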
So what do you do? You can ask your users to clean up their acts by voluntarily deleting all that useless old stuff. Somebody would listen, right? I'll be the first to admit that probably half of what I produce can eventually end up in the data dumpster without any profound effect on humanity, my company or anyone for that matter. You could try data storage quotas that limit what each user can save; quotas can work, but you'll become the second least popular person in your company, right after the guy who's been stealing everyone's lunch from the cafeteria fridge.
Even more this just in: According to CBC News, the Government of Canada is getting very serious about reducing the amount of data it stores: "The federal government has ordered a monster machine to chew up its discarded hard drives, USB thumb drives, CDs, and even ancient Beta videotapes." Why didn't I think of that? It's a perfect solution: a monster machine that eats data. Let's just hope it has a healthy enough appetite to eat 295 exabytes or has some hungry monster friends. But one CBC News reader had another idea: "It would be easier and cheaper just to buy them sledgehammers." That'd work, too.
Or maybe just put everything on solid-state storage. An article in the Journal of Digital Forensics, Security and Law ominously titled "Solid State Drives: The Beginning of the End for Current Practice in Digital Forensic Recovery?" suggests that you don't really have to worry if you're drowning in data; just put it on solid-state devices. At the end of the article's abstract, the two Australian authors wrote: "Our experimental findings demonstrate that solid-state drives (SSDs) have the capacity to destroy evidence catastrophically under their own volition, in the absence of specific instructions to do so from a computer." Good news if you need to put your data on a diet, I guess, but not so good for the solid-state storage industry.
Talk about connecting the dots: stories about oceans of data, skyrocketing disk sales, data-munching machines and not-so-solid-state storage, all in the same week. Maybe it's an omen. Maybe it's time to look for a real solution to soaring data stores (and the associated soaring cost of keeping it all) instead of just throwing more disk, tape or chips at the problem. I know it's counterintuitive for storage vendors to promote technologies that help their customers buy less stuff from them, but maybe there's a wily little startup out there with a great idea and a useful tool that can stem the data tide.
BIO: Rich Castagna (firstname.lastname@example.org) is editorial director of the Storage Media Group.