Computer users have a hard time saying goodbye. Not to each other, or to you or me, but to the data they create, receive or just stumble upon. They like to keep everything because letting go is just so (sniff!) wrenching. Whether it's a Word doc, a spreadsheet, or a .gif, .jif or .dif file, bidding adieu to data is too much for many users to bear.
But while your users spare themselves tearful ta-tas, it's your storage shop that has to shoulder the burden of the massive amounts of data that keep on growing and growing and growing.
A recent Veritas publication -- aptly titled "The Data Hoarding Report" -- cited a survey sponsored by the vendor in which 62% of office professionals admitted that they are (gasp!) data hoarders. The IT pros ostensibly responsible for managing that data are even guiltier, however: 81% are self-confessed data accumulators as well.
The report went on to note the issue with data storage capacity isn't just where to put all this stuff -- it's more of a case of why do we have all this stuff, because 86% of the data companies store is "Redundant, Obsolete [or] Trivial." That conclusion also offers up one of the best -- and most accurate -- acronyms we've seen in a long time: ROT.
The report notes that storing and managing all that ROT ends up costing businesses billions and billions of dollars. Whether or not you buy the report's nearly $1 trillion cost estimate, you've got to figure ROT, whatever the cost, is a black hole for storage budgets. And while data storage capacity has gotten pretty cheap -- and is bound to get even more affordable -- it still costs something.
Data storage capacity, meanwhile, is really only one part of the equation. In addition to providing a home for voluminous data stores, you have to back them up and then create some copies of those backups to tuck away here and there for safekeeping. You'll also have to manage all of that to ensure copies are made properly and everything is up to date and in sync. For many companies, the amount of data requiring this kind of care has simply overrun the IT staff's ability to handle it all within a day that stubbornly still lasts only 24 hours.
Data hoarders 'r' us
We're up to our ears in ROT because of knee-jerk reactions to compliance and an enthusiastic embrace of big data. When the likes of the Health Insurance Portability and Accountability Act, Sarbanes-Oxley, and a whole new generation of Securities and Exchange Commission and financial rules first raised compliance consciousness, there were essentially two camps. One said "delete everything" to avoid uncomfortable situations, like smoking-gun email trails that land your company in litigation hot water. The other said that approach was shortsighted, so instead we should keep everything -- you know, just in case.
While the latter camp was gaining converts, and we amassed more and more data, along came the allure of big data analysis. Big data is turning all of us into data hoarders. The whole premise of big data analysis is that even seemingly useless data, when matched up in some creative way with other bits of data, can yield useful information that will help your company squash the competition and achieve world dominance.
Unfortunately, a lot of that ROT will always be ROT, regardless of how cleverly it's twisted and turned and manipulated. But merely the possibility of unearthing hidden gems is enough to inspire most companies to keep every bit and byte they create.
Time for an intervention
Now, what if we had a way to separate the ROT from truly useful data? It would be nice if vendors would look up from feverishly writing SAN and NAS orders long enough to see that most storage shops desperately need help managing information. Seems like I go on this rant every year, but there are so few developments in this space that the issue just doesn't go away.
A handful of companies are really trying to address the out-of-control data storage capacity issue. DataGravity, still really a startup, is practically a poster child for the better data management movement. So is Actifio, which pioneered the concept of not creating a zillion copies of data, but rather spinning off just one or two duplicates at a time and then creating others as needed.
With its capability of storing detailed user-generated metadata, object storage still promises to be one of the key building blocks of a truly manageable, data-centric storage environment. Caringo's FileFly app takes a big step in that direction by allowing storage managers to create policies embedded in metadata that describe the disposition of each piece of data. I know there are other vendors doing good things in this area, but there aren't enough of them, and most of the big iron storage vendors are still basically ignoring the issue.
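To make the idea of policy-in-metadata concrete, here is a minimal Python sketch of how a disposition decision might be read straight from an object's own metadata. It is an illustration only, not how FileFly or any particular object store works: the `StoredObject` type and the `retain-days`, `legal-hold` and `on-expiry` keys are all hypothetical names invented for this example.

```python
from dataclasses import dataclass, field
from datetime import date, timedelta


@dataclass
class StoredObject:
    """A stored object plus the user-defined metadata that travels with it."""
    key: str
    created: date
    metadata: dict = field(default_factory=dict)  # hypothetical keys, see below


def disposition(obj: StoredObject, today: date) -> str:
    """Return 'keep', 'archive' or 'delete' based only on the object's metadata."""
    # Assumed convention: a legal hold overrides any retention period.
    if obj.metadata.get("legal-hold") == "true":
        return "keep"
    retain_days = obj.metadata.get("retain-days")
    if retain_days is None:
        return "keep"  # no policy attached, so do nothing destructive
    expires = obj.created + timedelta(days=int(retain_days))
    if today <= expires:
        return "keep"
    # Once the retention period has passed, follow the object's own instruction,
    # defaulting to the least destructive option.
    return obj.metadata.get("on-expiry", "archive")


report = StoredObject(
    key="q1-forecast.xlsx",
    created=date(2015, 1, 1),
    metadata={"retain-days": "365", "on-expiry": "delete"},
)
print(disposition(report, date(2016, 6, 1)))
```

The point of the design is that the rule lives with the data rather than in a separate catalog, so any tool that can read the metadata can act on it.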
Users need help determining what to keep and what to ditch. The Veritas report listed the top reasons why users are data hoarders: 47% are afraid they'd delete something they might need later, and 43% simply don't know what to delete or keep.
Storage pros can help users make those decisions, but they need the tools to turn those data storage capacity decisions into constructive action.
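As a rough illustration of what such a tool might do, the Python sketch below sorts files into the three ROT buckets using deliberately crude rules: identical content is redundant, anything untouched for years is obsolete and empty files are trivial. The `FileRecord` type, the `triage` function and the three-year threshold are assumptions made for this example, not figures from the Veritas report, and anything that doesn't match a rule is left for a human to review.

```python
import hashlib
from dataclasses import dataclass
from datetime import datetime, timedelta


@dataclass
class FileRecord:
    path: str
    content: bytes
    last_modified: datetime


def triage(records, now, max_age_days=3 * 365, min_bytes=1):
    """Label each record 'redundant', 'obsolete', 'trivial' or 'review'."""
    seen = set()   # content digests already encountered
    labels = {}
    # Visit newest files first so the most recent copy of any duplicate
    # is the one that survives.
    for rec in sorted(records, key=lambda r: r.last_modified, reverse=True):
        digest = hashlib.sha256(rec.content).hexdigest()
        if digest in seen:
            labels[rec.path] = "redundant"
        elif now - rec.last_modified > timedelta(days=max_age_days):
            labels[rec.path] = "obsolete"
        elif len(rec.content) < min_bytes:
            labels[rec.path] = "trivial"
        else:
            labels[rec.path] = "review"  # no rule matched; a person decides
        seen.add(digest)
    return labels


now = datetime(2016, 1, 1)
files = [
    FileRecord("a.doc", b"report", datetime(2015, 12, 1)),
    FileRecord("copy of a.doc", b"report", datetime(2015, 6, 1)),
    FileRecord("old.xls", b"budget 2009", datetime(2009, 1, 1)),
    FileRecord("empty.txt", b"", datetime(2015, 11, 1)),
]
for path, label in triage(files, now).items():
    print(f"{label:10} {path}")
```

Rules this simple would never be enough on their own, but they show how the "what to keep, what to ditch" question can be turned into a shortlist that users only have to confirm.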