This article can also be found in the Premium Editorial Download "Storage magazine: R.I.P. RAID?."
The world of file content and NAS storage is disjointed and threatening; we need to unravel the problem of massive file stores before the issue gets too big to handle.
Let's face it: The big problem with file content is users. People create, copy, convert, forward, edit, scan and download files all day long. It's the Wild West of storage without many controls or restrictions. I remember one customer who discovered they had 125 copies of a scanned Chinese menu on their tier 1 storage system. Wild . . .
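To make the duplicate-copy problem concrete, here is a minimal sketch of how one might hunt for redundant files on a share. It assumes the share is mounted as an ordinary directory tree and that byte-identical content is what counts as a "copy"; the function names (`file_digest`, `find_duplicates`, `wasted_bytes`) are hypothetical, not part of any storage product.

```python
import hashlib
import os
from collections import defaultdict


def file_digest(path, chunk_size=1 << 20):
    """Return the SHA-256 hex digest of a file, reading it in chunks."""
    h = hashlib.sha256()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(chunk_size), b""):
            h.update(chunk)
    return h.hexdigest()


def find_duplicates(root):
    """Map each content digest to the sorted paths sharing it (2+ copies only)."""
    # Pass 1: bucket by size, since files of different sizes cannot be identical.
    by_size = defaultdict(list)
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            path = os.path.join(dirpath, name)
            try:
                if os.path.islink(path):
                    continue  # skip symlinks so a link is not counted as a copy
                by_size[os.path.getsize(path)].append(path)
            except OSError:
                continue  # unreadable or vanished file: skip it

    # Pass 2: hash only the files that share a size with at least one other file.
    by_digest = defaultdict(list)
    for paths in by_size.values():
        if len(paths) < 2:
            continue
        for path in paths:
            try:
                by_digest[file_digest(path)].append(path)
            except OSError:
                continue
    return {d: sorted(p) for d, p in by_digest.items() if len(p) > 1}


def wasted_bytes(duplicates):
    """Bytes consumed by every copy beyond the first in each duplicate group."""
    return sum(
        os.path.getsize(paths[0]) * (len(paths) - 1)
        for paths in duplicates.values()
    )
```

Bucketing by size first keeps the expensive hashing limited to files that could plausibly match. On a multi-petabyte store even this would need to be sampled or run per share, so treat it as an illustration rather than a tool.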
Look inside any company and consider the hundreds, thousands or tens of thousands of individuals creating -- and recreating -- content, and it's not hard to see how easily file sprawl becomes a pervasive and very big problem. More and more companies have hundreds of terabytes or even petabytes of file storage. In many cases, storage managers have no idea how much file content they have, the value of that content, how much it's costing them, where it's being stored or how it's being protected.
We're not only creating tons of files, we're creating huge files in the form of images, video and audio content. So, lots and lots of files, including some truly big files, add up to the essentially unchecked consumption of expensive and hard-to-manage IT infrastructure.
This brings me to the next big problem with file content: How we store it. A great deal of file content gets parked on NAS storage systems and although
I've been talking with some big NAS shops lately, and one of their biggest challenges is NAS migration. Companies with hundreds of terabytes or petabytes of NAS file content feel like they're essentially tethered to specific NAS devices because the complexity of moving that data is often perceived as an insurmountable challenge or at least far more trouble than it's worth. One user told me he felt he was being perpetually held for ransom by his NAS storage.
Is unstructured another word for useless?
We often refer to files as unstructured data. By its very nature, this type of content lacks a defining structure, so it can be hard for IT professionals to clearly classify the usefulness of file data. However, we don't dare delete it because there's always the risk that it will be needed some day; for most companies, the capital cost of buying the gear to store all that data is perceived as lower than the cost of that risk.
Interestingly, industry studies have found that 60% to 80% of unstructured content is never used again 90 days after its creation. That statistic alone makes unstructured content seem synonymous with "useless" content. It costs so much to store and protect file content, so why not use it? Is it because the content has no sustainable value, or is it because we just don't have the tools to easily and effectively make use of it?
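A shop that wants to check its own numbers against that 90-day statistic could start with something like the sketch below. It assumes the file system is mounted locally and that "used" can be approximated by the later of a file's access and modification times; the function name `stale_report` and the fields it returns are hypothetical. Access times in particular are an assumption to verify, because many systems are mounted with options that update them rarely or not at all.

```python
import os
import time


def stale_report(root, days=90, now=None):
    """Count files (and bytes) not accessed or modified in the last `days` days."""
    now = time.time() if now is None else now
    cutoff = now - days * 86400  # seconds in a day

    total_files = stale_files = 0
    total_bytes = stale_bytes = 0
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            try:
                st = os.stat(os.path.join(dirpath, name))
            except OSError:
                continue  # unreadable or vanished file: skip it
            total_files += 1
            total_bytes += st.st_size
            # Treat the most recent of atime/mtime as the "last use" (assumption).
            if max(st.st_atime, st.st_mtime) < cutoff:
                stale_files += 1
                stale_bytes += st.st_size

    return {
        "total_files": total_files,
        "stale_files": stale_files,
        "total_bytes": total_bytes,
        "stale_bytes": stale_bytes,
        "stale_file_pct": 100.0 * stale_files / total_files if total_files else 0.0,
    }
```

A report like this says nothing about whether stale content has value; it only shows how much of the store nobody has touched lately, which is the first fact needed to have that conversation.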
Backup gets even harder
I believe the biggest challenge in a petabyte world is backup. Consider our new storage landscape with those hundreds of terabytes or petabytes of file content being stored on multiple storage systems.
Now ask yourself: How do you protect all of that file content? Then think about how much that protection will cost you, not just in dollars, but in time and resources, too. Legacy methods and the status quo can't meet today's requirements. This means either a new method of file protection is required or you're just rolling the dice when it comes to recovering data. The latter choice is a hard one to make, especially when you consider that the consequences of a failed recovery could permanently damage your business. This is one of the biggest issues our data centers will confront this decade.
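The time cost is easy to underestimate, so a back-of-the-envelope calculation helps. The sketch below assumes a single full backup streamed at a constant sustained rate, using decimal units; the function name `full_backup_hours` and the 500 MB/s figure are illustrative assumptions, not measurements from any particular environment.

```python
def full_backup_hours(capacity_tb, throughput_mb_per_s):
    """Hours needed to stream `capacity_tb` terabytes at a sustained rate.

    Uses decimal units (1 TB = 1,000,000 MB) and ignores everything that
    slows real backups down: small files, metadata walks, network contention.
    """
    if throughput_mb_per_s <= 0:
        raise ValueError("throughput must be positive")
    seconds = capacity_tb * 1_000_000 / throughput_mb_per_s
    return seconds / 3600


if __name__ == "__main__":
    # Hypothetical capacities at an assumed sustained rate of 500 MB/s.
    for capacity_tb in (100, 500, 1000):
        hours = full_backup_hours(capacity_tb, throughput_mb_per_s=500)
        print(f"{capacity_tb:>5} TB at 500 MB/s: {hours:7.1f} h ({hours / 24:.1f} days)")
```

Under these assumptions a petabyte takes roughly 23 days to copy once, which is why a nightly or even weekly full backup stops being a realistic plan at that scale.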
For the longest time we've been able to get by doing business as usual and solving -- or forestalling -- the problem by throwing more IT infrastructure and people at it. But now we're at an inflection point where we can no longer be complacent with the status quo. Managing massive file stores is one of the "big" problems in the data center for the decade, and IT professionals need to sound the alarm and make this a real priority.
BIO: Tony Asaro is senior analyst and founder of Voices of IT.
This was first published in May 2010