Having all the space you need for storage is great, but how do you actually find something when you need it?
I still have a football from my last game as a stellar 12-year-old quarterback. (Unfortunately, my mental aptitude continued to flourish while my physical size and abilities went straight into retirement.) Don't ask me to produce the ball; I know it's in the attic, but I'll never find it.
Think of the giant reams of unstructured data you have as your attic. Can you find your football? I doubt it.
We can shove hundreds of terabytes that used to take up more space than the state of Rhode Island into something the size of a four-foot box. Now we have something the size of the Grand Canyon, and we've filled it up with stuff. While we know everything is in the Grand Canyon, it's mostly useless because we can't actually find anything we need. Hey, where's that check I wrote to Storage Swinger Review?
We've placed more value on getting stuff in than on getting it out, and that's going to hurt us. Finding something is harder than storing it -- like finding my football in my attic. That's why intelligent search and indexing (ISI) will be the hot topic of conversation now that we've solved the "giant storage, really cheap" issue. Finding things that are relevant when they're needed is how you derive value from an existing data asset.
ISI is how we're going to drive value out of our data after its original use. It's also how information lifecycle management (ILM), or data protection lifecycle management (DPLM) when the data in question is backup or archived data, becomes functionally usable.
To make an xLM (ILM or DPLM) environment useful, you first need to classify data. You can do this manually (most do), but that leaves you with the same problem the next time someone comes up with an even cheaper storage solution. It's better to do this with a combination of manual assessment and automated enforcement, a la tools from companies like Arkivio or Seven Ten Storage Software. Creating rules is a good idea, but making sure they're followed is even better. I hear firms like Index Engines are doing cool stuff in this area, but I don't know anything about them.
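To make "manual assessment plus automated enforcement" a little more concrete, here is a minimal Python sketch. It is not how Arkivio, Seven Ten or anyone else actually does it; the tier names, the age thresholds and the `FileRecord` fields are all assumptions invented for illustration. A person writes the rules once, and the code checks every file against them and reports what is in the wrong place.

```python
from dataclasses import dataclass
from typing import Callable, List, Tuple


@dataclass
class FileRecord:
    """Hypothetical description of one file; the fields are illustrative."""
    path: str
    days_since_access: int
    current_tier: str


# A rule pairs a target tier with a test. Rules are checked in order and
# the first match wins. The thresholds are made-up examples, not advice.
Rule = Tuple[str, Callable[[FileRecord], bool]]

RULES: List[Rule] = [
    ("archive", lambda f: f.days_since_access > 365),
    ("nearline", lambda f: f.days_since_access > 90),
    ("primary", lambda f: True),  # catch-all so every file gets a tier
]


def classify(record: FileRecord) -> str:
    """Return the tier that the first matching rule assigns to the file."""
    for tier, matches in RULES:
        if matches(record):
            return tier
    raise ValueError(f"no rule matched {record.path}")


def enforcement_plan(records: List[FileRecord]) -> List[Tuple[str, str, str]]:
    """List (path, current tier, required tier) for files that break the rules."""
    plan = []
    for record in records:
        required = classify(record)
        if required != record.current_tier:
            plan.append((record.path, record.current_tier, required))
    return plan


if __name__ == "__main__":
    files = [
        FileRecord("/data/q3_report.doc", 12, "primary"),
        FileRecord("/data/old_invoice.pdf", 400, "primary"),
    ]
    for path, current, required in enforcement_plan(files):
        print(f"{path}: move from {current} to {required}")
```

The point of separating `classify` from `enforcement_plan` is the one made above: writing the rules is the manual part, and checking that reality matches them is the part worth automating.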
Most of the players deal with "new" data -- they ingest the data to act on it. I like what Kazeon is up to, as it can deal with the new data and legacy data that's out there. It can help you to figure out what the heck is in the Grand Canyon and will then create a catalog/index so you can find things that might be useful. I expect we'll see a lot of entrants in this space in the near future.
Search, indexing, cataloging, etc. could be the most important storage/data management category in years. I see products taking the best of storage resource management (figuring out what's out there, but in specific terms: what file, who wrote it, etc.) and combining it with where stuff is and the ability to look inside specific content to create usable menus of everything.
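As a rough illustration of that combination, the Python sketch below keeps a small catalog of file metadata (who wrote it, where it lives) next to an inverted index of the words inside each file, so a search can use both. The `Document` and `Catalog` names, their fields and the simple word-splitting are assumptions made for the example; real products are far more sophisticated than this.

```python
import re
from collections import defaultdict
from dataclasses import dataclass
from typing import Dict, List, Optional, Set


@dataclass
class Document:
    """Hypothetical stand-in for a stored file and its extracted text."""
    path: str
    author: str
    text: str


def tokenize(text: str) -> List[str]:
    """Lowercase the text and split it into simple alphanumeric terms."""
    return re.findall(r"[a-z0-9]+", text.lower())


class Catalog:
    """Metadata catalog plus an inverted index over document content."""

    def __init__(self) -> None:
        self.metadata: Dict[str, Dict[str, str]] = {}
        self.index: Dict[str, Set[str]] = defaultdict(set)

    def add(self, doc: Document) -> None:
        # Record the "what file, who wrote it" facts.
        self.metadata[doc.path] = {"author": doc.author}
        # Record which files contain each term.
        for term in set(tokenize(doc.text)):
            self.index[term].add(doc.path)

    def search(self, query: str, author: Optional[str] = None) -> List[str]:
        """Return paths containing every query term, optionally by author."""
        terms = tokenize(query)
        if not terms:
            return []
        hits = set.intersection(*(self.index.get(t, set()) for t in terms))
        if author is not None:
            hits = {p for p in hits if self.metadata[p]["author"] == author}
        return sorted(hits)


if __name__ == "__main__":
    catalog = Catalog()
    catalog.add(Document("/attic/check.txt", "steve", "Check written to Storage Swinger Review"))
    catalog.add(Document("/attic/notes.txt", "pat", "Notes on storage pricing"))
    print(catalog.search("check review"))
    print(catalog.search("storage", author="pat"))
```

Asking the catalog for "check review" is the attic problem in miniature: the answer comes from the index, not from digging through every box.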
The bottom line is that I don't know what I'm going to want to know later on. I only know that I'm going to want to know something and no matter how big that disk is, it's still too dumb to tell me.
Join us in the wild and wacky world that is Steve Duplessie's view on storage. Each month we'll add a new Steve Duplessie blog that will not only keep you up to date with the fast-paced storage market, but also entertain you.
This column by Mr. Duplessie first appeared in Storage magazine's September 2005 issue.
About the author:
Steve Duplessie is the founder and senior analyst for Enterprise Strategy Group in Milford, Mass.