Tiered storage allows an organization to optimize its data storage infrastructure to lower costs, increase performance and address capacity issues. However, an organization should look to optimize its data migration
In this podcast interview Jeff Boles, senior analyst and director, validation services at Hopkinton, Mass.-based Taneja Group, examines some of the business drivers behind storage tiering. He also explores some of the challenges associated with tiered storage, such as data migration, and how effectively addressing these challenges can be the key to tiered storage success.
You can read a transcript of the interview below or download the MP3.
The importance of data migration in storage tiering
SearchStorage.com: What are the possibilities for getting into tiered storage, and why would an organization want to?
Boles: The concept of tiered storage has been around for a while, going back to the days of HSM [Hierarchical Storage Management]. But around file storage today, with the runaway capacity requirements and escalating performance demands most of us are facing, it makes a lot of sense to tier storage for efficiency reasons.
If you can get the right storage in the right place, you can tweak the way you manage your storage between tiers and optimize your management resources. For data backup, you might be able to -- if you take fairly stale data and park it off to the side where you have to manage it less often -- optimize your backup processes for primary storage and trim the time and effort you're spending on managing backups. That goes for every dimension of storage management, whether it's capacity or performance management.
Tiered storage can increase your overall efficiency and optimize operational expenditures, but it's also a capital expenditure exercise. If you tier your storage, you can buy lower cost, higher capacity stuff to handle archival data for nearline storage with your lesser-used data. You want to get into tiering to optimize your storage, especially in the face of runaway capacity because there's too much to keep up with if you're not looking for a better approach than just adding more NAS units.
SearchStorage.com: What are some of the challenges associated with tiered storage?
Boles: There are a whole bunch of challenges out there when you're first moving into tiered storage, and some of them are architectural -- how you want to go about implementing tiered storage in your infrastructure. Thinning out different tiers of storage is simple, but figuring out how you're going to migrate data and keep it there for the long term is pretty challenging because there are a bunch of options on the market.
You can apply inline appliances, like F5 Networks Inc.'s Acopia switches. You can apply out-of-band solutions that migrate data in the off hours or with an engine that's out-of-band from your storage. You can choose to use a solution that integrates with vendor-specific APIs, like file policies, and redirects from primary storage to nearline or archival storage. Or you can use things like stub files.
There are a bunch of choices on the market, and a lot of them are architectural. Moreover, those choices involve emerging technologies, like cloud storage or solid-state drives (SSDs), that might be able to cache and offset some of the performance requirements. But it's really a matter of choosing how you're going to get data there, how you're going to access data when it's there and how you're going to manage it for the long term -- how you're going to migrate this storage after you've tiered it.
Data migration comes down to looking at engines with the right features -- whether you want an inline solution that sits on your NAS infrastructure and moves data for you, or you want the engine off to the side; whether you want the engine to operate constantly or periodically; and whether you want to administer it yourself or use a set of policies to always manage data migration.
The second dimension is how you're going to access the data. Do you want to access it through stub files? There's almost a kind of religious fervor behind the different alternatives out there, and frankly I don't see a whole lot of differentiation between the different approaches. You can make any approach work for you; it's just a matter of your design philosophy, how much maintenance you want to do and how much you worry about specific parts of the solution.
SearchStorage.com: How do an organization's tiered storage goals play into its data migration processes?
Boles: There are lots of different data migration policies and goals out there. So far we've talked about why and how, but as I think about tiered storage goals, I start to think about when to migrate. You need to understand when you're going to move data and whether you're so into the optimization exercise that you want it to be a dynamic, ongoing activity. You also need to understand if that's a fundamental requirement for you or if you prefer a static, interval activity.
Moving beyond that, you need to think about what to optimize. Look at understanding your data, and figuring out how you want to work with the data set, optimize it and tier it. It's not that hard of a question, but it's pretty challenging to look at that data and figure out what's there. You can look at a number of file classification tools to understand file storage, or even just performance for different sets of data. Make some decisions about where you want to put that data based on different parameters, with this idea of optimizing storage efficiency in mind.
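As a rough illustration of the classification step described above, the following Python sketch groups files by how long ago they were last accessed and suggests a tier for each. It is not from the interview: the tier names, the 30-day and 180-day thresholds, and the function names are all assumptions made for illustration, and a real file classification tool would weigh many more parameters than access time.

```python
import time
from pathlib import Path
from typing import Dict, List, Optional

# Hypothetical rules: (maximum days since last access, tier name).
# The thresholds and tier names are illustrative assumptions.
TIER_RULES = [(30, "primary"), (180, "nearline")]
DEFAULT_TIER = "archive"


def tier_for_age(age_days: float) -> str:
    """Return the first tier whose age limit the file falls under."""
    for max_age_days, tier in TIER_RULES:
        if age_days < max_age_days:
            return tier
    return DEFAULT_TIER


def classify_tree(root: str, now: Optional[float] = None) -> Dict[str, List[str]]:
    """Walk `root` and group file paths by suggested tier.

    Uses the file's last access time, which may be unreliable on
    filesystems mounted with access-time updates disabled.
    """
    now = time.time() if now is None else now
    plan: Dict[str, List[str]] = {"primary": [], "nearline": [], "archive": []}
    for path in Path(root).rglob("*"):
        if path.is_file():
            age_days = (now - path.stat().st_atime) / 86400
            plan[tier_for_age(age_days)].append(str(path))
    return plan
```

Calling `classify_tree("/some/share")` returns a dictionary of candidate files per tier. It only produces a plan; actually moving the data is left to whatever migration engine is in use.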
There might be some other reasons to optimize storage too, so this is more than just an IT exercise. When you get into this with a big company, you need to look at collaborative teams that can represent some of the different business data owners in the enterprise, and look at aligning tiering processes for each type of data with some of the service levels that you plan on delivering back to those business owners.
So you may have low performance requirements with a critical data set, but they might require the level of protection that you're going to apply only to your top tier of storage. This is a pretty complex web and there needs to be a collaborative team process. The efficiencies you can get from tiering your storage are pretty tremendous and can justify cross-disciplinary, cross-business-owner exercises at most companies.
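The point about a low-performance but critical data set can be made concrete with a small Python sketch. It assumes, purely for illustration, three tiers ordered from most to least capable and a placement rule that picks the more demanding of a data set's performance and protection requirements. The tier names, the `DataSet` fields and the `placement` function are hypothetical and not drawn from the interview.

```python
from dataclasses import dataclass

# Hypothetical tiers, ordered from most to least capable (and costly).
TIERS = ["tier1", "tier2", "tier3"]


@dataclass
class DataSet:
    name: str
    owner: str             # business owner consulted on service levels
    performance_tier: str  # lowest tier that meets performance needs
    protection_tier: str   # lowest tier that meets protection needs


def placement(ds: DataSet) -> str:
    """Pick the more demanding of the two requirements.

    A lower index in TIERS means a more capable tier, so the minimum
    index satisfies both the performance and protection requirements.
    """
    return min(ds.performance_tier, ds.protection_tier, key=TIERS.index)


# A rarely read but critical data set still lands on the top tier.
contracts = DataSet("contracts", "legal", performance_tier="tier3",
                    protection_tier="tier1")
print(placement(contracts))
```

Under these assumptions the placement for `contracts` is `tier1`, even though its performance needs alone would have put it on the lowest tier. That is why the business owners' input matters alongside what a classification tool reports.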
It is really a matter of getting the right people at the table so you can understand your data beyond what just a classification tool would give you. So there are two dimensions to understanding the data: understanding what the data actually is and who owns it, and then understanding what it is from the business owners' perspective as well.
This was first published in April 2010