One of the great, and as yet unfulfilled, storage needs is automatically moving data to different tiers of storage, including tiers built from different vendors' products. How close is the industry to achieving this goal? "I don't think I'm supposed to talk about SMI-S [Storage Management Initiative Specification] release dates, so I'll just say that it's being worked on," says Edgar St. Pierre, co-chair of the Storage Networking Industry Association's (SNIA) ILM Technical Workgroup. Other knowledgeable industry observers say the additions to SMI-S needed to enable automated data tiering will probably be released in one to two years, with products based on those standards following soon after.
SNIA is working to standardize three areas: data classification service (to identify the data's service requirements), data service-level management (to map those requirements to the capabilities of the different storage tiers) and data lifecycle management (because a data set's storage requirements may change over time as its value changes). Value, of course, is a fuzzy, relative concept specific to each organization, not something easily encapsulated in an algorithm or standard. "It's not a single number," says St. Pierre.
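To make those three services concrete, here is a minimal sketch in Python of how classification, service-level mapping and lifecycle re-evaluation might fit together. Everything here is invented for illustration, not drawn from the SMI-S work itself: the requirement fields, the tier names and the cost figures are all hypothetical.

```python
from dataclasses import dataclass

# Hypothetical output of a data classification service: the data's
# service requirements, expressed as measurable targets.
@dataclass
class DataRequirements:
    max_latency_ms: float      # worst acceptable access latency
    min_availability: float    # e.g. 0.999 for "three nines"

# Hypothetical capabilities advertised by a storage tier.
@dataclass
class StorageTier:
    name: str
    latency_ms: float
    availability: float
    cost_per_gb: float

def place(data: DataRequirements, tiers: list) -> StorageTier:
    """Data service-level management: pick the cheapest tier whose
    capabilities satisfy the data's requirements."""
    candidates = [t for t in tiers
                  if t.latency_ms <= data.max_latency_ms
                  and t.availability >= data.min_availability]
    if not candidates:
        raise ValueError("no tier meets the requirements")
    return min(candidates, key=lambda t: t.cost_per_gb)

tiers = [
    StorageTier("fc-array",   latency_ms=5,     availability=0.9999, cost_per_gb=30.0),
    StorageTier("sata-array", latency_ms=15,    availability=0.999,  cost_per_gb=8.0),
    StorageTier("tape",       latency_ms=60000, availability=0.99,   cost_per_gb=0.5),
]

# Data lifecycle management: re-run placement as requirements relax with age.
fresh = DataRequirements(max_latency_ms=10, min_availability=0.9999)
aged = DataRequirements(max_latency_ms=120000, min_availability=0.99)
print(place(fresh, tiers).name)  # fc-array
print(place(aged, tiers).name)   # tape
```

The hard part the standard must address is the first step, which this sketch simply assumes: turning fuzzy, organization-specific business value into numbers like these.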
Mainframe data is already commonly moved among different storage tiers by various hierarchical storage management (HSM) programs. However, HSM data-movement policies are based solely on file attributes such as owner, location, creation date or date of last access. Such policies may be sufficient in some cases, but they bear no necessary relation to the data's changing business value.
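An attribute-based HSM policy of the kind described above can be sketched in a few lines of Python. The 90-day threshold and the choice of last-access time as the trigger are invented for illustration; real HSM products expose such parameters as policy settings.

```python
import os
import shutil
import time

AGE_LIMIT_DAYS = 90  # hypothetical policy threshold

def migrate_stale_files(hot_dir, cold_dir, now=None):
    """Move files from the 'hot' tier to the 'cold' tier based purely on a
    file attribute (date of last access) -- with no notion of the data's
    business value, which is exactly the limitation of classic HSM."""
    now = time.time() if now is None else now
    os.makedirs(cold_dir, exist_ok=True)
    moved = []
    for entry in os.scandir(hot_dir):
        if not entry.is_file():
            continue
        age_days = (now - entry.stat().st_atime) / 86400
        if age_days > AGE_LIMIT_DAYS:
            shutil.move(entry.path, os.path.join(cold_dir, entry.name))
            moved.append(entry.name)
    return moved
```

Note that a file untouched for 90 days may still be a quarterly report that becomes critical again next week; the attribute tells the policy nothing about that.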
Another big data-movement problem is the type of data and the characteristics of the application that uses it, says James Damoulakis, CTO at GlassHouse Technologies in Framingham, MA. For example, data in a database can't be relocated without understanding the business logic of the application and the dependencies that may exist among various tables within the database. One reason email archiving is so popular now, says Damoulakis, is that email objects can be categorized and managed relatively easily.
The next tiering question the standards need to settle is where data migration policies should reside. "In the application," says Damoulakis, "[because] the application is the only entity that understands the key business interrelationships of various data elements." David A. Deming, founder, president and CTO at Solution Technology in Ben Lomond, CA, disagrees: "In the array." He adds that the data-movement policy engine frequently resides in the storage fabric or on a dedicated application server.
In the future, says Deming, organizations will deploy separate SANs: one dedicated to backup and replicated data; and another solely for use by production applications, "like an iSCSI SAN that is separate from the production IP network."
For automated tiered storage to work across the entire data center, companies need systems that store data under a unique, meaningful name, with metadata kept separate from the data's storage address. That way, data can move to any storage address and applications can still find it. Technologies like Microsoft's WinFS file system, which incorporate database-type functionality into file systems, may help solve this problem when they're released.
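A toy sketch of that separation, again in Python: a catalog keyed by a unique name holds the metadata, including the data's current storage address, so relocating the data changes only the catalog entry and never the name applications use. The names and address strings here are hypothetical.

```python
class DataCatalog:
    """Maps a unique, meaningful name to metadata plus the data's current
    storage address, kept separate from the data itself."""

    def __init__(self):
        self._entries = {}

    def register(self, name, address, **metadata):
        self._entries[name] = {"address": address, "metadata": metadata}

    def relocate(self, name, new_address):
        # The tiering engine moves the bytes between tiers, then
        # updates only the address; the name is unchanged.
        self._entries[name]["address"] = new_address

    def resolve(self, name):
        # Applications look up data by name and never hard-code an address.
        return self._entries[name]["address"]

catalog = DataCatalog()
catalog.register("q3-sales-report", "tier1-array:/lun7/blk42", owner="finance")
catalog.relocate("q3-sales-report", "tier3-sata:/vol2/blk9")
print(catalog.resolve("q3-sales-report"))  # tier3-sata:/vol2/blk9
```

This is essentially the database-style indirection that file systems like WinFS promised to build in: resolve by name, not by location.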
Every storage technologist contacted for this article agreed that separate point solutions exist today for moving, classifying and searching data, and that these tools should be combined into a centralized data-movement policy engine. Until the standards are released, says Damoulakis, "pick the low-hanging fruit [email archiving, selective relocation of file data] and go through the manual effort of ensuring that applications are relocated to appropriate [storage] tiers."