Analysts tell us that the root problem in IT today is a disconnect between actual business requirements and the services provided by the IT infrastructure. Most IT professionals will admit that they know little about the content of their systems or how it should be treated. Called "governance" in business texts, the alignment between high-level business goals and tactical operations is critical, yet sorely lacking in many firms. A conversation about information lifecycle management (ILM) is one way to bridge this chasm of understanding.
ILM is a conceptual state in which data is stored in accordance with its changing business value. Even though serious technical and process gaps must be closed before this kind of aligned storage can be realized, you can begin preparing for it today.
Conceptually, ILM requires three elements: value, alignment and change. Each element is critical to the overall vision of ILM, and each will prove vexing to address, especially in computer technology.
First, ILM needs objective metrics of data value. This is harder than it sounds—value goes beyond the simple age and access metrics used by hierarchical storage management (HSM) applications. Value metrics have to take into account relevance to the organization's core business, risk of unavailability and other factors that aren't easy to quantify. Many storage resource management products are beginning to include ad hoc, user-entered value metrics, Boolean matching and similar rudimentary systems to calculate value, but this technology is still in its infancy.
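To make the idea concrete, here is a minimal sketch of a value metric that blends an HSM-style recency figure with ratings entered by the data owner. The field names, the weights and the 0-to-1 scales are all assumptions chosen for illustration; they don't come from any particular storage resource management product.

```python
from dataclasses import dataclass


@dataclass
class DataItem:
    """Hypothetical description of one piece of data; every field is illustrative."""
    name: str
    days_since_access: int      # the classic HSM-style metric
    business_relevance: float   # 0.0-1.0, entered by the data owner
    unavailability_risk: float  # 0.0-1.0, cost to the business if the data is unavailable


def value_score(item, half_life_days=90):
    """Blend recency with business-supplied ratings into a 0.0-1.0 score.

    The weights are arbitrary assumptions; a real policy would be agreed
    with the business owners of the data.
    """
    # Recency decays by half every `half_life_days` without access.
    recency = 0.5 ** (item.days_since_access / half_life_days)
    score = (0.2 * recency
             + 0.4 * item.business_relevance
             + 0.4 * item.unavailability_risk)
    return round(score, 3)


ledger = DataItem("general-ledger", days_since_access=180,
                  business_relevance=1.0, unavailability_risk=0.9)
scratch = DataItem("scratch-export", days_since_access=1,
                   business_relevance=0.1, unavailability_risk=0.0)
print(value_score(ledger), value_score(scratch))
```

With these made-up inputs, the ledger that nobody has opened in six months still outranks the scratch file touched yesterday, which is exactly the judgment that age and access metrics alone would get wrong.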
ILM also requires that information be protected in a way that's aligned with its value. Data is normally stored in large chunks (think LUNs) rather than in the fine-grained records to which the metrics apply, and moving a single piece of data is difficult even where finer granularity exists. Again, many companies are working on the alignment issue, with virtualization technology providing the ability to move tiny blocks of data from one storage medium to another. In addition, application-integrated software allows stubs to take the place of records in databases and e-mail systems.
The third element of ILM is the idea that the value, and thus the alignment, of a piece of information can change over time. This requires an application-linked system of metrics and movement, which has proved elusive up to this point.
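The change element can be pictured as a periodic comparison between where data sits and where its current value says it belongs. The sketch below assumes that a value score between 0 and 1 already exists for each item; the tier names and thresholds are hypothetical.

```python
def tier_for(score):
    """Map a 0.0-1.0 value score to a tier name; the thresholds are assumptions."""
    if score >= 0.7:
        return "tier1"
    if score >= 0.3:
        return "tier2"
    return "tier3"


def plan_moves(placement, scores):
    """List (item, current tier, target tier) for every item that is misplaced."""
    moves = []
    for name, current in sorted(placement.items()):
        target = tier_for(scores[name])
        if target != current:
            moves.append((name, current, target))
    return moves


# Where each item is stored today, and its freshly recalculated value score.
placement = {"q1-invoices": "tier1", "hr-policies": "tier2", "old-build-logs": "tier2"}
scores = {"q1-invoices": 0.45, "hr-policies": 0.5, "old-build-logs": 0.1}

moves = plan_moves(placement, scores)
for name, current, target in moves:
    print(f"{name}: {current} -> {target}")
```

The hard part in practice isn't this comparison but feeding it: the scores have to come from the application, and the moves have to be carried out without the application losing track of its data.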
Today's ILM product initiatives generally amount to little more than repackaging old products with new names. But with so many companies investing in ILM technologies—a concept that's fundamentally sound and necessary—it's a foregone conclusion that all data will be managed with ILM in the future. When that will happen is a matter of debate, but the essential concepts are on the way and some basic products are here.
It will be extremely difficult to meet all three ILM requirements for the vast pools of unstructured data that make up most deployed storage today. But all three are available for some applications. E-mail systems, document repositories and databases, for example, contain the essential metadata to feed an ILM system. And new "extender" products, from EMC/Legato, KVS and others, enable the essential mobility element for these systems. If you have an automated e-mail archiving and retrieval system, you have ILM for at least one application.
HSM software can be considered simple-minded ILM for file systems. It typically monitors the only metric available for files, age, and realigns data protection by selectively moving files from one type of storage to another. But HSM has shortcomings in all three areas. First, a file's creation or modification date isn't really a valid metric for data value. Another problem is the file-level granularity of HSM; many applications store data elements inside files, but HSM can't reach into a file and move just one record. Finally, moving a file to a different storage system might not change its essential protection attributes; there's more to storage alignment than reducing the cost of the underlying storage system.
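The age-based selection at the heart of HSM is simple enough to sketch in a few lines. This example only identifies migration candidates by modification time; it doesn't move anything or leave stubs behind, and the 365-day threshold and the function name are assumptions, not the behavior of any specific HSM product.

```python
import os
import tempfile
import time
from pathlib import Path

SECONDS_PER_DAY = 86400


def migration_candidates(root, max_age_days, now=None):
    """Return files under `root` last modified more than `max_age_days` ago."""
    now = time.time() if now is None else now
    cutoff = now - max_age_days * SECONDS_PER_DAY
    return sorted(
        path for path in Path(root).rglob("*")
        if path.is_file() and path.stat().st_mtime < cutoff
    )


# Demonstration on a throwaway directory: one file is backdated by 400 days.
with tempfile.TemporaryDirectory() as demo_root:
    old_file = Path(demo_root, "old-report.txt")
    new_file = Path(demo_root, "new-report.txt")
    old_file.write_text("last year's numbers")
    new_file.write_text("this week's numbers")
    backdated = time.time() - 400 * SECONDS_PER_DAY
    os.utime(old_file, (backdated, backdated))
    candidate_names = [p.name for p in migration_candidates(demo_root, max_age_days=365)]

print(candidate_names)
```

Note what the function can't see: it knows nothing about what is inside either file or what the file is worth to the business, and it treats each file as an indivisible unit.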
Perhaps the most intriguing alignment technology is content-addressable storage (CAS). By breaking the bonds of physical connectivity (think SAN) and file-system layout (think NAS), CAS would theoretically allow an entirely virtualized pool of objects, complete with the ad hoc metrics of value and data movement. While today's CAS systems are fairly basic, vendors are likely to catch on and begin offering CAS networks with tiers of storage, per-record data protection and automated movement on changes.
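For readers who haven't worked with CAS, the core idea is that an object's address is derived from its content rather than from a path or a block number. The toy in-memory store below is a sketch of that principle only; the class, its methods and the metadata keys are hypothetical and don't represent any vendor's interface.

```python
import hashlib


class ContentStore:
    """Toy in-memory content-addressable store (illustrative only)."""

    def __init__(self):
        self._objects = {}   # address -> bytes
        self._metadata = {}  # address -> dict of ad hoc attributes

    def put(self, data, **metadata):
        """Store `data` and return its address, the SHA-256 digest of the content."""
        address = hashlib.sha256(data).hexdigest()
        self._objects[address] = data
        self._metadata.setdefault(address, {}).update(metadata)
        return address

    def get(self, address):
        """Retrieve an object by its content address."""
        return self._objects[address]

    def metadata(self, address):
        """Return a copy of the attributes attached to an object."""
        return dict(self._metadata[address])


store = ContentStore()
address = store.put(b"signed contract, rev 3", value="high", retain_years=7)
same_address = store.put(b"signed contract, rev 3")  # identical content, same address

print(address == same_address)
print(store.metadata(address))
```

Because the application holds only the address, the store is free to keep the object wherever its attached metrics say it belongs, which is what makes the approach interesting for ILM.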
The biggest hurdle to CAS, however, is beyond the realm of hardware vendors. Applications must be ready to address data as CAS records rather than as files or blocks, and precious few have this ability. But again, there's a technology fix on the horizon. Most operating system vendors have been laboring to build new database-type file systems. But these keep getting pushed to the next generation or remain obscure—such as the Apple Newton's "soup," Windows Cairo OFS, BeOS BFS and Windows Longhorn WinFS. However, it's likely that a database-enabled file system will eventually allow an operating system to interact with a CAS network directly. This will be as big an enabler of ILM as any yet seen, as applications will quickly take advantage of the new technology.
The shortcomings of HSM also highlight the limitations of ILM: For the foreseeable future, most corporate data will continue to lack at least one of the three elements listed here. But all isn't lost: Preparation for ILM will yield many benefits, even if the whole system can't be brought together. Let's start with alignment. Tiered storage can be considered a limited (and manual) form of ILM. Tiering requires grouping corporate data into a few categories and migrating it to different types of storage. It's neither automated nor granular, but tiering storage can yield cost savings and service-level improvements.
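A manual tiering exercise often starts as little more than a lookup table. As a sketch, assuming three hypothetical service classes and made-up application names, the grouping step might look like this:

```python
from collections import defaultdict

# Hypothetical service classes and the storage tier each one maps to.
TIER_FOR_CLASS = {
    "mission-critical": "tier 1: mirrored array, replicated off site",
    "business-important": "tier 2: midrange array, nightly backup",
    "reference": "tier 3: low-cost disk or tape archive",
}


def group_by_tier(classification):
    """Turn {application: service class} into {tier: [applications]}."""
    tiers = defaultdict(list)
    for application, service_class in sorted(classification.items()):
        tiers[TIER_FOR_CLASS[service_class]].append(application)
    return dict(tiers)


# Made-up classification, as it might come out of a business impact assessment.
classification = {
    "order-entry": "mission-critical",
    "e-mail": "business-important",
    "data-warehouse": "business-important",
    "scanned-invoices-2001": "reference",
}

plan = group_by_tier(classification)
for tier, applications in plan.items():
    print(tier, "->", ", ".join(applications))
```

The table is filled in by people, not software, and it works at the level of whole applications; those are the "manual" and "not granular" caveats in practice.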
Data classification is another hot topic related to ILM. Understanding data requirements is key to the ILM concept, and the process of identifying data, classifying it and developing appropriate protection policies will uncover the true value metrics needed by an automated ILM product.
In short, you need a tiered, service-focused infrastructure and a good understanding of data value before any future ILM product can do its job. But how do you get started?
Begin by taking a long hard look at your infrastructure. Are you ready to implement basic ILM-type functionality like "extender" archiving systems or HSM products? Do you have storage tiers that can accept different classes of data? Can you map storage from the arrays back to the application? Do you have an accounting of all of your data sources and a business impact assessment that defines the real value of the applications? All of these issues must be addressed before moving on with any real ILM initiative.
ILM is bigger than IT
There's one other massive limitation to ILM beyond the technical issues. Data management is a far bigger topic than any IT infrastructure group, let alone a storage infrastructure group, can handle. It's bigger, in fact, than IT itself. ILM is nothing less than a new paradigm for business-focused information services.
Think of this as an opportunity to bridge the chasm that divides IT from the business. Ask the business-side folks in your organization if they feel that IT is providing services that map to their needs, and if they're comfortable with the protection you're currently providing for their data. Use this as an opportunity to break the techie bonds that keep most IT professionals in the weeds of technical details.
ILM can be an opportunity to elevate all of IT and to bring it to the corporate table in a way it hasn't been since the 1960s. We're all businesspeople at heart, though our Dilbert mugs and Star Wars posters might suggest otherwise. So begin the conversation about ILM by focusing on what the business needs rather than on how to deliver it. Prepare your foundation with tiered storage, consolidation and a service-oriented approach, and you'll be ready when real ILM products hit the market.