This article can also be found in the Premium Editorial Download "Storage magazine: Storage salary survey: Are you being paid enough?"
How to proceed with ILM
First, let's clear the air: Information lifecycle management (ILM) is not a product--it's a process that manages data from its birth to disposition. Only once your company has an ILM process can you start to cobble together products that support it. (See "How to proceed with ILM.")
Storage vendors are scrambling to beef up their alliances and product lines to offer an integrated ILM solution. Storage Technology Corp. (StorageTek), for example, forged a partnership with Storability Software to more tightly integrate Storability's Global Storage Manager into StorageTek disk and tape drives. EMC Corp. acquired Legato Systems and Documentum to bolster its ILM offerings--and now the company faces the huge task of weaving together the technology it's acquired into its own solutions. Companies such as IBM Corp. and Hewlett-Packard Co. (HP) find themselves in positions similar to EMC, with a stable of disparate internal and acquired solutions that have limited or no integration between them. Veritas Corp. jumped on the ILM bandwagon when it released its Data Lifecycle Manager 5.0, a policy-driven data archive engine that works with Veritas' NetBackup and Backup Exec.
To date, no vendor offers a cradle-to-grave ILM product that easily solves all of an enterprise's compliance needs. Chris Van Wagoner, CommVault Systems' vice president of product marketing, says that the level of integration among most vendors' products is one step above brochure level.
Turning data into information
Steven Murphy, Fujitsu Softek's president and CEO, believes that the key to turning an organization's raw data into viable business information starts with gaining visibility into the existing environment in two ways. The first view is a picture of the different types of data stored on internal and external storage, including databases, files, e-mail and fixed-content storage. The second view gives you the ability to visualize, monitor and manage the devices in your existing storage networking infrastructure.
Products that support either of these two views typically get classified under the general category of storage resource management (SRM), even though the two types of tools gather and provide very different types of information. Tools that provide details on databases, files, e-mail and fixed content may more appropriately fall under the subcategory of storage reporting. The other tools, which visualize, monitor and manage the storage network devices--such as switches and storage arrays--may be more appropriately classified as storage infrastructure management software.
Of these two approaches, it's the tools that support the storage reporting component that initiate ILM. When they begin to classify and document their data, organizations begin the process of translating raw data into meaningful information.
Classifying the information
Reports on storage utilization enable the classification of the data on servers. The Enterprise Storage Group, Milford, MA, finds that data may be classified in at least four ways: data type, organization, data age and data value. (See "Classifying the data," on this page.) Breaking data out into these different categories helps educate the organization on the nature of the data it owns and provides the facts needed to build business cases for future storage and data automation management technologies. Here's where the tool's ability to recognize different data types becomes paramount.
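As a rough illustration of that breakdown, the sketch below totals bytes by data type, organization and age bucket. It is a minimal example under stated assumptions, not how any vendor's tool works: the FileRecord fields, the extension-to-type table and the age thresholds are all hypothetical, and data value is left out because it depends on business input that a scan can't supply.

```python
import os
from collections import defaultdict
from dataclasses import dataclass


@dataclass
class FileRecord:
    """One row of a storage report (hypothetical layout)."""
    path: str
    size_bytes: int
    age_days: int       # days since the file was last modified
    organization: str   # owning department, supplied by the reporting tool


# Hypothetical mapping from file extension to data type.
DATA_TYPES = {
    ".db": "database",
    ".mdb": "database",
    ".pst": "e-mail",
    ".doc": "file",
    ".xls": "file",
}


def data_type(path):
    """Classify a path by its extension; unknown extensions become 'other'."""
    ext = os.path.splitext(path)[1].lower()
    return DATA_TYPES.get(ext, "other")


def age_bucket(age_days):
    """Place an age in days into one of three illustrative buckets."""
    if age_days <= 90:
        return "0-90 days"
    if age_days <= 365:
        return "91-365 days"
    return "over 1 year"


def summarize(records):
    """Total bytes per (data type, organization, age bucket)."""
    totals = defaultdict(int)
    for rec in records:
        key = (data_type(rec.path), rec.organization, age_bucket(rec.age_days))
        totals[key] += rec.size_bytes
    return dict(totals)
```

Feeding summarize() the rows exported from a reporting tool gives the kind of per-category totals that can back a business case; the buckets and type table would need tuning to each environment.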
Users must have some fundamental understanding of the application data that they are gathering information on before they even begin to deploy a tool. Without this understanding, don't expect to deploy a tool enterprisewide and gather all of the data you need.
EMC hopes to help users solve one of their bigger issues--classifying unstructured data--through the company's recent acquisition of Documentum. Mark Lewis, EMC's executive vice president of open software, observes that 80% of existing data is unstructured and 90% of newly created data is digital. Yet until users classify their data, they can't take the next steps of protecting it, applying policies, moving it or deleting it.
This data classification step can unveil some important facts and save money at the same time. A recent case study conducted by DeepFile Corp. at Vignette Corp.--both located in Austin, TX--uncovered two important facts. It revealed that more than 50% of the files on Vignette's most expensive storage devices hadn't been accessed in more than a year and that a large amount of storage was being consumed because of file and directory duplication. Armed with this knowledge, Vignette purchased an inexpensive nearline ATA storage array and moved its older files to the lower-cost disk.
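The two findings in that case study, files untouched for more than a year and duplicated files, can be approximated with a simple scan. The following is a minimal sketch and not DeepFile's method: the function names are hypothetical, staleness is judged by the file system's last-access time (which some systems update lazily or not at all), and duplicates are detected by comparing SHA-256 digests of file contents.

```python
import hashlib
import os
import time
from collections import defaultdict

ONE_YEAR_SECONDS = 365 * 24 * 60 * 60


def iter_files(root):
    """Yield every regular, non-symlink file under root."""
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            path = os.path.join(dirpath, name)
            if os.path.isfile(path) and not os.path.islink(path):
                yield path


def stale_files(root, now=None, max_idle=ONE_YEAR_SECONDS):
    """Return files whose last-access time is older than max_idle seconds."""
    now = time.time() if now is None else now
    return [p for p in iter_files(root) if now - os.stat(p).st_atime > max_idle]


def file_digest(path, chunk_size=1 << 20):
    """Hash a file's contents in chunks so large files don't fill memory."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def duplicate_groups(root):
    """Return lists of paths whose contents are byte-for-byte identical."""
    by_digest = defaultdict(list)
    for path in iter_files(root):
        by_digest[file_digest(path)].append(path)
    return [sorted(paths) for paths in by_digest.values() if len(paths) > 1]
```

Run stale_files() before duplicate_groups(), because reading a file to hash it may refresh its access time. On a large file system, grouping files by size first and hashing only the size collisions would cut the scan time considerably.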
Another key feature for a storage reporting tool is its ability to interact with devices in the storage network. For instance, in networked storage environments, a logical-to-physical mapping that shows which physical storage device the data actually resides on is extremely helpful. EMC users should take a serious look at EMC's StorageScope, which reports how each server's files are mapped and laid out. Unfortunately, you have to spend some time configuring the tool, and the tool itself is particular about the environment it's deployed in.
Users also need to understand what sort of data they want to report on and manage. If users just want a generalized storage utilization reporting tool, almost any SRM tool will do. Yet those who want more advanced storage reporting capabilities--such as the ability to report on instant messaging files, data in tape silos or even existing paper files--will likely need different tools to get reports on all these types of data. For instance, Storability's GSM tool will give you detailed reports on the utilization of the tapes in StorageTek's tape libraries. But if you're using that tool to monitor tape libraries from IBM, don't expect the same level of detail.
Another item users will want to consider is who will control the verification and distribution of the information after it's gathered. Finding out that you have 10TB of unused EMC Symmetrix or HDS Lightning storage may sound great until the financial guys figure out that someone overspent to the tune of $1 million.
This was first published in December 2003