This article can also be found in the Premium Editorial Download "Storage magazine: Big 3 backup apps adapt to disk."
Before buying technologies to improve data management, organizations should determine what data they create and formulate general information groups. Criteria should then be developed to identify which data belongs to which group. Finally, organizations should establish policies (with the associated enforcement) and management actions for each data group. These groups are information categories that take into account external influences such as retention and privacy regulations, information security risks and accessibility requirements. For example, the criteria for a "confidential financial information" category may be any Excel spreadsheet created by a senior executive or a finance department staff member. A policy for the category may establish a retention period of three years, with an associated action that says the file should be archived to immutable storage. Determining information categories, criteria, rules and actions is largely a manual process that should involve IT and the internal groups that establish corporate policies regarding information access and privacy, as well as regulatory compliance.
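To make the relationship between a category, its criteria, its policy and its action concrete, here is a minimal Python sketch of the "confidential financial information" example. The names `FileInfo`, `Category` and `is_confidential_financial`, the metadata fields and the file extensions are all assumptions made for illustration; they do not come from any particular product.

```python
from dataclasses import dataclass
from typing import Callable


@dataclass(frozen=True)
class FileInfo:
    """Hypothetical metadata record for one file."""
    name: str
    author_department: str
    author_title: str


@dataclass(frozen=True)
class Category:
    """An information category: criteria plus the policy and action it implies."""
    name: str
    criteria: Callable[[FileInfo], bool]
    retention_years: int
    action: str


def is_confidential_financial(f: FileInfo) -> bool:
    # Criteria from the example: an Excel spreadsheet created by a
    # senior executive or a finance department staff member.
    is_excel = f.name.lower().endswith((".xls", ".xlsx"))
    by_finance = f.author_department == "finance"
    by_senior_exec = f.author_title == "senior executive"
    return is_excel and (by_finance or by_senior_exec)


CONFIDENTIAL_FINANCIAL = Category(
    name="confidential financial information",
    criteria=is_confidential_financial,
    retention_years=3,                      # policy: keep for three years
    action="archive to immutable storage",  # associated management action
)

budget = FileInfo("budget.xls", "finance", "analyst")
memo = FileInfo("memo.doc", "finance", "analyst")
print(CONFIDENTIAL_FINANCIAL.criteria(budget))  # True
print(CONFIDENTIAL_FINANCIAL.criteria(memo))    # False
```

Keeping the criteria as a plain function separate from the policy values reflects the point made above: deciding the rules is a manual, organizational task, while applying them can later be handed to software.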
The second step in the information preparation process is data analysis. Data is scanned for criteria established during categorization. All sources of data, as well as new and historical data, can be scanned. Because this involves a vast amount of data, data analysis needs to be automated. This is the most crucial part
The culmination of the categorization and data analysis is information classification, which Enterprise Storage Group considers a defined market segment. Information classification products, such as those from Abrevity, Fast Search & Transfer, Kazeon Systems, Scentric and StoredIQ, perform the analysis and automatically add attributes to data when analysis results identify criteria matches with the information categories. The new attributes include the policies and associated management actions that should be taken with the data. For example, all of the file servers are scanned and analyzed, and all of the Excel files created by finance employees are tagged with a retention period and destined for immutable storage. The classification solution passes the data and its attributes to an information management application that's responsible for enforcing the retention and storage policies.
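The scan, tag and hand-off flow described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: `ScannedFile`, `RULES`, `classify` and `hand_off` are hypothetical names, the file records are made up, the three-year retention value is borrowed from the earlier category example, and the sketch does not represent how any of the vendors' products work.

```python
from dataclasses import dataclass, field


@dataclass
class ScannedFile:
    """Hypothetical record produced by scanning a file server."""
    path: str
    owner_department: str
    attributes: dict = field(default_factory=dict)


# Hypothetical rule table: (criteria predicate, attributes to attach on a match).
RULES = [
    (
        lambda f: f.path.lower().endswith((".xls", ".xlsx"))
        and f.owner_department == "finance",
        {"retention_years": 3, "target": "immutable storage"},
    ),
]


def classify(files):
    """Attach policy attributes to every file that matches a rule."""
    for f in files:
        for matches, attrs in RULES:
            if matches(f):
                f.attributes.update(attrs)
    return files


def hand_off(files):
    """Stand-in for passing tagged data to an information management application."""
    return [(f.path, f.attributes) for f in files if f.attributes]


scanned = [
    ScannedFile("/srv/finance/q1.xls", "finance"),
    ScannedFile("/srv/hr/handbook.doc", "hr"),
]
queue = hand_off(classify(scanned))
print(queue)
# [('/srv/finance/q1.xls', {'retention_years': 3, 'target': 'immutable storage'})]
```

The classification step only adds attributes; enforcement of retention and storage is left to whatever receives the queue, which mirrors the division of labor between the classification product and the information management application.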
This was first published in April 2006