Data classification benefits from a team effort, whether the exercise is tied to a storage tiering project to manage resources more effectively, an e-discovery initiative in response to a lawsuit or some other goal.
What follows is a list of best practices to help storage managers establish data classification strategies.
1. Identify the sources and types of data across your organization. Before classifying data and figuring out which information should be retained, backed up, archived or deleted, it helps to know the applications the organization uses, the value they represent, the types of data they produce and the access patterns.
2. Establish data categories before making technology decisions. What data constitutes a record, and do records need to be stored separately? Do intellectual property documents need to be grouped together? Determining the criteria by which you need to classify information will have an impact on a host of other decisions, so it pays to nail down categories first.
Also consider how deeply you need to classify the data. Will it be enough to organize it by document owner, creation date, file type or application? Or will the organization derive benefits from classifying data by multiple criteria, keywords, concepts or context?
3. Assess the technology that will best meet data requirements. Organizations doing e-discovery for litigation or regulatory purposes might need the rich classification capabilities of a specialized tool that can search or index content based on contextual meaning or keyword strings.
Those doing storage tiering, on the other hand, might not require such fine-grained data categorization, and decisions might rest on the performance and cost-effectiveness of different storage systems that can fulfill the needs of the different data or application classes.
4. Start small with data classification. Data classification is no small undertaking, so it helps to start with a manageable chunk of information. Tackling finance, legal or human resources as a first step will help to reduce the intimidation factor.
5. Set data retention policies for the different types of data. "There's no way in the world that you're going to want to keep data on your primary storage going on seven years or more. You just can't do it," said Christine Taylor, an analyst at Taneja Group in Hopkinton, Mass. "You've got to move it to long-term archive, whether that's on-site or off. And you can't do that effectively unless you have some method of classifying the data and then moving it properly."
It might also make sense to establish policies based on data type. MP3 or image files, for instance, might be subject to different rules than plain-text documents or email.
6. Audit the products. Business needs change, so it's helpful to do periodic monitoring to see if the data classification work is producing the desired results. This should be done at least annually, or more frequently if feasible.
7. Educate the business on data policies. Business units and employees, in general, should know what the data policies are and how they're enforced, especially if they're expected to classify certain types of data themselves. Some government bodies, for instance, establish policies about what type of email constitutes a record and, thus, needs to be saved. It's important for employees to receive the proper training to be able to save the messages in the proper location.
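To make the ideas in practices 2 and 5 concrete, here is a minimal Python sketch of shallow classification by file type combined with a per-category retention rule. Everything in it is an assumption for illustration: the category names, the extension mapping, the retention periods and the FileRecord, classify and placement names are hypothetical and not taken from any product or policy mentioned above. A real deployment would draw its categories and retention periods from the organization's own legal and business requirements.

```python
from dataclasses import dataclass
from datetime import date
from pathlib import PurePath


@dataclass(frozen=True)
class FileRecord:
    """Basic metadata for one file (hypothetical structure)."""
    path: str
    owner: str
    created: date


# Hypothetical mapping from file extension to data category.
CATEGORY_BY_EXTENSION = {
    ".mp3": "media",
    ".jpg": "media",
    ".png": "media",
    ".txt": "document",
    ".doc": "document",
    ".eml": "email",
}

# Hypothetical policy: days a category may stay on primary storage
# before it should move to long-term archive.
PRIMARY_DAYS = {"media": 90, "document": 365, "email": 180, "other": 365}


def classify(record: FileRecord) -> str:
    """Assign a category from the file extension alone."""
    extension = PurePath(record.path).suffix.lower()
    return CATEGORY_BY_EXTENSION.get(extension, "other")


def placement(record: FileRecord, today: date) -> str:
    """Return 'archive' once a file outlives its category's primary window."""
    age_days = (today - record.created).days
    return "archive" if age_days > PRIMARY_DAYS[classify(record)] else "primary"


if __name__ == "__main__":
    today = date(2024, 1, 1)
    records = [
        FileRecord("song.MP3", "alice", date(2023, 6, 1)),
        FileRecord("notes.txt", "bob", date(2023, 6, 1)),
    ]
    for record in records:
        print(record.path, classify(record), placement(record, today))
```

Classifying by extension alone is the shallow end of the spectrum described in practice 2. An e-discovery project would more likely need content-based or keyword-based classification, which this sketch does not attempt.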