A step-by-step approach to data classification


This article can also be found in the Premium Editorial Download "Storage magazine: Five companies on their storage virtualization projects."


Engage business users
The most common shortcoming of a data classification project is the perception that it can be completed through technical analysis at the storage layer without engaging business users. Discovering and analyzing storage is part of the process, but good classification requires engaging business users or their representatives in IT. A number of tools can greatly assist in data classification through discovering, searching, mapping and summarizing, but projects that don't engage business users--IT application owners, legal groups and compliance officers--tend to get bogged down.

Four steps to data classification
While not all data classification projects examine the same data types, have the same scope or are used to create the same strategy, a successful approach will involve four common steps. Breaking the project into these steps--and completing each one--will improve your odds of success and lower your stress levels.

Step 1: Choose your target

  • Define your goal. Are you trying to resolve a specific pain point, reduce costs, create service levels or ensure compliance? Having a desired outcome will ensure that you collect the right information.
  • Determine the project scope. Should you classify a subset of your enterprise or is the whole environment achievable? Do you focus on a single app or file system, a department or the entire data center?
  • Set the level of classification: application data sets, file systems, files, business objects or messages.
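
One way to keep the Step 1 decisions explicit is to write them down in a small, machine-readable record before any data is collected. The Python sketch below is only an illustration: the ClassificationTarget class, its field names and the sample values are hypothetical and not part of any tool. The goal and level choices simply mirror the options listed in Step 1.

```python
from dataclasses import dataclass
from enum import Enum


class Goal(Enum):
    # Desired outcomes named in Step 1
    RESOLVE_PAIN_POINT = "resolve a specific pain point"
    REDUCE_COSTS = "reduce costs"
    CREATE_SERVICE_LEVELS = "create service levels"
    ENSURE_COMPLIANCE = "ensure compliance"


class Level(Enum):
    # Levels of classification named in Step 1
    APPLICATION_DATA_SET = "application data set"
    FILE_SYSTEM = "file system"
    FILE = "file"
    BUSINESS_OBJECT = "business object"
    MESSAGE = "message"


@dataclass(frozen=True)
class ClassificationTarget:
    """Hypothetical record of the three Step 1 decisions."""
    goal: Goal
    scope: str  # free text, e.g. a single app, a department or the data center
    level: Level

    def summary(self) -> str:
        return f"Goal: {self.goal.value}; scope: {self.scope}; level: {self.level.value}"


# Example values are made up for illustration
target = ClassificationTarget(
    Goal.REDUCE_COSTS, "finance department file shares", Level.FILE_SYSTEM
)
print(target.summary())
```

Writing the target down this way makes it harder for the scope to drift once data collection begins.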

Step 2: Map an approach and appropriate toolset

  • Determine the metrics you'll collect. This is the data that will drive your strategy and enable you to think outside the technology stack all the way to the end users in your business groups (see "Engage business users," above).
  • Define your data sources. These include existing classification and tiering studies, spreadsheets, resource management reports, organization charts, legal policies, DR plans, etc.
  • Determine which new or existing tools you need. This includes storage vendor tools, discovery engines, ad hoc scripts and database queries. (An upcoming article in this series on data classification will describe some of the tools that can make the process easier and more automated.)
  • Determine the percentage of completion needed to set a strategy. In general, the last 20% of information won't be worth the effort. Can you live with a 60% sampling and still accomplish your goals? Check your progress periodically and decide whether it makes sense to continue collecting data.
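
The kind of ad hoc script mentioned in Step 2, together with the periodic progress check, can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not a replacement for a discovery tool: the age buckets (90 and 365 days), the 60% default target and all function names are hypothetical choices, and "completion" is assumed here to mean the share of known capacity that has been analyzed so far.

```python
import os
import time
from collections import Counter
from typing import Optional

DAY = 86400  # seconds per day


def age_bucket(age_days: float) -> str:
    """Assign a file to an age bucket; the 90/365-day cutoffs are assumptions."""
    if age_days < 90:
        return "under 90 days"
    if age_days < 365:
        return "90 days to 1 year"
    return "over 1 year"


def collect_metrics(root: str, now: Optional[float] = None) -> Counter:
    """Walk a directory tree and total file bytes per last-modified age bucket."""
    now = time.time() if now is None else now
    totals: Counter = Counter()
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            path = os.path.join(dirpath, name)
            try:
                info = os.stat(path)
            except OSError:
                continue  # skip files that vanish or cannot be read
            totals[age_bucket((now - info.st_mtime) / DAY)] += info.st_size
    return totals


def completion(analyzed_bytes: int, known_bytes: int) -> float:
    """Share of known capacity analyzed so far, as a percentage."""
    return 100.0 * analyzed_bytes / known_bytes if known_bytes else 0.0


def enough_to_set_strategy(
    analyzed_bytes: int, known_bytes: int, target_pct: float = 60.0
) -> bool:
    """True once the analyzed share reaches the chosen target (60% is an example)."""
    return completion(analyzed_bytes, known_bytes) >= target_pct


if __name__ == "__main__":
    # Summarize the current directory as a stand-in for a real file system
    for bucket, size in collect_metrics(".").most_common():
        print(f"{bucket}: {size} bytes")
```

Last-modified time is used here only because it is cheap to collect. The metrics that matter for a given project depend on the goal set in Step 1 and on what the business users say about the data.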

This was first published in August 2006
