Data classification: An overview

By Stephen J. Bigelow
02 Nov 2005 | SearchStorage.com

An enterprise must grapple with many different kinds of data, including financial reports, personnel files, research results and customer records. The problem is that most companies do not correlate information with their business processes, which often results in poor utilization of storage resources: companies waste valuable capacity by retaining all data on expensive, high-performance systems. Even worse, the business may not be positioned to address compliance audits or legal discovery challenges simply because it cannot find the data in question, which places the company in an extremely vulnerable position.

The practice of data classification seeks to overcome these weaknesses by aligning information with business needs, categorizing the data based on those needs and then using the resulting classifications as a roadmap for retaining and storing information. This is a fundamental underpinning of information lifecycle management. "Data classification is a methodology to align business requirements to infrastructure, so that infrastructure service delivery properly supports data storage and management," says John Merryman, senior consultant with GlassHouse Technologies Inc. Greg Schulz, senior analyst at Evaluator Group, puts it even more simply: "Data classification organizes data so that IT can manage it."

Understanding is a prerequisite to success

A successful implementation requires a solid understanding of why data classification is needed in the first place. Data classification is a serious endeavor, so companies must first address the question of "why do it?" The driving factors usually involve risk mitigation. For example, some companies may be concerned about passing compliance audits under regulations such as the Health Insurance Portability and Accountability Act or the Sarbanes-Oxley Act, or about meeting other forms of corporate governance (e.g., a life science company may be concerned with particular test data for the Food and Drug Administration). Other companies may wish to ensure adequate response times in the face of legal discovery challenges. Still other companies may focus on more tangible objectives, such as increasing the availability of important information for end users, improving the responsiveness of storage resources or saving money by shifting less critical information to secondary storage solutions (a.k.a. tiered storage).

Once an organization understands "why" data classification needs to be implemented, the real work of classification can begin. The classification process can be long and involved (depending on the size and scope of each organization), and it is largely a manual process. While there is software available to help discover and evaluate information assets, there is no practical tool that can tell you what information is worth to your business. Each company must derive that answer itself. "It's a very high-tech concept that starts in the very lowest tech imaginable," says Steve Duplessie, senior analyst at the Enterprise Strategy Group. "It starts with a piece of paper and a whiteboard, and two people having a discussion." Eventually, this collaborative effort extends to every key area of the company. Michael Peterson, program director of the Storage Networking Industry Association's Data Management Forum, says that it's really a team effort. "The team usually consists of IT, information management, information security, finance, business and legal," he says. Additional corporate departments may also become involved at some point during the classification process. Peterson notes that data is often classified by application, by company group (such as finance or manufacturing), by metadata or by type, though the actual categorizations depend on the specific needs of each particular business.

Ultimately, the trick is to establish a manageable set of classifications that can suit your entire organization. Perhaps the most tangible advice shared among analysts and vendors is to approach data classification in small pieces. Peterson suggests starting with specific data types, such as backups or e-mail. "Start building out little islands. Get practice making the system work and getting policies in place," Peterson says. The next hurdle is to build interest within other areas of the organization. "You need to show that you're succeeding. You need to have some important wins along the way, so the very first place that you can see some wins is by removing cost," he says. He cites the move to tiered storage as one major cost saving that companies can easily measure. If the concepts of data classification simply overwhelm your organization, turn to consultants and professional services to help jump-start your process. Again, they cannot determine your data's true value, but they can ask the meaningful questions that will get your effort started.

No substitute for the human touch

Unlike many emerging IT developments, data classification is almost entirely a human decision-making exercise. "Tools can help in the automation and enforcement of classification policies, but 'they' don't classify -- you can't eliminate the human thought process," Duplessie says. While there are certainly software and hardware products available to help discover your data within the enterprise, determine its location, set policies on that data and measure the adherence to those policies, no product has the intelligence needed to determine the "value" of your corporate data. No software can possibly know that losing a certain file may result in an indictment of the chief financial officer.

Still, the tools are evolving to supplement and automate data classification and related tasks -- such as data migration or retention -- though such tools perform very limited and specific functions. "Those [tools] are typically server-based applications that have an interface and [pull] metadata from client environments -- not unlike a backup or storage resource management product, only with more detail about the data," Merryman says. But such products can only work with data based on definitions derived during classification discussions. "It's all about understanding the organization and its requirements, and the only way to really get that is through 'people' and through 'process'," he says. See "The vendors" later in this article for more details on emerging software products. No specific hardware devices are needed to support data classification at this time, but storage subsystems can add an indirect benefit. As one example, a tiered storage system may indirectly support data classification by influencing the cost of storage -- allowing less valuable data to reside on less expensive disks.
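
To make the division of labor concrete, the following is a minimal sketch in Python of how a tool might enforce a policy that people have already agreed on. It is not taken from any product mentioned in this article. The record fields, department names, age thresholds and tier labels are all hypothetical; in practice they would come out of the classification discussions described above.

```python
from dataclasses import dataclass
from datetime import date


@dataclass
class FileRecord:
    """Metadata a discovery tool might report for one file (hypothetical fields)."""
    path: str
    department: str       # e.g. "finance" or "research"
    last_accessed: date


# Hypothetical policy written down by the classification team, not derived by software.
# Each rule is (department, maximum days since last access, storage tier).
POLICY = [
    ("finance", 365, "tier1"),
    ("research", 90, "tier1"),
]
DEFAULT_TIER = "tier2"  # anything no rule claims goes to the less expensive disks


def assign_tier(record: FileRecord, today: date) -> str:
    """Return the tier named by the first policy rule that matches the record."""
    age_days = (today - record.last_accessed).days
    for department, max_age_days, tier in POLICY:
        if record.department == department and age_days <= max_age_days:
            return tier
    return DEFAULT_TIER


if __name__ == "__main__":
    today = date(2005, 11, 2)
    records = [
        FileRecord("/finance/q3_report.xls", "finance", date(2005, 9, 30)),
        FileRecord("/research/old_trial.csv", "research", date(2004, 1, 15)),
    ]
    for record in records:
        print(record.path, "->", assign_tier(record, today))
```

The code only applies the rules it is given. Deciding that finance data stays on the first tier for a year, or that anything else belongs on cheaper disks, remains a business decision that the software cannot make.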

Go to the next part of this article: Data classification: Strengths and weaknesses

Or skip to the section of interest:

  • Introduction
  • Data classification: An overview
  • Data classification: Strengths and weaknesses
  • Data classification: The vendors
  • Data classification: User perspectives
  • Data classification: Future directions

