In recent years, a few small companies in the storage market have focused their business on data classification, and some larger companies have launched their own data classification products. When information lifecycle management (ILM) was all the rage, data classification became a hot topic as a means of executing ILM and tiered storage.
But as the e-discovery process gets more complex, a dedicated tool for identifying data becomes less workable. Some companies in this space have been forced to close their doors, as Scentric Inc. did last September. Others have repositioned their data classification products as features of broader offerings focused on data indexing and search. EMC did so when it moved its Infoscape product into its security business unit last October, and Kazeon and StoredIQ have done so this year, each with a new focus on e-discovery.
"ESG never thought there was a data classification market," said analyst Brian Babineau of the Enterprise Strategy Group (ESG). "Data classification is the application of a specific technology, namely indexing and search. It was really the first such application of indexing and search outside of the consumer market. . .There are plenty of vendors that still allow customers to classify data. . .but rarely is classification the only way that a vendor applies the search and indexing technology."
Below is a list of data classification products, including the ones that use data classification for indexing and search.
Abrevity Inc.; FileData Classifier and FileData Manager
With FileData Classifier, once the discovery process has found the matching files, the next step is to label or "tag" those files, which may be dispersed across multiple locations in the enterprise. The tag becomes an extended attribute associated with the files, and one or more policies can be assigned to the tag. Classification allows enterprises to associate a business value with their data and create policies pertaining to that value. FileData Manager software enables federated control and enforcement of classification and data management policies.
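The tag-then-policy idea can be sketched in a few lines of Python. This is only an illustration of the general pattern, not Abrevity's implementation: the rule table, the tag names, the policy names and the `FileRecord` class are all hypothetical, and the `tags` field merely stands in for an extended attribute.

```python
from dataclasses import dataclass, field
from fnmatch import fnmatchcase


@dataclass
class FileRecord:
    path: str
    # Stand-in for an extended attribute holding the file's tags.
    tags: set = field(default_factory=set)


# Hypothetical discovery rules: filename pattern -> tag.
TAG_RULES = {
    "*.xls": "finance",
    "*contract*": "legal",
}

# Hypothetical policies assigned to each tag (a tag may carry several).
TAG_POLICIES = {
    "finance": ["retain-7-years", "tier-1-storage"],
    "legal": ["legal-hold"],
}


def tag_files(paths):
    """Tag every path whose lowercased name matches a discovery rule."""
    records = []
    for path in paths:
        record = FileRecord(path)
        for pattern, tag in TAG_RULES.items():
            if fnmatchcase(path.lower(), pattern):
                record.tags.add(tag)
        records.append(record)
    return records


def policies_for(record):
    """Collect the policies attached to a file's tags, without duplicates."""
    policies = []
    for tag in sorted(record.tags):
        for policy in TAG_POLICIES.get(tag, []):
            if policy not in policies:
                policies.append(policy)
    return policies
```

Because policies hang off the tag rather than the file, changing a retention rule means editing one table entry instead of touching every file that carries the tag.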
Arkivio Inc. (subsidiary of Rocket Software); Auto-Stor
Auto-Stor software enables IT users to organize and classify their data and storage resources based on their relative importance to the business in order to drive the appropriate data to an online archive. In January, Arkivio was acquired by Rocket Software.
Brocade Communications Systems Inc.; StorageX
Brocade StorageX is file virtualization software that allows users to set policies to automate data management functions. It supports data classification and reporting.
Brocade Communications Systems Inc.; File Lifecycle Manager (FLM)
File Lifecycle Manager (FLM) version 4.0 allows NetApp users to override policies and manually migrate files between devices. It also allows Active Directory groups, storage resource management and data classification tools to dictate file movement policies and track performance through Windows Performance Monitor.
CommVault Systems Inc.; Simpana software suite
The Data Classification Enabler feature of CommVault's Simpana backup and archiving software captures and tracks changes written to the Windows file system, and eliminates the need to perform a disk scan process to find file changes. This feature has a second use, which is to enable the Content Indexing of Windows file system data. This is necessary for keyword searches of that type of data. When used for this purpose, an additional component called the Online Content Indexer agent appears in the Unified Console for the client system.
FileTek Inc.; Trusted Edge Desktop Data Management Suite
The Trusted Edge Desktop Data Management Suite provides policy-driven control of unstructured data for documents and email at the desktop level. The software uses both auto-classification and user-classification methods.
IBM; OmniFind Discovery Edition
IBM's classification module for OmniFind Discovery Edition analyzes the content of documents and emails so they can be categorized. Users can bring more content under management without categorization tasks. A Classification Review Tool allows administrators and others to review automated actions and set the level of automation for their organization.
Index Engines Inc.; LAN Engine and SAN Engine
LAN Engine ingests network data using the Network Data Management Protocol (NDMP), so that indexing occurs at speeds up to 10 times faster than crawling. Data is quickly streamed to the LAN Engine and indexed without making a cache copy of the data, generating a footprint that is only 5% to 8% of the size of the original files. Following indexing, data can be immediately searched and classified.
SAN Engine supports direct indexing of data in popular backup formats. Whether data is backed up, replicated, snapped, archived or vaulted, indexing can be performed as a seamless extension of these processes. SAN Engine operates similarly to LAN Engine but performs the indexing in-stream.
Kazeon Systems Inc.; Information Server
Information Server enables companies to significantly shorten the time it takes to identify, collect, preserve and process electronically stored information (files and email) from various sources within the company, including individual desktops and laptops, branch office servers and data center resources (e.g., networked storage, email servers and archives), thereby curtailing unnecessary e-discovery costs. Kazeon's Information Access Platform software for topology mapping, custodian inventory, data classification and search helps companies reduce legal risk and minimize collection and preservation expense.
NextPage Inc.; Document Retention
NextPage applications are built on the company's Information Tracking Platform, which makes it practical to track and classify unstructured and unmanaged documents (whether stored on hard drives, in email or on scattered shared drives), enforce policies from retention to disposition to legal holds, and then monitor compliance across the organization.
Orchestria Corp.; Records Management
Orchestria's intelligent control layer identifies, classifies, optimally stores, retrieves and deletes messages that are entering the archive. The product's Smart Tagging adds metadata at the source, including data elements like departments, job titles, job functions and cost codes, for both senders and recipients. Attached metadata ensures proper routing to the appropriate repository and facilitates rapid retrieval.
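As a rough illustration of tagging at the source and routing on the attached metadata, consider the Python sketch below. It is not Orchestria's code or API: the directory lookup, the repository names and the routing rule (the first known department, sender before recipient, picks the repository) are all assumptions made for the example.

```python
from dataclasses import dataclass, field


@dataclass
class Message:
    sender: str
    recipient: str
    body: str
    metadata: dict = field(default_factory=dict)


# Hypothetical corporate directory used to look up metadata at the source.
DIRECTORY = {
    "alice@example.com": {"department": "Legal", "job_title": "Counsel"},
    "bob@example.com": {"department": "Sales", "job_title": "Account Manager"},
}

# Hypothetical routing table: department -> archive repository.
REPOSITORY_BY_DEPARTMENT = {
    "Legal": "legal-archive",
    "Finance": "finance-archive",
}
DEFAULT_REPOSITORY = "general-archive"


def tag_at_source(message):
    """Attach directory metadata for both the sender and the recipient."""
    parties = (("sender", message.sender), ("recipient", message.recipient))
    for role, address in parties:
        for key, value in DIRECTORY.get(address, {}).items():
            message.metadata[f"{role}_{key}"] = value
    return message


def route(message):
    """Pick a repository from the metadata, checking the sender first."""
    for role in ("sender", "recipient"):
        department = message.metadata.get(f"{role}_department")
        if department in REPOSITORY_BY_DEPARTMENT:
            return REPOSITORY_BY_DEPARTMENT[department]
    return DEFAULT_REPOSITORY
```

Because the metadata travels with the message, the routing step never has to query the directory again, and the same fields can later serve as search keys for retrieval.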
RSA Security Inc.; (EMC Tablus) Data Loss Prevention (DLP) Suite
RSA's DLP suite provides a policy-based approach to securing data, enabling customers to classify their sensitive data, discover that data across the enterprise, enforce controls, and report and audit to ensure compliance with policy. EMC's Infoscape data classification and management software, previously a storage-focused offering, has been folded into this product.
StoredIQ Inc.; Information Classification and Management Platform
StoredIQ's Information Classification and Management Platform provides information management solutions that automate the discovery, classification, protection and storage of unmanaged business information in files and email. The solution's discovery capabilities provide precise identification of files that have content deemed to require protection and management by the organization. This enables organizations to find, understand and manage the vast amounts of unmanaged information created in documents, spreadsheets, presentations and emails.
SearchStorage.com Assistant Editor Matt Perkins also contributed to this report.
This was first published in May 2008