Hewlett-Packard (HP) Co. is tapping the cloud to offer data classification Software as a Service (SaaS) that builds taxonomies for records management. The Web-based service, called Taxonom, is in private beta testing and will likely be generally available later this year.
Taxonom helps IT administrators and others such as content management pros and corporate librarians quickly build relevant taxonomies for the enterprise information they commonly deal with. The program can cut the time to create such taxonomies from months to days.
Companies often pay consultants to build custom taxonomies with a corresponding catalog used to provide a conceptual framework for information retrieval. Taxonom is designed to generate the taxonomy on the Web, saving both time and money, according to Jean-Luc Chatelain, chief technology officer, information optimization, HP Software & Solutions, Technology Solutions Group.
To use the service, organizations describe their business domain, and the service assembles a standard taxonomy for that domain from a knowledge base that HP has compiled from various sources and stores in the cloud.
"On the roadmap is the ability to have people feed their own taxonomies into the knowledge base," Chatelain said. The taxonomy would also be built using an ontology, a working model of entities and the interactions between them.
Chatelain said if a customer entered "dentistry," a taxonomy might be generated containing the terms "hygienist," "filling," "drill" and "patient." Ontology organizes the taxonomy according to the relationship between those concepts – the dentist hires a hygienist and uses a drill to give a patient a filling. Ontology can also be used to distinguish between different definitions of the same concept, such as whether "Jaguar" refers to a car or an animal.
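The distinction between a flat list of terms and an ontology can be sketched in a few lines of Python. This is only an illustration of the idea Chatelain describes, not HP's implementation: the `Ontology` class, the relation names, the `disambiguate` helper and the context words listed for each sense of "jaguar" are all assumptions made up for the example.

```python
from dataclasses import dataclass, field


@dataclass
class Ontology:
    """Hypothetical store of (subject, relation, object) triples."""
    triples: set = field(default_factory=set)

    def add(self, subject, relation, obj):
        self.triples.add((subject, relation, obj))

    def related(self, term):
        """Return every term directly linked to `term`, in either direction."""
        linked = set()
        for subject, _, obj in self.triples:
            if subject == term:
                linked.add(obj)
            elif obj == term:
                linked.add(subject)
        return linked


# Relationships paraphrased from the dentistry example in the article.
dentistry = Ontology()
dentistry.add("dentist", "hires", "hygienist")
dentistry.add("dentist", "uses", "drill")
dentistry.add("dentist", "gives", "filling")
dentistry.add("patient", "receives", "filling")

# Assumed context words for each sense of an ambiguous term.
JAGUAR_SENSES = {
    "car": {"engine", "dealer", "sedan"},
    "animal": {"jungle", "prey", "cat"},
}


def disambiguate(senses, context):
    """Pick the sense whose related words overlap most with the context."""
    words = set(context.lower().split())
    scores = {sense: len(words & related) for sense, related in senses.items()}
    best = max(scores, key=scores.get)
    # No overlap at all means the context gives no basis for a choice.
    return best if scores[best] > 0 else None


print(sorted(dentistry.related("dentist")))
print(disambiguate(JAGUAR_SENSES, "the jaguar stalked its prey in the jungle"))
```

A production ontology would carry typed relations and far richer context than a word-overlap count, but the sketch shows why relationships between terms, and not just the terms themselves, are what let a tool tell one "Jaguar" from another.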
There are already data classification and indexing tools, such as HP's TRIM enterprise content management and Integrated Archive Platform (IAP) appliance, that can ingest taxonomies and use them to classify data. A Taxonom user would still need such a tool to complete the data classification process.
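As a rough illustration of what "ingesting a taxonomy to classify data" can mean, the sketch below scores a document against each category by counting taxonomy terms that appear in it. It is a deliberately naive keyword matcher, not how TRIM or IAP work; the `classify` function, the `TAXONOMY` structure and the "billing" category are hypothetical, and only the dentistry terms come from the article.

```python
import re
from collections import Counter

# Hypothetical taxonomy: category name -> terms that signal it.
TAXONOMY = {
    "dentistry": {"hygienist", "filling", "drill", "patient"},
    "billing": {"invoice", "payment"},  # assumed category for contrast
}


def classify(text, taxonomy):
    """Return matching categories, strongest match first."""
    tokens = Counter(re.findall(r"[a-z]+", text.lower()))
    scores = {
        category: sum(tokens[term] for term in terms)
        for category, terms in taxonomy.items()
    }
    matches = [category for category, score in scores.items() if score > 0]
    # Sort by descending score, then alphabetically for a stable order.
    return sorted(matches, key=lambda category: (-scores[category], category))


print(classify("The hygienist prepared the patient for a filling.", TAXONOMY))
```

The point of the sketch is the division of labor: the taxonomy supplies the vocabulary, and a separate classification tool applies it to the documents.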
"Most companies have very complex records management policies written by lawyers," Chatelain said. "Taxonom could feed TRIM to ultimately get the policy together with the data, building toward the Holy Grail, which is an automated records management solution."
IAP customer Iain Liddell, policy development manager at Brunel University in West London, UK, is beginning to add ontology and data classification to other parts of his infrastructure, such as email filtering. Liddell said he would welcome the ability to add those capabilities to IAP. He said taxonomies developed in-house can take so long to build that they are no longer relevant once finished, especially at an educational institution. "If we put something together in August or September, by the time January or February rolled around, we'd be six months behind the curve," he said.
HP's Chatelain said users will be able to update the taxonomy as many times as they want. Although Hewlett-Packard hasn't settled on a pricing model for the service, it's possible the model would consist of a subscription for frequent updaters and a fee per taxonomy file for those who use it sparingly.
Chatelain said HP has no current plans to integrate Taxonom with other products.
Brian Babineau, a senior analyst at Milford, Mass.-based Enterprise Strategy Group, said data classification and indexing tools like those offered by Autonomy Corp., FAST and Attivio Inc. can generate taxonomies for a user. "The issue with auto-generated taxonomies is that they have no context, i.e., what business you are in," he said.
"If you know what kind of action you're going to take after you organize your data, building your own [through a service like Taxonom] makes a ton of sense. If I am going to identify and manage records, I build my taxonomy to help me find records," Babineau explained. "If I have no idea what I'm going to do once I know more about my data, an auto-generated one may tell me 'You have a ton of forms with patients' medical records -- maybe you should do something about it.'"
While Chatelain said Taxonom won't be exclusive to IAP and TRIM, the vendor is looking to prop up those products after falling behind in the content management and archiving space with its prior archiving platform, the Reference Information Storage System (RISS). HP overhauled RISS and renamed it IAP in 2007, but has yet to catch up with market leaders in this space.
"HP needs to build awareness around their information management solutions," Babineau said. "Not many customers put them on their shortlist for email archiving or content management."