Hopkinton, Mass. -- Archiving is becoming one of the biggest headaches in storage management today, mainly because there is no single, cohesive approach to archiving everything, according to customers at an EMC user group meeting at the company's headquarters on Tuesday.
About 20 EMC Documentum users gathered for the first-quarter 2007 meeting of the Northeast EMC CMA User Group, formerly known as EDUNE. The issue that came up most often was that companies have to knit together different products to build a complete archive for records management that also satisfies legal discovery requests.
He added that if there have to be separate archives, then the "architecture has to be fluid" to allow data to flow between them. "EMC is very good about this concept of storage at the physical layer, but there needs to be a tiered structure between the software layers as well." This same model must extend to all documents, not just email, he said.
His sentiments were echoed by another user in the insurance industry, who said that his company is pulling in 1.5 million emails a day and is struggling to figure out what to keep and where to store it.
He calculated that with the 40,000 PST files his company reluctantly stores, it would take 4,000 hours of productivity a day to correctly classify that data. "I don't think I can get approval for that," he joked. Furthermore, two people who receive the same email may classify it differently, which raises the question of where it belongs and how long it should be retained. Ultimately, he said he has decided that automated classification is the only way to go, even though this approach will have a margin of error, too. "We need federated policy management that knits together policies across all systems," he said.
Another user volunteered that his company had identified 300 different document classifications and that none of the classification tools came close to meeting this requirement today.
A spokesperson for EMC asked the audience whether they thought keyword classification on every single email that comes into an organization was a good idea, versus full-text indexing. The consensus among users seemed to be to start with high-level classifications and then get deeper into it as the software and users get more sophisticated. "Don't forget your classification system is only as good as the queries your people write," one user said.
Mergers and acquisitions are another factor driving users to figure out how to standardize on a single records management system. "We need to be able to bulk load 20 million documents a day into Documentum … there's no good options when company A using a common store and company B using Documentum comes together," another user said. "The repositories are completely different."
An EMC official said that Documentum 6.0 (D6), expected later this year, will address many of these issues. In the meantime, he advised users to consider EmailXtender for email that has a shorter life span and Documentum for longer-term records management. But it was clear from the users' comments that this is easier said than done.