Legal toolkit for storage systems

Storage managers may be reluctant to admit it, but they and the storage systems they manage are key players in most companies' compliance and legal readiness procedures. While ediscovery is the current buzzword, no all-encompassing ediscovery tool is on the market yet. But you can assemble an effective toolkit from some of the point products that are available now.

More than 300 vendors touted their ediscovery tools at the LegalTech conference in New York City in early February, and visitors were hard-pressed to sort them out. Even analyst firm Gartner Inc. appears daunted by the prospect of wading through this abundance of specialty software, services and outsourcing. "There are over 1,000 vendors who label themselves ediscovery," says Debra Logan, a Gartner research VP and author of "Market-Scope for E-Discovery and Litigation Support Vendors, 2007."

Vendors are besieging storage managers with tools that promise their version of a get-out-of-jail-free card. While storage managers won't actually face jail for data management shortcomings--the new rules apply to civil litigation--they may be hauled into court to testify about how data is stored in their organization. With Logan's Gartner ediscovery Magic Quadrant still a work in progress and only the "eDiscovery Vendor Landscape" report from Cambridge, MA-based Forrester Research Inc. to go on, corporate IT is on its own when trying to determine which ediscovery products best fit its storage environment and needs.

Don't expect the big platform vendors to come up with a comprehensive solution any time soon. EMC Corp., Hewlett-Packard (HP) Co., IBM Corp., Microsoft Corp., Oracle Corp. and Sun Microsystems Inc. are sticking their toes into the ediscovery market, but are mainly hunting for acquisitions. For example, EMC and IBM purchased enterprise content management tool vendors Documentum and FileNet, respectively. And Microsoft acquired enterprise search product vendor FAST Search & Transfer ASA. None of these products, however, represents more than a piece of the ediscovery toolset.

"I don't know if there will be one end-to-end solution. Ediscovery is really broad and heterogeneous," says Vivian Tero, program manager, compliance infrastructure at Framingham, MA-based IDC.

The best place to start the tool selection process, analysts and vendors agree, is with the Electronic Discovery Reference Model (EDRM).

Inside EDRM
EDRM was initially conceived by George Socha Jr., founder of Socha Consulting LLC in St. Paul, MN, and Tom Gelbmann, managing director of Gelbmann & Associates in Roseville, MN. The reference model divides the ediscovery process into six areas--information management, identification, preservation/collection, processing/review/analysis, production and presentation--and identifies the functions associated with each area. Storage is typically involved in the earlier stages of the model.

"If you don't have experience in ediscovery, EDRM is useful. It is good for showing what the issues are," says Mark Brennan, counsel at Bryan Cave LLP in Kansas City.

"For an IT person faced with finding ediscovery tools, the first thing I would do is take the EDRM diagram and go talk with your legal counsel," says Matthew Todd, CISO and VP of risk and technical operations at Palo Alto, CA-based Financial Engines Inc. The legal counsel should tell you which functions the IT group should do in-house. Then you can start looking at tools.

Where EDRM breaks down is with naming tools. "It's very useful, except it doesn't name tools," says Lance Rea, CIO at New York City law firm Davis & Gilbert LLP. Actual products tend to overlap EDRM areas. For example, an archiving tool from the information management area will also perform search, making it useful for identification. Rea performs ediscovery searches with Microsoft Outlook and the firm's archiving tool, EMC EmailXtender.

The areas of ediscovery that involve storage the most include information management, identification and preservation/collection. Information management covers those things the storage group already handles, such as content management, records management, document management, CRM, email archiving and data archiving.

"Even with all these systems, it's hard to get to unified content management, which is where you ideally want to be. It takes several years to get to that point, and you still face fire drills when an ediscovery request comes in," says Jason Priebe, of counsel at the Chicago-based law firm Seyfarth Shaw LLP, and formerly the ediscovery practice manager at a major property and casualty insurance company. "Even with tools like Oracle Content Server or Documentum, it still takes years to sweep up all the unstructured data," he adds.

Just getting the data unified and into one place isn't enough. Every company needs a data-retention policy it actively follows and enforces. "The best thing the storage people can do is to support the email retention policy," says Jack Hupper, CIO at the New York City Law Department. In terms of tools, that means deploying email archiving applications, such as Symantec Corp.'s Enterprise Vault, and tools for classification and categorization like those from Autonomy (which owns Zantaz Inc.) or Recommind Inc.

Identification is the process of determining the scope, breadth and depth of the electronic information that might be needed in litigation. You should start by identifying the largest pool of potentially relevant data, which requires search tools. For the purposes of identification, the search tools don't even have to be that good. "You could use Outlook or even Google," says Rea at Davis & Gilbert. At subsequent stages, when culling the data to zero in on the exact documents you need to turn over, you'll need more sophisticated search tools like Kazeon Systems Inc.'s Kazeon Information Server or even forensic search tools like Guidance Software Inc.'s EnCase suite of products--but that comes later and probably needs to be handled by lawyers. If the amount of data to be searched is vast, many companies call in outsourcing vendors.

Legal holds
Preservation/Collection is the next and probably last piece that directly involves the storage team. Preservation is the process of protecting data that may be needed in the litigation, and there are big penalties for failing to do this right. For this step, IT needs a legal hold tool. PSS Systems is widely recognized as the leading legal holds vendor, while Exterro Inc. is a newcomer in this area. But any product that can enforce policy against stored data could work. Ideally, the legal hold tool should identify and preserve both centrally managed documents and distributed documents, including those residing on desktops and laptops.

A key part of the legal hold process is communication; the tool should therefore assist in alerting people of the hold and monitoring compliance (see "Ediscovery readiness assessment," below). This means immediately notifying custodians and anyone else in possession of relevant data. The tool also needs to track responses acknowledging the hold, generate hold reminders (usually every 60 or 90 days) and then track reminder acknowledgments. Along the way, the tool should document every message sent and each acknowledgment. The amount of messaging quickly adds up.
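The bookkeeping a legal hold tool has to do can be sketched in a few lines of Python. This is a minimal illustration only: the class and field names (HoldNotice, LegalHold, REMINDER_INTERVAL_DAYS) are hypothetical and don't correspond to any vendor's product, and a real tool would also send the messages and store the log durably.

```python
from dataclasses import dataclass, field
from datetime import date, timedelta
from typing import Optional

# Assumption: reminders go out every 90 days (the usual range is 60 to 90).
REMINDER_INTERVAL_DAYS = 90


@dataclass
class HoldNotice:
    custodian: str
    sent_on: date
    kind: str  # "initial" or "reminder"
    acknowledged_on: Optional[date] = None


@dataclass
class LegalHold:
    matter: str
    notices: list = field(default_factory=list)  # every message ever sent

    def notify(self, custodians, today):
        """Record the initial hold notice for each custodian."""
        for custodian in custodians:
            self.notices.append(HoldNotice(custodian, today, "initial"))

    def acknowledge(self, custodian, today):
        """Mark the custodian's most recent unacknowledged notice as acknowledged."""
        for notice in reversed(self.notices):
            if notice.custodian == custodian and notice.acknowledged_on is None:
                notice.acknowledged_on = today
                return True
        return False

    def unacknowledged(self):
        """Custodians with at least one notice still awaiting acknowledgment."""
        return sorted({n.custodian for n in self.notices if n.acknowledged_on is None})

    def send_due_reminders(self, today):
        """Log a reminder for anyone whose last message is older than the interval."""
        latest = {}
        for notice in self.notices:
            if notice.custodian not in latest or notice.sent_on > latest[notice.custodian]:
                latest[notice.custodian] = notice.sent_on
        interval = timedelta(days=REMINDER_INTERVAL_DAYS)
        due = sorted(c for c, sent in latest.items() if today - sent >= interval)
        for custodian in due:
            self.notices.append(HoldNotice(custodian, today, "reminder"))
        return due
```

Because nothing is ever deleted from the notices list, it doubles as the documented record of every message and acknowledgment.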

For example, a company with 200 active legal matters, each involving 50 employees, must send 10,000 initial hold notices. At least three reminders are usually sent to each involved employee during the time of the legal hold. In addition, each message must be acknowledged by the recipient, which drives the number of emails even higher. The scale of this effort can be enormous, as some Fortune 1000 companies may have several thousand open legal matters.
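To see how quickly the numbers climb, the arithmetic can be worked through directly. The sketch below assumes exactly three reminders per employee and exactly one acknowledgment per message; real counts will vary with the length of the hold and how often people need chasing.

```python
def hold_message_volume(matters, custodians_per_matter, reminders=3):
    """Estimate legal hold email traffic under the stated assumptions."""
    initial = matters * custodians_per_matter   # one notice per employee per matter
    reminder_msgs = initial * reminders         # assumption: three reminders each
    outbound = initial + reminder_msgs
    acknowledgments = outbound                  # assumption: every message is acknowledged once
    return {
        "initial": initial,
        "reminders": reminder_msgs,
        "acknowledgments": acknowledgments,
        "total": outbound + acknowledgments,
    }


volume = hold_message_volume(matters=200, custodians_per_matter=50)
for label, count in volume.items():
    print(f"{label:>15}: {count:,}")
```

Under these assumptions, the 10,000 initial notices grow to 40,000 outbound messages and 80,000 messages in total once acknowledgments are counted.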

Collection is the process of gathering all of the documents you've preserved and putting them into a form that can be delivered to the attorneys. "Electronic information should be collected in a manner that's comprehensive, maintains its content integrity and preserves its form," says consultant Socha. "Increasingly, meta data is required to be collected and maintained during this process, and information regarding the chain of custody and authentication is required."
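One common way to demonstrate content integrity is to record a cryptographic hash and basic file meta data at collection time, then recompute the hash later. The following is a rough sketch using only the Python standard library; the function names and manifest fields are illustrative assumptions, not a description of how any of the collection products work.

```python
import hashlib
import os
from datetime import datetime, timezone


def manifest_entry(path):
    """Capture size, last-modified time and a SHA-256 digest for one file."""
    info = os.stat(path)  # read meta data before touching the content
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(65536), b""):
            digest.update(chunk)
    return {
        "path": os.path.abspath(path),
        "size_bytes": info.st_size,
        "modified_utc": datetime.fromtimestamp(info.st_mtime, tz=timezone.utc).isoformat(),
        "sha256": digest.hexdigest(),
    }


def verify(entry):
    """Return True if the file's current content still matches the recorded digest."""
    return manifest_entry(entry["path"])["sha256"] == entry["sha256"]
```

Note that simply opening a file can update its last-accessed time on some file systems, so a script like this isn't a substitute for forensic collection tools.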

Failure to collect the data correctly can lead to big fines or even blow the lawsuit. When storage people are called into court to testify, it's usually to answer questions about how the data was managed, preserved and collected. Among the many vendors offering tools in this area are CommVault, Electronic Evidence Discovery (EED) Inc., Guidance Software and MetaLincs (a Seagate company). Beyond this point in the EDRM, the storage staff can generally step aside. Even if the company is handling more of the process in-house, the litigation team takes over.

PSS Systems takes a slightly different view of the ediscovery landscape. It divides the process into two parts: activities handled in-house with the involvement of IT and activities turned over to outside lawyers.

IT focuses mainly on generating, storing, collecting, identifying and exporting documents, while outside lawyers handle detailed search, review and production.

Tool selection
The ediscovery market is getting big fast. IDC reports that $12 billion was spent acquiring ediscovery tools in 2007, and predicts that figure will rise to $21.8 billion by 2011. With outsourcing and licensed software available, the pricing and packaging of tools vary widely, from cost per GB to cost per mailbox to cost per seat. Depending on the services involved, ediscovery tools run anywhere from $3,000/GB to $30,000/GB. Pricing for enterprise licenses starts around $200,000 and climbs into the millions.
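A quick back-of-the-envelope calculation puts these figures in perspective. The sketch below uses only the numbers quoted above and deliberately ignores everything else (maintenance, staff time, hardware), so treat the results as rough orientation rather than a buying guide.

```python
def breakeven_gb(license_cost, price_per_gb):
    """Data volume at which per-GB services cost as much as a license."""
    return license_cost / price_per_gb


def cagr(start, end, years):
    """Compound annual growth rate between two values."""
    return (end / start) ** (1 / years) - 1


LICENSE_FLOOR = 200_000  # entry-level enterprise license price cited above

for rate in (3_000, 30_000):
    gb = breakeven_gb(LICENSE_FLOOR, rate)
    print(f"At ${rate:,}/GB, the license floor equals about {gb:.1f} GB of per-GB services")

# IDC figures: $12 billion in 2007 rising to $21.8 billion in 2011 (four years)
print(f"Implied annual market growth: {cagr(12.0, 21.8, 4):.1%}")
```

At the low end of per-GB pricing, the entry-level license pays for itself after roughly 67GB; at the high end, after less than 7GB. The IDC forecast implies growth of about 16% per year.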

Tool acquisition begins with information management and especially retention management (see "A sampling of ediscovery products," (PDF) below). "You need more than just content management," says Hupper at the New York City Law Department. "The tool needs to manage retention. You need a way to tell the system the retention period for a document is up and [to] get rid of it."

Hupper's department uses Interwoven Inc.'s WorkSite, which is popular with the legal industry. The tool stores most kinds of common electronic documents--Office documents, WordPerfect, PDFs, Visio and email. Most importantly, "we can tell the system when the retention period is up and it takes care of it," says Hupper. Retention management requires the ability to set and enforce policy at the individual document level if need be.

A company can use almost any archiving tool to set specific retention periods for documents and files, and then tie these retention periods to data storage or data destruction policies. At a minimum, the tool will need to be able to store and search PST files (Microsoft Outlook email files).
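The logic of tying retention periods to destruction policy, with a legal hold overriding both, might look something like the sketch below. The document classes, retention periods and function names are all hypothetical; actual periods should come from legal counsel and the company's written policy.

```python
from dataclasses import dataclass
from datetime import date, timedelta

# Hypothetical retention periods in days, keyed by document class.
RETENTION_DAYS = {
    "email": 3 * 365,
    "contract": 7 * 365,
}


@dataclass
class Document:
    doc_id: str
    doc_class: str
    created: date
    on_legal_hold: bool = False


def disposition(doc, today):
    """Decide what the retention policy says to do with one document."""
    if doc.on_legal_hold:
        return "preserve"            # a legal hold always overrides destruction
    period = RETENTION_DAYS.get(doc.doc_class)
    if period is None:
        return "review"              # no policy defined: flag for a human decision
    expires = doc.created + timedelta(days=period)
    return "destroy" if today >= expires else "retain"
```

Checking the hold flag first is the important design choice: an expired retention period must never cause the destruction of a document that's subject to a hold.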

Search tools are what most IT people think of when they hear ediscovery. More advanced search tools apply intelligence to perform contextual search based on the content of the document, dive deep into the meta data or look for behavioral patterns. Some will handle deduplication and near deduplication, in which small differences in otherwise nearly identical documents are highlighted.
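The difference between deduplication and near deduplication can be illustrated with Python's standard library. This is a toy sketch, not how commercial products work: it assumes plain-text documents held in memory, uses a content hash for exact matches and difflib's similarity ratio for near matches, and compares every pair, which wouldn't scale to a real document collection. The 0.9 threshold is an arbitrary assumption.

```python
import difflib
import hashlib


def exact_duplicates(docs):
    """Group document IDs whose content hashes are identical."""
    groups = {}
    for doc_id, text in docs.items():
        digest = hashlib.sha256(text.encode("utf-8")).hexdigest()
        groups.setdefault(digest, []).append(doc_id)
    return [ids for ids in groups.values() if len(ids) > 1]


def near_duplicates(docs, threshold=0.9):
    """Find pairs that differ slightly; the threshold is an arbitrary assumption."""
    ids = sorted(docs)
    pairs = []
    for i, first in enumerate(ids):
        for second in ids[i + 1:]:
            if docs[first] == docs[second]:
                continue  # exact duplicates are handled separately
            ratio = difflib.SequenceMatcher(None, docs[first], docs[second]).ratio()
            if ratio >= threshold:
                pairs.append((first, second, round(ratio, 3)))
    return pairs


def highlight(text_a, text_b):
    """List the words removed (-) or added (+) between two versions."""
    diff = difflib.ndiff(text_a.split(), text_b.split())
    return [line for line in diff if line[0] in "+-"]
```

For two versions of a sentence that differ by one word, highlight returns just the word dropped and the word added, which is the kind of small difference a reviewer wants flagged.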

Autonomy/Zantaz, FAST Search & Transfer (now owned by Microsoft), Kazeon Systems, Mimosa Systems Inc. and StoredIQ Inc. are examples of ediscovery search vendors. San Jose, CA-based Cypress Semiconductor Corp. used dtSearch, a fast, basic text search tool, for a small ediscovery situation. When the company became involved in a major litigation situation, its legal department recommended the ediscovery process be turned over to an application service provider (ASP) at a cost of $1 million. Instead, Tony Smith, IT director at Cypress Semiconductor, turned to a combination of products, including MetaLincs, for search and analytics. "We were able to do it for less than half of what the ASP would have charged," says Smith.

Storage managers use a number of criteria when selecting ediscovery tools (see "Ediscovery tools checklist," below). They also need to pay attention to such issues as chain of custody, spoliation and how the tool passes information to other parts of the process.

"Chain of custody is the ability to account for a document from the time it's identified to its handoff to the attorney," says Priebe at Seyfarth Shaw. Look for tools that provide an audit trail or "just create a log to track the document," he adds. Spoliation refers to the intentional destruction of a relevant document and/or its meta data by a custodian of that evidence. The legal hold tool puts all custodians on notice: There are no excuses for spoliation.

When you're concerned about chain of custody, you may want to take some extra steps. For Cypress Semiconductor's big litigation case, "we had to use a hardware-based write blocker to make sure the meta data didn't change," says Smith.
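A simple log that tracks a document from identification to handoff can be sketched as follows. Priebe's advice is only to keep a log; linking each entry to the previous one with a hash, so that later edits to the log become detectable, is an added assumption here, and the field names are hypothetical.

```python
import hashlib
import json

GENESIS = "0" * 64  # placeholder "previous hash" for the first entry


def _digest(body):
    """Hash an entry's fields in a stable (sorted-key) JSON form."""
    return hashlib.sha256(json.dumps(body, sort_keys=True).encode("utf-8")).hexdigest()


def append_event(log, doc_id, action, actor, timestamp):
    """Add one custody event, chained to the entry before it."""
    body = {
        "doc_id": doc_id,
        "action": action,        # e.g. "identified", "collected", "handed off"
        "actor": actor,
        "timestamp": timestamp,
        "prev_hash": log[-1]["entry_hash"] if log else GENESIS,
    }
    entry = dict(body, entry_hash=_digest(body))
    log.append(entry)
    return entry


def verify_log(log):
    """Return True only if no entry has been altered, removed or reordered."""
    prev = GENESIS
    for entry in log:
        body = {key: value for key, value in entry.items() if key != "entry_hash"}
        if body["prev_hash"] != prev or _digest(body) != entry["entry_hash"]:
            return False
        prev = entry["entry_hash"]
    return True
```

A log like this shows who handled a document and when, but it doesn't protect the document itself; that's the job of measures such as the write blocker Smith describes.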

Passing the documents collected at various stages in the ediscovery process has traditionally been a manual effort involving the transfer of files and the creation of TIFF files. Some vendors provide integration with the leading litigation management tools used by law firms, such as those from CT Summation and Concordance from LexisNexis. Consultants Socha and Gelbmann have proposed an XML transfer protocol to smooth these handoffs and integration challenges. Called EDRM XML, the protocol will provide a standard XML schema to facilitate the movement of electronic information from one step of the ediscovery process to another. The standard will address both the underlying ediscovery materials (email messages and attachments, loose files, databases) and the meta data used in ediscovery processing and production. So far, a dozen tool vendors have signed up to support EDRM XML. If the standard takes hold, it should greatly facilitate the document handoff.

"EDRM XML has a lot of promise. There are so many vendors and so many ways to move data that it would be nice to have just one way [to do things]," says Brennan at Bryan Cave.

There probably is no single tool that will meet all of an organization's ediscovery needs. Almost all large companies have content management, email archiving and search tools that can do much of the job when used in conjunction with a policy-based retention management system. For example, Pinnacle Financial Corp. was looking for email archiving to better manage its Exchange email stores. "This was just before the revised FRCP [Federal Rules of Civil Procedure]," says Rick Chin, senior VP of information technology at the Orlando, FL-based firm. "We didn't have any litigation pending, but I knew litigation could pose a multimillion dollar risk."

After reviewing archiving tools from Symantec and Mimosa, Pinnacle opted for Mimosa because "it presented a clear interface," says Chin. Otherwise, the tools were similar in price and functionality. Now that Pinnacle Financial has installed Mimosa, the company has gotten its email archiving under control and implemented a retention policy. It hasn't faced any litigation yet but has tested the ediscovery capabilities with its legal department. Should the company be hit with a lawsuit, Chin is confident he has ediscovery covered.

That's the real reason why most companies should look at ediscovery tools. "It's not just about preparing for litigation. It should be about how your business performs," says Financial Engines' Todd. "Being able to quickly pull up contracts or records or customer data is good for the business." Having a retention policy, unifying the storage and management of data, and being able to search it fast can help the business every day. And should a lawsuit hit, you'll be prepared.
