Storage Magazine

Finding Data
by Rich Castagna
Issue: Apr 2006

Search criteria
The granularity of a search depends on the elements the search application stores and examines. Basic search engines catalog the meta data associated with a file--typically information like file name, last access date, and the ID of the user who created or modified the file. For e-mail, sender and recipient, subject, date sent and other basic message information can be searched. Searching on such rudimentary meta data elements might work fine with a small pool of data, but it will likely yield a result set that's impracticably large when searching voluminous data stores.
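To make the limits of attribute-only searching concrete, here is a minimal sketch in Python. The FileMeta record, the search_metadata function and the sample catalog are hypothetical names and data invented for illustration; they don't come from any product mentioned in this article.

```python
from dataclasses import dataclass
from datetime import date


@dataclass(frozen=True)
class FileMeta:
    """Hypothetical catalog entry holding only basic file attributes."""
    name: str
    owner: str
    last_access: date


def search_metadata(catalog, owner=None, name_contains=None, accessed_since=None):
    """Return the entries that match every criterion that was supplied."""
    hits = []
    for entry in catalog:
        if owner is not None and entry.owner != owner:
            continue
        if name_contains is not None and name_contains.lower() not in entry.name.lower():
            continue
        if accessed_since is not None and entry.last_access < accessed_since:
            continue
        hits.append(entry)
    return hits


# Invented sample data, for illustration only.
catalog = [
    FileMeta("q1_budget.xls", "jsmith", date(2006, 1, 15)),
    FileMeta("q1_forecast.doc", "jsmith", date(2006, 2, 3)),
    FileMeta("vendor_contract.doc", "mlee", date(2005, 11, 20)),
]

by_owner = search_metadata(catalog, owner="jsmith")
recent = search_metadata(catalog, accessed_since=date(2006, 2, 1))
```

A query such as owner="jsmith" returns every file that user ever touched, whatever its contents. Against a large data store, attribute filters like these can only narrow the results so far.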

Only a handful of products still rely on such a simple, restricted meta data search. Some products allow users to customize the meta data by adding identifiers such as keywords or tags. Keywords help to narrow searches, but adding them is generally a manual chore, and a uniform set of keywords must be maintained to ensure any degree of search consistency.

Rather than rely solely on meta data and keywords, most search app providers now index the content of the file or e-mail message and its attachments. This allows for much more focused searches because the full content of the file is compared to the user's search criteria, and it makes it possible to include more unstructured data in the scope of a search. "We're finding an increased need to go out and look at the content of the data," says Mark Diamond, president and CEO at Contoural Inc., a Mountain View, CA-based consulting firm. "So a tool that can only look at file attributes has limited value."
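Content indexing is commonly built on an inverted index, which maps each term to the set of items that contain it. The Python sketch below shows the general technique only; it isn't a description of how any vendor named here builds its index. The function names, the tokenizing rule and the sample messages are assumptions made for illustration.

```python
import re
from collections import defaultdict


def tokenize(text):
    """Split text into lowercase alphanumeric terms (a deliberately simple rule)."""
    return re.findall(r"[a-z0-9]+", text.lower())


def build_index(documents):
    """Map each term to the set of document IDs whose content contains it."""
    index = defaultdict(set)
    for doc_id, text in documents.items():
        for term in tokenize(text):
            index[term].add(doc_id)
    return index


def search(index, query):
    """Return the IDs of documents that contain every term in the query."""
    terms = tokenize(query)
    if not terms:
        return set()
    result = set(index.get(terms[0], set()))
    for term in terms[1:]:
        result &= index.get(term, set())
    return result


# Invented sample content: two messages and one attachment.
documents = {
    "msg-001": "Quarterly budget review attached.",
    "msg-002": "Lunch on Friday?",
    "msg-002/attachment.txt": "Budget figures for the merger review.",
}
index = build_index(documents)
matches = search(index, "budget review")
```

Here a search for "budget review" finds the attachment on msg-002 even though nothing in that message's own text mentions a budget. A search limited to subject lines and other attributes would miss it.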

There are, however, some penalties associated with full-text indexing. First, it takes time and processing cycles to create the index, although most archivers manage the indexing process behind the scenes to limit any impact on application performance. The second issue is storage space. A full-text index of hundreds of thousands--or even millions--of data objects can result in an extremely large index that may use significant disk space and slow searches. Paring the size of the index is the Holy Grail of archiving, and vendors employ proprietary technologies to keep their indexed output as compact as possible. For example, Index Engines claims that its full-text index requires only 8% of the disk space required by the original files.
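The storage penalty is easy to estimate once an index-to-source ratio is known. The short Python sketch below applies the 8% figure Index Engines claims. The 5,000GB data store and the 50% comparison ratio are hypothetical numbers chosen for illustration, not measurements of any product.

```python
def estimated_index_gb(source_gb, index_percent):
    """Estimate index size in GB as a percentage of the source data size."""
    if source_gb < 0 or index_percent < 0:
        raise ValueError("sizes and percentages must be non-negative")
    return source_gb * index_percent / 100


SOURCE_GB = 5000  # hypothetical data store size

compact = estimated_index_gb(SOURCE_GB, 8)   # the ratio Index Engines claims
bulky = estimated_index_gb(SOURCE_GB, 50)    # hypothetical, far less compact index

print(f"8% index:  {compact:.0f} GB")
print(f"50% index: {bulky:.0f} GB")
```

At the claimed ratio, the index for this hypothetical store would occupy about 400GB, vs. 2,500GB for an index half the size of the source data.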

Reliability is another consideration. Vendors make the indexing process as transparent to users as possible, so little is revealed about its inner workings. That's generally good, but if the indexing process fails to complete properly, there may not be any indication that the source data wasn't fully indexed. This could result in searches that appear to be comprehensive but, because files were missed, return results that aren't "correct and complete" in the eyes of the court, leaving a company vulnerable to considerable penalties.

"When indexing engines are performing indexing operations, they routinely fail just like any other software fails," says Peter Mojica, vice president of product management at AXS-One Inc., a records compliance management company in Rutherford, NJ. Such a failure could be extremely difficult to detect because it may affect only a portion of a single document attachment or e-mail. Mojica says AXS-One's Rapid-AXS Search & Retrieval system traps errors and notifies administrators that re-indexing may be required.
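The general idea of trapping indexing errors, rather than letting them pass silently, can be sketched in a few lines of Python. This is not AXS-One's implementation, which the article doesn't describe. The names index_all, coverage_report and toy_index_one, and the sample items, are hypothetical.

```python
def index_all(items, index_one):
    """Try to index every item; record each failure instead of stopping or hiding it."""
    indexed, failures = [], {}
    for item_id, content in items.items():
        try:
            index_one(item_id, content)
        except Exception as exc:  # trap the error so it can be reported later
            failures[item_id] = f"{type(exc).__name__}: {exc}"
        else:
            indexed.append(item_id)
    return indexed, failures


def coverage_report(items, indexed):
    """Compare what was indexed against the source so gaps are visible."""
    missing = sorted(set(items) - set(indexed))
    return {
        "total": len(items),
        "indexed": len(indexed),
        "needs_reindex": missing,
        "complete": not missing,
    }


store = {}


def toy_index_one(item_id, content):
    """Stand-in indexer: decoding raises UnicodeDecodeError on corrupt bytes."""
    store[item_id] = content.decode("utf-8").lower().split()


# Invented sample items; the attachment contains bytes that aren't valid UTF-8.
items = {
    "msg-001": b"budget review",
    "msg-001/attach.bin": b"\xff\xfe broken",
    "msg-002": b"lunch friday",
}
indexed, failures = index_all(items, toy_index_one)
report = coverage_report(items, indexed)
```

In this sketch, the one corrupt attachment is listed under "needs_reindex" and the report's "complete" flag is false, so an administrator knows a search of this index can't be treated as comprehensive.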

All Rights Reserved, Copyright 2000 - 2008, TechTarget