This article can also be found in the Premium Editorial Download "Storage magazine: Big 3 backup apps adapt to disk."

Search criteria
The granularity of a search depends on the elements the search application stores and examines. Basic search engines catalog the meta data associated with a file--typically information like file name, last access date, and the ID of the user who created or modified the file. For e-mail, sender and recipient, subject, date sent and other basic message information can be searched. Searching on such rudimentary meta data elements might work fine with a small pool of data, but will likely yield a results set that's impracticably large when searching voluminous data stores.
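To make the limitation concrete, here is a minimal sketch of a metadata-only search. The `FileRecord` fields, the `search_metadata` function and the sample catalog are all hypothetical and do not come from any product mentioned here. The search never looks inside a file; it only filters on the cataloged attributes.

```python
from dataclasses import dataclass
from datetime import date


@dataclass
class FileRecord:
    # Hypothetical catalog entry: only attributes *about* the file are stored.
    name: str
    last_access: date
    owner: str


def search_metadata(catalog, name_contains=None, owner=None, accessed_after=None):
    """Return records whose metadata match every criterion that was supplied."""
    results = []
    for rec in catalog:
        if name_contains is not None and name_contains.lower() not in rec.name.lower():
            continue
        if owner is not None and rec.owner != owner:
            continue
        if accessed_after is not None and rec.last_access <= accessed_after:
            continue
        results.append(rec)
    return results


# Illustrative data only.
catalog = [
    FileRecord("q1_budget.xls", date(2006, 1, 15), "alice"),
    FileRecord("q1_budget_notes.doc", date(2006, 2, 3), "bob"),
    FileRecord("contract_draft.doc", date(2005, 11, 20), "alice"),
]

budget_files = search_metadata(catalog, name_contains="budget")
recent_alice = search_metadata(catalog, owner="alice", accessed_after=date(2006, 1, 1))
```

With only a few attributes to filter on, a query such as `name_contains="budget"` has no way to distinguish relevant files from the many others that share the same name fragment, which is why result sets balloon on large data stores.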

Only a handful of products still rely on such a simple, restricted meta data search. Some products allow users to customize the meta data by adding identifiers such as keywords or tags. Keywords help to narrow searches, but adding them is generally a manual chore, and a uniform set of keywords must be maintained to ensure any degree of search consistency.
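One way to picture the "uniform set of keywords" requirement is a controlled vocabulary that every user-supplied tag is checked against. The sketch below is an assumption for illustration: the vocabulary, the synonym table and the `normalize_tags` function are hypothetical and not taken from any vendor's product.

```python
# Hypothetical controlled vocabulary and synonym table.
CONTROLLED_VOCABULARY = {"contract", "invoice", "hr", "legal-hold"}
SYNONYMS = {"contracts": "contract", "agreement": "contract", "bill": "invoice"}


def normalize_tags(raw_tags):
    """Map user-entered tags onto the controlled vocabulary.

    Returns (accepted, rejected): accepted tags are in canonical form,
    rejected tags are returned as entered so a person can review them.
    """
    accepted, rejected = set(), set()
    for tag in raw_tags:
        key = tag.strip().lower()
        key = SYNONYMS.get(key, key)
        if key in CONTROLLED_VOCABULARY:
            accepted.add(key)
        else:
            rejected.add(tag)
    return accepted, rejected


accepted, rejected = normalize_tags(["Contracts", " Invoice ", "misc"])
```

Without a step like this, "Contracts", "contract" and "agreement" would be stored as three unrelated keywords, and a search on any one of them would miss files tagged with the others.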

Rather than rely solely on meta data and keywords, most search app providers now index the content of the file or e-mail message and its attachments. This allows for much more focused searches, as the full content of the file is compared to the user's search criteria; this makes it possible to include more unstructured data in the scope of a search. "We're finding an increased need to go out and look at the content of the data," says Mark Diamond, president and CEO at Contoural Inc., a Mountain View, CA-based consulting firm. "So a tool that can only look at file attributes has limited value."
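Content indexing is commonly built on an inverted index, which maps each term to the set of documents that contain it. The sketch below illustrates that general technique only; it is not a description of how any vendor named in this article implements indexing, and the document IDs and text are made up.

```python
import re
from collections import defaultdict


def tokenize(text):
    """Lowercase the text and split it into alphanumeric terms."""
    return re.findall(r"[a-z0-9]+", text.lower())


def build_index(documents):
    """Build an inverted index: term -> set of document IDs containing it."""
    index = defaultdict(set)
    for doc_id, text in documents.items():
        for term in tokenize(text):
            index[term].add(doc_id)
    return index


def search(index, query):
    """Return the IDs of documents that contain every term in the query."""
    terms = tokenize(query)
    if not terms:
        return set()
    result = set(index.get(terms[0], set()))
    for term in terms[1:]:
        result &= index.get(term, set())
    return result


# Illustrative content only.
documents = {
    "memo.txt": "Quarterly revenue forecast for the board",
    "mail-17": "Please review the attached revenue forecast",
    "notes.doc": "Lunch menu for the board meeting",
}
index = build_index(documents)
hits = search(index, "revenue board")
```

None of these file names mentions revenue, so an attribute-only tool could not answer this query at all. The content index narrows it to the one document that contains both terms.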

There are, however, some penalties associated with full-text indexing. First, it takes time and processing cycles to create the index, although most archivers manage the indexing process behind the scenes to limit any impact on application performance. The second issue is storage space. A full-text index of hundreds of thousands--or even millions--of data objects can result in an extremely large index that may use significant disk space and slow searches. Paring the size of the index is the Holy Grail of archiving, and vendors employ proprietary technologies to keep their indexed output as compact as possible. For example, Index Engines claims that its full-text index requires only 8% of the disk space required by the original files.
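As a rough sizing aid, the arithmetic behind such a claim can be sketched as follows. The 8% ratio is the vendor's claim quoted above; the 10 TB corpus and the 50% comparison ratio are hypothetical figures chosen only to show the calculation.

```python
def estimate_index_size(corpus_size_tb, index_ratio):
    """Estimate index size in TB as a fraction of the original data size."""
    return corpus_size_tb * index_ratio


corpus_tb = 10.0  # hypothetical archive size

# Ratio claimed by Index Engines in the text above.
compact_index_tb = estimate_index_size(corpus_tb, 0.08)

# Hypothetical less compact index, for comparison only.
bulky_index_tb = estimate_index_size(corpus_tb, 0.50)

print(f"Index at 8% of source data:  {compact_index_tb:.2f} TB")
print(f"Index at 50% of source data: {bulky_index_tb:.2f} TB")
```

The gap between the two estimates grows linearly with the size of the archive, which is why the ratio matters more as data stores grow.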

Reliability is another consideration. Vendors make the indexing process as transparent to users as possible, so little is revealed about its inner workings. That's generally good, but if the indexing process fails to complete properly, there may not be any indication that the source data wasn't fully indexed. Searches could then appear to be comprehensive when, because files were missed, the results aren't "correct and complete" in the eyes of the court, leaving a company vulnerable to considerable penalties.

"When indexing engines are performing indexing operations, they routinely fail just like any other software fails," says Peter Mojica, vice president of product management at AXS-One Inc., a records compliance management company in Rutherford, NJ. Such a failure can be extremely difficult to detect when only a portion of a single document attachment or e-mail fails to index properly. Mojica says AXS-One's Rapid-AXS Search & Retrieval system traps errors and notifies administrators that re-indexing may be required.
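The general idea of trapping indexing errors and flagging items for re-indexing can be sketched as below. This is an illustrative assumption, not AXS-One's implementation: the function names, the `fragile_extract` stand-in and the sample data are all hypothetical.

```python
def index_documents(documents, extract_text):
    """Index each document, recording failures instead of silently skipping them."""
    indexed, failures = {}, {}
    for doc_id, raw in documents.items():
        try:
            indexed[doc_id] = extract_text(raw)
        except Exception as exc:
            # Keep the reason so an administrator can decide whether to re-index.
            failures[doc_id] = str(exc)
    return indexed, failures


def reconcile(source_ids, indexed_ids):
    """Return IDs present in the source but absent from the index."""
    return sorted(set(source_ids) - set(indexed_ids))


def fragile_extract(raw):
    """Stand-in for a text extractor that fails on unreadable content."""
    if raw is None:
        raise ValueError("attachment could not be parsed")
    return raw.lower()


# Illustrative data: "mail-2" simulates an item that cannot be parsed.
documents = {"mail-1": "Contract terms", "mail-2": None, "memo.txt": "Board minutes"}
indexed, failures = index_documents(documents, fragile_extract)
missing = reconcile(documents.keys(), indexed.keys())

for doc_id in missing:
    print(f"Re-indexing may be required for {doc_id}: {failures.get(doc_id, 'unknown error')}")
```

The reconciliation step is deliberately independent of the error log. Comparing the source inventory against what actually reached the index also catches items that were skipped without raising any error.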

This was first published in April 2006
