This article can also be found in the Premium Editorial Download "Storage magazine: Big 3 backup apps adapt to disk."
Advanced search concepts
Hyperbolic tree: A graphic representation of found instances of data that shows the relationships among the data objects.
Fuzzy search: A search engine with this capability can find variations in the spelling of a search word or even misspellings.
Phonic search: A phonic search finds instances where two words are spelled differently but sound alike (e.g., Smythe and Smith).
Stemming search: A stemming search finds variations of the search word that share its root; for example, a search for "archive" also returns "archived" and "archiving."
Natural language processing: Natural language processing uses the context of a file to determine whether a search result matches the intent of the search; for example, by distinguishing between "sue" (the verb) and "Sue" (the name).
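To make three of these definitions concrete, here is a minimal Python sketch of fuzzy, phonic and stemming matching. It is an illustration only, not how any product named in this article works. The function names are hypothetical, the fuzzy match relies on the standard library's difflib, the phonic match uses the classic Soundex code as a stand-in for whatever algorithm a given search engine applies, and the stemmer is a deliberately naive suffix stripper rather than a production algorithm such as Porter's.

```python
import difflib

def fuzzy_matches(term, words, cutoff=0.8):
    """Fuzzy search: return words whose spelling is close to the term."""
    by_lower = {w.lower(): w for w in words}
    close = difflib.get_close_matches(
        term.lower(), list(by_lower), n=max(1, len(by_lower)), cutoff=cutoff
    )
    return [by_lower[w] for w in close]

# Soundex digit for each consonant; vowels, h, w and y have no digit.
_SOUNDEX = {}
for _digit, _letters in enumerate(("bfpv", "cgjkqsxz", "dt", "l", "mn", "r"), start=1):
    for _ch in _letters:
        _SOUNDEX[_ch] = str(_digit)

def soundex(word):
    """Phonic search key: words that sound alike share a four-character code."""
    letters = [c for c in word.lower() if c.isalpha()]
    if not letters:
        return ""
    code = letters[0].upper()
    prev = _SOUNDEX.get(letters[0], "")
    for ch in letters[1:]:
        digit = _SOUNDEX.get(ch, "")
        if digit and digit != prev:
            code += digit
        if ch not in "hw":  # h and w do not separate repeated sounds
            prev = digit
    return (code + "000")[:4]

def phonic_matches(term, words):
    """Return words that share the term's Soundex code."""
    key = soundex(term)
    return [w for w in words if soundex(w) == key]

# Assumption: a tiny suffix list is enough for illustration.
_SUFFIXES = ("ing", "ed", "es", "e", "s")

def naive_stem(word):
    """Strip one common English suffix, keeping at least three letters."""
    word = word.lower()
    for suffix in _SUFFIXES:
        if word.endswith(suffix) and len(word) - len(suffix) >= 3:
            return word[: -len(suffix)]
    return word

def stem_matches(term, words):
    """Stemming search: return words whose root matches the term's root."""
    root = naive_stem(term)
    return [w for w in words if naive_stem(w) == root]
```

With these helpers, `phonic_matches("Smith", ["Smythe", "Hegner"])` picks out "Smythe" because both names reduce to the same Soundex code, and `stem_matches("archive", ["archived", "architect"])` keeps "archived" but not "architect."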
Brian Erdelyi, information security officer at Toronto-based Blackmont Capital Inc., learned this lesson while working at another company, where a high-profile legal case required restoring all e-mail for 12 specific users. "Because this company was so large, everybody's mailbox was on a different server," recounts Erdelyi, "so we had to restore 12 servers 12 times for each month in the past year." Because they had to restore the data from tape, it took them approximately six months to complete the chore. With his current setup at Blackmont using Fortiva Inc.'s archiving service, Erdelyi says a similar request was completed in "a day or two."
John Hegner, vice president of technology services at Liberty Medical Supply Inc., Port St. Lucie, FL, implemented iLumin Software Services Inc.'s (now owned by Computer Associates International Inc.) Assentor Discovery after he had to restore e-mail messages from backup tapes. He sent tapes to a data restore service that charged "over $100,000 for the work," says Hegner. Intent on avoiding a similar experience--and expense--he installed iLumin's product so that he could keep restore efforts in-house.
Incidents such as these are quite common. They're a stark reminder that new data retention and retrieval requirements are beyond the scope of traditional backup apps and procedures, and generally mean adding new archiving/search tools to the storage environment.
The trick is to know how deeply a search will need to delve into archives. In many cases, companies implemented archiving specifically to address storage and application performance concerns. By paring down the amount of application data stored on pricey, higher performing storage, companies could forestall buying additional primary disk. Searching through those data archives may have been a secondary consideration initially but, as the archives grow, the demand for search capabilities also grows.
Regulatory compliance and litigation aside, companies have found that search tools can help identify potential legal problems and, in doing so, may avoid legal issues altogether. Jim McGann, vice president of marketing at Index Engines, Holmdel, NJ, sees this application of search technology as a growing area of interest, citing an investment bank customer that routinely searches e-mail for certain words and then saves those search results in encrypted .PST files.
Erdelyi has been archiving Blackmont Capital's e-mail with the Fortiva Suite of outsourced services for approximately a year and anticipates using the search function proactively in some cases. "There's a feature within Fortiva where I can set up policies that are effectively keywords," says Erdelyi. "If these keywords are detected, [the e-mail] can be flagged for review." It's an intriguing prospect: An HR group could detect an instance of harassment or other improper behavior long before it became a serious legal matter. Besides detecting HR-related indiscretions, proactive searches can keep a company compliant. "There are certain code-of-conduct rules that our traders have to follow," notes Erdelyi, "so we set up keywords that will trigger [certain documents] and our compliance department will review those on a daily basis."
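The keyword policies Erdelyi describes can be pictured with a short Python sketch. This is not Fortiva's interface, which the article does not detail; the `Message` and `KeywordPolicy` classes, the `flag_for_review` function and the sample keywords are all hypothetical, and the matching is a simple case-insensitive whole-word search.

```python
import re
from dataclasses import dataclass

@dataclass
class Message:
    sender: str
    subject: str
    body: str

@dataclass
class KeywordPolicy:
    """Hypothetical policy: a named list of keywords or phrases to watch for."""
    name: str
    keywords: list

    def __post_init__(self):
        if not self.keywords:
            raise ValueError("a policy needs at least one keyword")
        # Whole-word, case-insensitive match on any keyword or phrase.
        pattern = "|".join(re.escape(k) for k in self.keywords)
        self._regex = re.compile(rf"\b(?:{pattern})\b", re.IGNORECASE)

    def hits(self, message):
        """Return the distinct keywords found in the subject or body."""
        text = f"{message.subject}\n{message.body}"
        return sorted({m.group(0).lower() for m in self._regex.finditer(text)})

def flag_for_review(messages, policies):
    """Collect (message, policy name, keywords found) for every policy hit."""
    flagged = []
    for message in messages:
        for policy in policies:
            found = policy.hits(message)
            if found:
                flagged.append((message, policy.name, found))
    return flagged

# Example: one illustrative code-of-conduct policy applied to two messages.
conduct = KeywordPolicy("trader-conduct", ["guaranteed return", "insider"])
inbox = [
    Message("a@example.com", "Lunch", "See you at noon."),
    Message("b@example.com", "Tip", "This is a guaranteed return, trust me."),
]
review_queue = flag_for_review(inbox, [conduct])
```

In this sketch only the second message lands in `review_queue`, which is the list a compliance group would work through in its daily review. A real archive would run such policies against indexed content rather than scanning message text directly.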
This was first published in April 2006