Data storage compliance and archiving
Email archiving implementation: What you need to consider
By Dick Benton
As storage shops implement email archiving, many of them are confronting issues related to a company's industry or the capabilities/maturity of the IT organization and its email user community. If financial compliance is a key driver, the emphasis will be on proof points and mandated retention policies. For organizations susceptible to expensive legal discovery actions, retention and ease of retrieval will be paramount; firms with similar requirements might also need sophisticated search criteria.
The best practices presented here -- which are focused on archiving for Microsoft Exchange -- will ease some implementation, operational and maintenance concerns.
- Archive database management. Archiving applications write data to a database, which means that some of the tables require regular housekeeping. For example, the archiver might maintain a table that contains "questionable use" emails. If not serviced regularly, this table might grow until the size of the database affects the overall application. Issues to consider include how much free-space headroom is appropriate and whether the suspect table can trigger a threshold alarm. Find out how the alarm works and determine who will respond to it.
- File purging and housekeeping. If data is written to an unstructured file, look for those functions that require regular purging. An example here is the Exchange journal. Consider the operational impact on the chosen storage tier if archiving isn't possible for a period of time. An operational threshold metric should be set to trigger an emergency alert if regular journal purging fails to occur. This may happen if the archiving environment itself experiences server, network or storage failures.
- Archive file structure. The archiving file structure is where the actual data is kept, ideally in a single-instanced store. In large organizations, there will be tens if not hundreds of Exchange servers. An archiving application will likely be able to run on multiple servers in some ratio to the Exchange servers, but find out if the archive and index data is written to a single data store. If not, consider the additional effort required to manage this type of environment, particularly as it scales.
- Application availability. Email is increasingly a mission-critical application, and many organizations look to clustering or other high-availability techniques to keep their Exchange environment operational. An archiving application may also require similar availability (if only to maintain journal purging capabilities). The architecture to support this can add significantly to operational overhead. Unless some kind of automated failover capability is included in the archiving product, Windows clustering or a similar technology can double the number of servers to be managed.
- Load balancing. With multiple archiving servers supporting multiple Exchange servers, load balancing can become a key issue. Manual load balancing is time consuming and an inefficient use of resources. Look for some form of automated load-balancing capability, either through traditional middleware or, preferably, within the archiving application.
- Index rebuild. Even in the best-run environments, an index will occasionally become corrupted. When this index points to literally millions of email entities, a rebuild can be a nightmare. Issues to consider include the following: Are rebuilds transparent to users? Is Exchange operation compromised? What's the impact on journal purging during an index rebuild? When the index is finally rebuilt, does Exchange or the archiving software need to be rebooted? Index rebuilding can significantly affect administration overhead, recovery times and end-user service levels.
- Client agents. To achieve high levels of user transparency, some products require the installation of an agent on each Exchange client (Outlook). This might make life easier for users, but it can be a major burden for IT administrators, particularly in companies that don't have efficient methods for pushing out client agents. Installing client agents is an ongoing process that will need to be repeated for many revisions and changes to the archiving software or Exchange.
- Reporting and metrics. With email archiving, you'll be dealing with an astonishing amount of information that's saved daily in an endlessly expanding archive. To manage the environment, you need to know the number of emails and attachments (and their size) moving through your Exchange, archiving and tiered storage environments. To manage user-retrieval needs and allocate an appropriate class of service, you need an aged analysis of email and the ability to determine the last-accessed date for each age group. A comprehensive metrics component will help you effectively administer the archiving environment.
- Backing up the archive. Once data is archived, it doesn't make sense to keep backing it up in full every week. Archived data can be backed up to two or three copies and then left alone until the media refresh threshold is reached. An archiving application must allow archived email to be moved down through storage tiers based on age, where only the top tier is backed up regularly. In addition, find out if the archiving product requires services to be manually shut down before backup takes place.
- Scalability. The main scalability issue is how much hardware the archiving application requires to support the Exchange environment. This may be expressed as the number of Exchange servers that can be supported by an archiving server, or how many archiving servers are required to support a particular number of mailboxes or emails. Whatever the metric chosen, it's important to consider what happens when email growth is projected out three to five years. If the key metric -- such as the number of Exchange servers or mailboxes -- doubles, will the archiving solution also need to double?
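The scalability question in the last item can be made concrete with a quick projection. The Python sketch below is illustrative only: the function names, the 25% annual growth rate and the ratio of 4,000 mailboxes per archiving server are assumptions chosen for the example, not figures from any vendor. Substitute your own measured mailbox counts and the ratio your vendor quotes.

```python
import math


def projected_mailboxes(current, annual_growth, years):
    """Mailbox count after compound growth (annual_growth of 0.25 means 25% a year)."""
    return math.ceil(current * (1 + annual_growth) ** years)


def archive_servers_needed(mailboxes, mailboxes_per_server):
    """Archiving servers required at a fixed mailboxes-per-server ratio (assumed)."""
    return math.ceil(mailboxes / mailboxes_per_server)


def projection(current, annual_growth, mailboxes_per_server, horizon=5):
    """Return (year, mailboxes, archiving servers) for year 0 through the horizon."""
    rows = []
    for year in range(horizon + 1):
        mailboxes = projected_mailboxes(current, annual_growth, year)
        servers = archive_servers_needed(mailboxes, mailboxes_per_server)
        rows.append((year, mailboxes, servers))
    return rows


if __name__ == "__main__":
    # Hypothetical starting point: 10,000 mailboxes growing 25% a year,
    # with one archiving server per 4,000 mailboxes.
    for year, mailboxes, servers in projection(10_000, 0.25, 4_000):
        print(f"year {year}: {mailboxes} mailboxes -> {servers} archiving servers")
```

A fixed ratio like this one means the archiving tier grows in step with the key metric. If the server count doubles whenever mailboxes double, multiply the result by any clustering overhead as well, and ask the vendor whether the quoted ratio actually holds at the projected scale.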
Managing an archiving environment isn't only about making users happy with transparency, risk managers happy with retention, legal people happy with search criteria and IT people happy with a smaller Exchange database. You also need to understand the investment you may need to make in additional administrative overhead to manage your archiving environment.
By considering these 10 issues, along with archiving feature and function requirements, you can align your understanding of the process with vendor statements and real-world results.
About the author: Dick Benton is a principal consultant at GlassHouse Technologies in Framingham, MA.
03 Oct 2006