Rein in e-mail storage

With new government regulations and users' gigantic e-mail attachments, new approaches for storing e-mail are called for.

This article can also be found in the Premium Editorial Download: Storage magazine: Tips for unifying storage management:

Save everything?
Mark Diamond, CEO and president of the storage consulting company Contoural Inc., Los Altos, CA, thinks every e-mail should be saved. When organizations have e-mail deletion policies in place, Diamond says, users end up storing e-mails on local drives, CD-ROMs or elsewhere. In the short term, deletion policies may keep down the cost of storage, but in the long term they drive up the cost of management and discovery, Diamond says.

Keeping the right e-mails is a huge problem. Organizations need to spend a disproportionate amount of time and money filtering all of their e-mail to decide what to keep and what to toss. Rather than do that, Diamond encourages organizations to consider saving all of their e-mail to help ensure compliance.

That requires policies that archive e-mail by transparently moving it between different types of disk media in the background. This permits older messages to remain online and accessible to users at a nominal cost, while still allowing the organization to search and access them in the background.

To get ahead of the curve, Diamond recommends that the CFO, general counsel, general compliance officer and records management director of large organizations meet to determine what they need to save in the most effective way. Smaller and midsize organizations should look either to their CFO or director of IT for leadership in this decision. As they meet and form policy, they need to have a clear understanding of the business drivers and problems they are trying to solve and steer clear of the technology.

The e-mail administrator closes her eyes, grits her teeth and agonizes over the latest corporate e-mail mandate. Management wants e-mail cleaned up. Internal reports tell them that users aren't deleting old e-mails as they promised.

In fact, some users are still keeping 2GB mailboxes containing four-year-old e-mail messages. But with new compliance regulations, explosive e-mail growth rates and little money for new storage, the tolerance for user excess is at an end. As a result, management wants something done now.

Doing something is always the easy part--not infuriating users is the difficult part. With e-mail now a 24x7 mission-critical application in almost every business, avoiding outages and minimizing the downtime for general maintenance procedures become extremely critical. So, do existing or new features in Microsoft Exchange, Lotus Domino and Novell GroupWise offer any hope?

E-mail administrators need experienced storage managers to step in and help them manage the burgeoning storage growth behind their e-mail infrastructure. The latest releases of the major e-mail packages do give e-mail administrators some additional options for limiting message retention periods and reducing storage requirements. Yet in large or rapidly growing environments, organizations need storage experts to help them better manage their e-mail databases, content and aging messages.

As storage managers step in, they should look to third-party providers such as KVS Inc. in Arlington, TX, Lucid8 in New Castle, WA, and ZipLip Inc., located in Mountain View, CA, to help them. For example, Lucid8's GOexchange software automatically checks Microsoft Exchange 5.5, 2000 and 2003 databases for consistency, while correcting errors and reducing the e-mail database size in the process. KVS' Enterprise Vault for Exchange allows users to think they have a mailbox of unlimited size, but transparently archives and compresses their storage in the background.

Another product, ZipLip's Unified E-mail Archival Suite, includes built-in security that encrypts and authenticates archived e-mail for Domino, Exchange and GroupWise databases.

With products like these now available to support mission-critical e-mail applications, organizations need to rethink how their back-end e-mail storage problems should be managed, and by whom. Storage managers can make a major contribution toward solving them on four fronts.

First, they can help organizations measure and report on their e-mail infrastructure to give them some direction on how to proceed. Second, they can reduce the amount of e-mail currently under management by filtering, compressing or just plain deleting it. Third, they can deploy the appropriate back-end levels of storage to house the data. And lastly, they can improve the quality and types of backups and restores available by incorporating advanced snapshot and replication features.

Mailbox measurement
So, what should be your first step? You need to gain greater visibility into your e-mail database. While different tools provide different kinds of reports, administrators should expect any e-mail management tool that measures their environment to be transparent to users and to produce, at a minimum, statistics such as the average user's mailbox size, message content and the average length of message retention.
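As a rough illustration of the minimum statistics described above, the following Python sketch computes an average mailbox size and an average message age from a list of message records. The `Message` record and its field names are hypothetical; real reporting tools expose their own data, and this is not how any particular product works.

```python
from dataclasses import dataclass
from datetime import date
from statistics import mean


@dataclass
class Message:
    """Hypothetical message record; real tools have their own schema."""
    mailbox: str
    size_bytes: int
    received: date


def mailbox_report(messages, today):
    """Summarize a non-empty list of messages.

    Returns the number of mailboxes, the average mailbox size in bytes
    and the average message age in days.
    """
    sizes = {}
    for msg in messages:
        # Accumulate total bytes per mailbox.
        sizes[msg.mailbox] = sizes.get(msg.mailbox, 0) + msg.size_bytes
    ages = [(today - msg.received).days for msg in messages]
    return {
        "mailboxes": len(sizes),
        "avg_mailbox_bytes": mean(list(sizes.values())),
        "avg_message_age_days": mean(ages),
    }
```

Run on a nightly export of message metadata, a report like this would show whether mailboxes are growing and how long users actually keep their mail.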

Sherpa Software, Bridgeville, PA, has a product called Mail Attender 6.0, which brings these types of capabilities into Lotus Domino and Microsoft Exchange environments. It reports on the age, size and content of messages, including message attachments. It also measures space utilization and reclamation statistics. Additionally, the product examines and reports on the growth rate of e-mail at both enterprise and individual levels. And most attractively, it can also generate these reports without any end user disruption.

Other specialized tools exist that provide additional insight into each organization's e-mail environment. For those companies looking to meet legal reporting requirements, E-mailXaminer from Legato Systems Inc., Mountain View, CA, enables administrators and auditors at the National Association of Securities Dealers (NASD) to monitor both incoming and outgoing electronic communications.

Meeting new legal e-mail requirements is a huge job. Mary Kay Roberto, KVS Inc.'s North America general manager, says that the SEC requires the capture of every e-mail coming in and out of the system before the user even receives the e-mail. Under SEC Rules 17a-3 and 17a-4, financial institutions must keep information for three years, with two years of the information remaining easily accessible. More specific requirements exist for broker/dealers under NASD regulations 3010 and 3110.
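To make the three-year and two-year figures concrete, here is a minimal Python sketch that sorts a message into a storage tier by its age. It assumes only the periods quoted above; the tier names and the function names are hypothetical, and an actual retention policy would need to be defined by the compliance officer.

```python
from datetime import date

ACCESSIBLE_YEARS = 2  # period that must remain easily accessible (per the text)
RETENTION_YEARS = 3   # total retention period (per the text)


def years_between(start, end):
    """Whole years elapsed from start to end."""
    years = end.year - start.year
    if (end.month, end.day) < (start.month, start.day):
        years -= 1  # anniversary not yet reached this year
    return years


def retention_tier(received, today):
    """Return a hypothetical tier name for a message of the given age."""
    age = years_between(received, today)
    if age < ACCESSIBLE_YEARS:
        return "online"          # must stay easily accessible
    if age < RETENTION_YEARS:
        return "archive"         # must be kept, may sit on cheaper media
    return "past-retention"      # older than the period cited above
```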

The compliance officer tasked with the supervision of that data scans some incoming and outgoing e-mail messages manually and runs keyword scripts that search all e-mail to ensure regulatory compliance.

E-mail backup management

For reporting at the operating system level, storage administrators may want to turn to Double-Take, a product from NSI Software in Hoboken, NJ. While this product is usually billed as a Windows-based replication tool, it can also unobtrusively gather more technical statistics on e-mail databases, such as the number of read and write I/Os generated by the e-mail application, how long replicating an e-mail database file system would take and when e-mail activity peaks.

From the data it has gathered, NSI has found that in many of its customer environments, 95% of e-mail I/O traffic is read and only about 5% is write, says Jason Buffington, the director of business continuity at NSI Software.
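The kind of read/write breakdown quoted above can be derived from simple I/O counters. The Python sketch below assumes a hypothetical list of hourly samples, each holding the hour of day and the read and write I/O counts; it does not reflect how Double-Take itself collects or reports data.

```python
from collections import defaultdict


def io_profile(samples):
    """Summarize I/O samples.

    samples: iterable of (hour_of_day, read_ios, write_ios) tuples
    containing at least one I/O in total.
    """
    reads = writes = 0
    per_hour = defaultdict(int)
    for hour, read_ios, write_ios in samples:
        reads += read_ios
        writes += write_ios
        per_hour[hour] += read_ios + write_ios
    total = reads + writes
    return {
        "read_share": reads / total,                       # fraction of I/Os that are reads
        "peak_hour": max(per_hour, key=per_hour.get),      # busiest hour of the day
    }
```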

These kinds of statistics will help storage managers to better determine exactly where to place the e-mail data on arrays and figure out how to configure the arrays themselves. In cases such as the 95% read I/O statistic, storage managers will probably want to place this e-mail data on mirrored disk to expedite I/O traffic. They may also want to load these arrays with extra cache to further improve performance because if the e-mail message is prefetched from disk to cache, the response will be even quicker.

The e-mail server's performance may take a hit, depending on the nature of the e-mail management software running. NSI's Buffington reports that NSI's Double-Take software usually results in an average CPU utilization hit of 2% to 5%, although that percentage will certainly vary, depending on the mail server's write I/O load.

Others, such as Sherpa Software, caution that the load on the server depends directly on the number of mailboxes it hosts, which reports users run and how many, and the conditions and actions of the rules they put in place. Sherpa Software finds that most of its customers schedule rules to run late at night, when e-mail traffic is at its lowest point.

E-mail management

Cutting down on spam
There are two simple and relatively painless steps administrators can take to stem the flow of spam into their organization.
Implement a DNS real-time blacklist filter.
This filter plugs directly into e-mail servers such as Lotus Domino and obtains its information from a central service--of which many are free--that maintains a list of known spam sources. The filter checks the source of each incoming e-mail against that list and rejects any e-mail coming from a listed source.
Validate e-mail using full MX records.
Mail Exchange (MX) records may be used to verify that an e-mail from, for example, joe@storage.com really came from a mail server within storage.com, such as mail.storage.com, and not from spam.com. Spammers may forge the joe@storage.com address in their e-mail, which may allow it to bypass the DNS blacklist filter. However, the MX check confirms the true source of the e-mail by asking the real storage.com mail server whether joe@storage.com exists. If he does, the e-mail goes through. If he does not, the e-mail is kept out of the organization.
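The blacklist check in the first step is usually an ordinary DNS lookup: the octets of the sending server's IPv4 address are reversed and prepended to the blacklist's zone name, and any answer means the address is listed. The Python sketch below shows that convention. The zone `dnsbl.example.org` used in the usage note is a placeholder, not a real service, and the function names are illustrative only.

```python
import ipaddress
import socket


def dnsbl_query_name(ip, zone):
    """Build the DNS name to query: reversed IPv4 octets plus the blacklist zone."""
    octets = str(ipaddress.IPv4Address(ip)).split(".")
    return ".".join(reversed(octets)) + "." + zone


def is_listed(ip, zone):
    """Return True if the blacklist zone has an address record for this IP.

    Note: a lookup that fails for any reason (including a network
    problem) is treated here as "not listed".
    """
    try:
        socket.gethostbyname(dnsbl_query_name(ip, zone))
        return True
    except socket.gaierror:
        return False
```

For example, `dnsbl_query_name("192.0.2.99", "dnsbl.example.org")` builds the name `99.2.0.192.dnsbl.example.org`, which `is_listed` then tries to resolve.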

Mailbox management
As storage managers help e-mail administrators document and understand their e-mail environment, the next step is to proactively manage their e-mail servers. Before managers begin to deploy any management tools, however, they should first identify what tasks they want to accomplish.

Initial tasks will most likely include spam filtering and deletion, along with a cleanup and compression of older messages located in the e-mail database. From there, storage managers will want to start archiving these older messages and improving the quality of the backups and restores. Once these tasks are identified, storage managers should then see if the management tools existing in the current e-mail application meet their needs before looking to deploy additional third-party software.

Lotus Domino, Microsoft Exchange and Novell GroupWise all contain management tools that allow administrators to perform certain basic administrative tasks. For instance, the latest release of each e-mail product allows users to set policies that delete e-mails after they have reached a certain age. Exchange 2003 gives users the option to choose when the online database compaction runs. This procedure compacts the database by defragmenting the data in the database files.

Novell GroupWise offers the GWCheck utility that has options for everything from analyzing and fixing databases for a post office to purging all messages that contain a specified subject.

The problem with most of these utilities is that once you get into larger environments with multiple e-mail database instances, it may become much too laborious for a single group of storage administrators to manage. Also, as e-mail messaging becomes more regulated, organizations will need to rethink any existing policies that automatically delete all e-mails that are 60 or 90 days old or enforce user mailbox quotas. At these points, companies need to simplify the management of these tasks through the use of third-party tools.

Some products tag e-mails for compliance reasons and automatically archive older e-mails by placing them on appropriately priced storage. For example, Storage Technology Corp. (StorageTek), in Louisville, CO, has a product called Email Xcelerator, which contains an ArchiveMaster option that enables administrators to meet either advanced business or legal requirements for e-mail message archival. In addition to the automatic archiving of older e-mails, it permits administrators to set up policies that allow for messages and attachments to be migrated to the appropriate level of media. For instance, if certain messages need to be kept on write once, read many (WORM) media, policies can be set within ArchiveMaster that migrate messages from their current location to the WORM media.

Another area that administrators need to exert more control over is the e-mail's content. For example, Sherpa Software's Mail Attender 6.0 filters some content by preventing a user from sending e-mail to certain recipients.

ZipLip's Content Filtering Suite manages archived e-mail by generating a hierarchically searchable index on headers, messages and attachments that allows administrators to audit their existing e-mail message stores for Exchange, Domino and GroupWise environments.

E-mail attachments also need to be better controlled. Some of these issues are now handled in the e-mail applications themselves. For instance, if a user sends out an e-mail to 50 other users with a 2MB attachment, this attachment could in theory consume 102MB of disk space, 2MB in the sent mailbox of the user who sent it and 2MB in each recipient's mailbox.

The latest releases of Microsoft Exchange address this issue. Rather than delivering a copy of the attachment to each user's inbox, Exchange keeps one copy of the attachment and puts a link to it in each recipient's e-mail. Both Lotus Domino and Novell GroupWise have offered similar single store functionality for a number of years, but even here third-party products complement this feature.
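The arithmetic behind the 102MB figure, and the saving from keeping a single copy, can be written out in a few lines of Python. This is a simplified sketch: it assumes all mailboxes sit on the same store and ignores the small overhead of the links themselves. The function name is illustrative.

```python
def attachment_storage_mb(attachment_mb, recipients, single_instance=False):
    """Disk space consumed by one attachment sent to several recipients."""
    if single_instance:
        # One shared copy; every mailbox holds only a link to it.
        return attachment_mb
    # One copy in the sender's sent mailbox plus one per recipient.
    return attachment_mb * (recipients + 1)
```

With a 2MB attachment and 50 recipients, the per-mailbox model gives 2 x 51 = 102MB, while the single-copy model needs only 2MB.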

Take AttachStor Inc.'s StorOnce, for example (see "AttachStor reclaims e-mail storage"). StorOnce, which is part of the AttachStor Suite, is installed as an add-on to Microsoft Outlook clients. It transparently compresses and encrypts e-mail message attachments, presenting the user with a link to the file stored on the central AttachStor server.

This technology especially shines for mobile users. Road warriors can download all of their e-mail with a thumbnail view of their attachments. If they want to view an attachment, they can choose to download the ones they want, as opposed to downloading every single attachment in their inbox.

Top four reasons to consider using e-mail management software
  1. E-mail management software can assist organizations in measuring and reporting on their e-mail infrastructure, giving the organizations some clear direction on how to proceed.
  2. The software can reduce the amount of e-mail currently under management by filtering, compressing or deleting it.
  3. E-mail management software can deploy the appropriate back-end levels of storage to house the data.
  4. Lastly, the software can dramatically improve the quality and types of backups and restores available by incorporating advanced snapshots and replication features.

In Lotus Notes environments, Exivity Inc., Westford, MA, has a product called AtomicDispatch, which provides a similar single instance message store capability. However, it handles the message store differently than StorOnce in two ways. First, it intercepts the message before the mail router and scans the size of the message. If the message exceeds the established threshold, it replaces the message with a link and stores the message in its central database.

Second, because it intercepts and scans the message on the Lotus Notes server, it does not require an agent to be deployed on each client, thus reducing the administrative work that's needed to deploy the software. The problem with this approach is that the recipient gets only a link to the e-mail message and so may be unable to judge the value of the message's text without retrieving it.
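The server-side approach described above amounts to a size check in front of the mail router. The Python sketch below illustrates the idea under stated assumptions: `CentralStore`, `route`, the key format and the `link://` scheme are all hypothetical and are not taken from AtomicDispatch.

```python
from dataclasses import dataclass, field


@dataclass
class CentralStore:
    """Hypothetical stand-in for a central message database."""
    items: dict = field(default_factory=dict)

    def save(self, body):
        """Store a message body and return the key it was stored under."""
        key = f"msg-{len(self.items) + 1}"
        self.items[key] = body
        return key


def route(body, store, threshold_bytes):
    """Return what the recipient receives.

    Messages at or under the threshold pass through unchanged; larger
    ones are saved centrally and replaced by a link.
    """
    if len(body) <= threshold_bytes:
        return body
    key = store.save(body)
    return f"link://{key}".encode()
```

Because the check runs once on the server, nothing needs to be installed on clients, which matches the trade-off noted above: the recipient sees only the link until the stored message is retrieved.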

Some companies may consider it less painful to take a less-refined approach to regulatory compliance and opt to archive everything and then urge users to delete all old messages. That approach should reduce the storage required for "active" e-mail, but only if the users comply. Establishing an official company policy would help encourage users to delete old missives, but you'd still be relying on them to find the time to be good corporate citizens. You might also find that you can narrow the scope of your storage-control efforts by determining which business units need to comply with legal regulations. By concentrating on those users, you may be able to reduce the amount of e-mail that needs to be filtered.

The bottom line here is that e-mail retention requirements are only going to get more and more stringent in the future. And with the cost of storage steadily dropping and litigation fees and penalties almost certain to rise, e-mail archiving is going to become a requirement for any conscientious organization. Of course, it's not the storage administrator's job to decide which e-mails should be kept and which ones should get deleted. That is the job of the company's compliance officer and members of various business units. (See "Save everything?")

Mailbox protection
Data protection is, of course, a storage admin's No. 1 responsibility. Backing up and restoring e-mail databases down to the mailbox, calendar, address book and individual message levels all present varying degrees of difficulty with ever-shrinking backup windows. Some major backup programs, such as CommVault Systems' Galaxy, Legato NetWorker, Tivoli Storage Manager and Veritas Software's NetBackup, offer enhanced e-mail integration tools that work with a new native feature in the Windows operating system.

Microsoft Corp. offers a new API on Windows Server 2003 that allows access to the Windows Volume Shadow Copy service. Volume Shadow Copy can take a snapshot of the Microsoft Exchange database in just seconds. Veritas Software plans to release an update to its NetBackup product to capitalize on this new API, with support for Lotus Notes and Novell GroupWise to follow if the demand exists, a company spokesperson says.

Some users, like Ken Marsh, the IT manager of the construction firm T.B. Penick & Sons Inc. in San Diego, could not wait for this functionality to appear in traditional backup products. With users irate anytime the e-mail system went down, he chose San Diego-based StoneFly Networks' i3000 Storage Concentrators to store his Exchange database. He found it was both an economical and efficient way to grow storage capacity for users, and it gave him the flexibility to create disk-to-disk backups. These options reduced the number of times he needed to take his Microsoft Exchange server offline for these administrative tasks.

EMC offers users who store their Exchange and Domino databases on its Symmetrix arrays the ability to instantly back up or restore their databases using a combination of EMC software products such as Data Manager, Data Manager client and TimeFinder.

Network Appliance Inc. (NetApp) offers similar functionality for Exchange users who store their databases on NetApp Filers. Its SnapManager for Microsoft Exchange makes Exchange snapshot-aware and enables online backups and rapid recovery of Exchange databases.

Using native and third-party tools to gain an understanding of the data in your e-mail application will go a long way toward better managing that data. The challenge for current e-mail administrators will come in how they translate this insight into new management policies. Yet with ever-improving e-mail management tools and storage skills, organizations should be able to keep user angst to a minimum, meet increasingly stringent retention requirements, keep up with e-mail growth and still protect the company's valuable e-mail infrastructure.

This was first published in February 2004
