This article can also be found in the Premium Editorial Download "Storage magazine: Email storage lessons learned from Citigroup."
Best practices for e-mail storage
Use SATA drives
Plan to use SATA drives with mean time between failures (MTBF) ratings of 1 million hours or more. They cost more than lower-rated SATA drives, but reducing drive failures saves money in the long run. Try not to partition individual drives among too many servers, because SATA drives aren't good at overlapped I/O. SATA is terrific for a few e-mail servers, but I wouldn't want more than five servers accessing the same physical disk drives.
Keep that in mind when you partition storage in your virtualization system. Use RAID 1, 5 or 10--whatever seems easiest to manage. Don't worry that RAID 1 won't have the scalability you need--scalability can be handled by the virtualization product. Also, don't use SATA drives with write caching turned on; write caching won't deliver noticeable performance advantages and it increases the risk of data loss.
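The rules of thumb above (an MTBF rating of at least 1 million hours, write caching off, no more than five servers per physical drive) can be checked mechanically against a storage allocation plan. The following is only a minimal sketch: the `SataDrive` record and the `check_drive` function are hypothetical names invented for illustration, not part of any virtualization product, and the thresholds are simply the figures quoted in this sidebar.

```python
from dataclasses import dataclass, field

MAX_SERVERS_PER_DRIVE = 5    # the sidebar's rule of thumb
MIN_MTBF_HOURS = 1_000_000   # the sidebar's suggested MTBF floor


@dataclass
class SataDrive:
    """Hypothetical record describing one physical SATA drive in a plan."""
    name: str
    mtbf_hours: int
    write_cache_enabled: bool
    servers: set = field(default_factory=set)  # servers with partitions on this drive


def check_drive(drive):
    """Return a list of warnings for one drive; an empty list means no issues."""
    warnings = []
    if drive.mtbf_hours < MIN_MTBF_HOURS:
        warnings.append(f"{drive.name}: MTBF rating is below {MIN_MTBF_HOURS} hours")
    if drive.write_cache_enabled:
        warnings.append(f"{drive.name}: write caching is turned on")
    if len(drive.servers) > MAX_SERVERS_PER_DRIVE:
        warnings.append(
            f"{drive.name}: shared by {len(drive.servers)} servers "
            f"(limit {MAX_SERVERS_PER_DRIVE})"
        )
    return warnings
```

Running `check_drive` over every drive in a plan before carving up partitions would flag the configurations this sidebar advises against; the limits are constants so they can be adjusted to local experience.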
Primary and secondary storage
Whereas primary e-mail storage can be fairly generic and replaceable, your e-mail archiving system should be designed to last a while. Many software companies target products at data retention and regulatory business requirements, including the administrative challenges that come with e-mail. Most e-mail archiving packages move data out of the e-mail system and store it in an external, compressed and indexed format for fast searching. Many also have special functions for handling e-mail attachments. Most backup software vendors have products for archiving e-mail data. However, data retention may work better when it's kept separate from regular backup processes.
Unlike primary e-mail storage, which can use just about any kind of storage, secondary e-mail storage needs to have safeguards built in to ensure that archived data isn't deleted or tampered with. Network Appliance Inc. (NetApp) has a software function called SnapLock that gives its filer products write once, read many (WORM) capabilities. Likewise, IBM Corp. recently introduced a new server called the TotalStorage Data Retention 450, which provides WORM storage and works with Tivoli Storage Manager for Data Retention software to automate data retention policies. EMC Corp.'s Centera, using content-addressed storage (CAS) technology, provides similar data retention functionality. Keep in mind that these are data center products with relatively high price tags.
In lieu of buying an expensive special-purpose data retention storage subsystem, it's possible to use network-attached storage (NAS) to store the archived e-mail data. The risk is that an administrator or user will delete or alter the data after it has been safely archived. This archiving-on-the-cheap approach can work in practice as long as it's teamed with backup operations that make additional copies to tape. Be warned, however, that your manual process may come under the scrutiny of corporate auditors who might not be convinced of its effectiveness when compared to a more automated product.
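Because a plain NAS share can't prevent deletion or alteration, one inexpensive complement is to at least detect it. The sketch below is an illustration only, not a feature of any product named in this sidebar: it records a SHA-256 digest for every file under an archive directory and later compares two such manifests. The function names `build_manifest` and `compare_manifests` are hypothetical, and the sketch assumes the manifest itself is kept somewhere NAS users can't modify.

```python
import hashlib
from pathlib import Path


def file_digest(path, chunk_size=1024 * 1024):
    """Compute a file's SHA-256 digest, reading in chunks to limit memory use."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def build_manifest(archive_dir):
    """Map each file's path (relative to archive_dir) to its SHA-256 digest."""
    root = Path(archive_dir)
    return {
        path.relative_to(root).as_posix(): file_digest(path)
        for path in sorted(root.rglob("*"))
        if path.is_file()
    }


def compare_manifests(old, new):
    """Return (deleted, altered) lists of paths relative to the old manifest."""
    deleted = sorted(p for p in old if p not in new)
    altered = sorted(p for p in old if p in new and old[p] != new[p])
    return deleted, altered
```

A check like this only reports tampering after the fact; it doesn't replace WORM media or the tape copies described here, though a dated series of clean comparisons may give auditors some evidence that the manual process is working.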
If you use NAS for e-mail archives, you should use removable WORM media to back up your archives. Because WORM prevents data from being overwritten and the NAS system doesn't, the NAS system should probably be viewed as a temporary storage location for archiving purposes. It's fine to keep archived e-mail data on a NAS system, but you must regularly back up to WORM media. Sony Electronics Inc. has recently announced tape drives with WORM capabilities that can operate in tape libraries. Optical jukeboxes with WORM drives are another option. A new format called ultra density optical (UDO), which is based on blue-laser technology, may give magneto-optical (MO) drives the capacity necessary to be useful for e-mail archiving.
This was first published in July 2004