This article can also be found in the Premium Editorial Download "Storage magazine: Tips for unifying storage management."
The AttachStor solution
These are the problems that AttachStor sets out to solve, and its solution is relatively simple. It pulls the attachments out of the Exchange or Notes information store and stores them in a separate database that supports true single instance store and delta-level incremental storage for all attachments. There is also a client plug-in for Outlook and Notes that integrates with AttachStor to provide more functionality to remote users.
Then there's the AttachStor server, which is typically installed on a dedicated machine. It uses the Messaging Application Programming Interface (MAPI), a Microsoft interface that lets a Windows application send e-mail and attach documents to it, to communicate with Exchange or Domino and extract the attachments from the information store. This means that if it's installed on a separate server--which is how it is normally done--no software needs to be installed on the Exchange server.
AttachStor extracts, compresses and encrypts each attachment, stores it in its own file system, then replaces it in the message with a 1K HTML file that AttachStor calls a Storfile. A Storfile is a special file that serves as a pointer to the real attachment in the AttachStor file system. Attachments with the same name are compared to verify that they are indeed the same file. If they are, only one instance of that file is stored in the AttachStor file system, regardless of how many messages or information stores it appears in. So, if you have multiple information stores on a server, or multiple Exchange or Domino servers, AttachStor provides true single instance store for a given file across all of them.
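To make the single-instance idea concrete, here is a minimal Python sketch. It is not AttachStor's code: the class, its fields and the file names are hypothetical, and compression, encryption and the Storfile itself are left out. It only mirrors the rule described above, in which attachments that share a name are compared and one copy is kept if they match.

```python
class SingleInstanceStore:
    """Toy attachment store that keeps one copy of each distinct file (illustrative only)."""

    def __init__(self):
        self.files = []     # stored attachment bodies, one entry per distinct file
        self.by_name = {}   # file name -> indexes into self.files seen under that name

    def put(self, name: str, data: bytes) -> int:
        """Store an attachment; return the index a Storfile-like pointer would reference."""
        for idx in self.by_name.get(name, []):
            # Same name: compare contents to verify it really is the same file.
            if self.files[idx] == data:
                return idx
        self.files.append(data)
        idx = len(self.files) - 1
        self.by_name.setdefault(name, []).append(idx)
        return idx


store = SingleInstanceStore()
first = store.put("budget.xls", b"Q1 numbers")    # attached to one message
second = store.put("budget.xls", b"Q1 numbers")   # same file, another message or server
third = store.put("budget.xls", b"Q2 numbers")    # same name, different contents
print(first == second, len(store.files))          # True 2
```

A production system would more likely compare digests than whole files, but the effect is the same: the second copy costs a pointer, not another file.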
If AttachStor finds that a given attachment is a modified version of another attachment, it stores only the bit-level differences between the two files. For example, suppose you have a 10MB attachment that is actually a revision of another 10MB file stored in the information store, and only 1MB of that file has changed. In that case, only the 1MB of differences is stored in the AttachStor file system, significantly reducing the space required for successive versions of an attachment.
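The 10MB example can be reproduced with a rough sketch. The article doesn't describe how AttachStor computes its bit-level differences, so the code below swaps in a much simpler technique, comparing fixed-size blocks at the same offsets. The 4,096-byte block size and the function names are assumptions made for illustration.

```python
BLOCK = 4096  # assumed block size, chosen only for this illustration


def make_delta(base: bytes, revised: bytes, block: int = BLOCK) -> dict:
    """Record only the blocks of `revised` that differ from `base` at the same offset."""
    changed = {}
    for offset in range(0, len(revised), block):
        chunk = revised[offset:offset + block]
        if base[offset:offset + block] != chunk:
            changed[offset] = chunk
    return {"length": len(revised), "changed": changed}


def apply_delta(base: bytes, delta: dict) -> bytes:
    """Rebuild the revised file from the base copy plus the stored differences."""
    length = delta["length"]
    out = bytearray(base[:length].ljust(length, b"\0"))
    for offset, chunk in delta["changed"].items():
        out[offset:offset + len(chunk)] = chunk
    return bytes(out)


MB = 1024 * 1024
original = bytes(10 * MB)                    # 10MB file already stored
edited = bytearray(original)
edited[3 * MB:4 * MB] = b"\x01" * MB         # 1MB of it changes in the revision
delta = make_delta(original, bytes(edited))

stored = sum(len(chunk) for chunk in delta["changed"].values())
print(stored // MB)                                   # 1
print(apply_delta(original, delta) == bytes(edited))  # True
```

Comparing blocks in place only works when edits don't shift the rest of the file; a real differencing engine has to cope with insertions and deletions as well.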
However, there's a slight downside to extracting all attachments. Once an attachment is extracted, an Outlook or Domino client user who doesn't have the AttachStor plug-in will have to click twice to retrieve it. That is, the user will receive an attachment that must be opened to reach an HTML link to the real attachment in the AttachStor file system. While this minor issue doesn't apply to Outlook/Domino users who have the plug-in, some organizations won't deploy the plug-in on every desktop/laptop because of the additional cost.
Some companies say this inconvenience is worth it, compared to the amount of space they save by extracting all attachments. Other companies extract only attachments older than 30 days, leaving their most recent attachments in the information store. The AttachStor analysis tool breaks down attachments by age, so storage admins can decide what's the best policy for their organization.
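The trade-off between those two policies is easy to estimate once attachments are broken down by age. The sketch below assumes a hypothetical list of (received date, size) pairs, such as an analysis report might provide; the function name and the sample figures are invented for illustration.

```python
from datetime import datetime, timedelta


def extractable_bytes(attachments, now: datetime, min_age_days: int) -> int:
    """Total size of attachments older than `min_age_days`.

    `attachments` is an iterable of (received: datetime, size_bytes: int) pairs.
    """
    cutoff = now - timedelta(days=min_age_days)
    return sum(size for received, size in attachments if received < cutoff)


# Made-up sample data, not figures from any real mail store.
now = datetime(2004, 2, 1)
attachments = [
    (datetime(2004, 1, 25), 5_000_000),
    (datetime(2003, 12, 1), 20_000_000),
    (datetime(2003, 6, 15), 8_000_000),
]

print(extractable_bytes(attachments, now, 0))    # 33000000: extract everything
print(extractable_bytes(attachments, now, 30))   # 28000000: keep the last 30 days in place
```

In this made-up case, leaving the most recent 30 days in the information store gives up about 5MB of savings in exchange for sparing users without the plug-in the second click on new mail.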
If a user is running the AttachStor client, the movement of these files is invisible. When a user requests an attachment that would normally be in the information store, AttachStor automatically retrieves it from the AttachStor database, so the user doesn't have to click twice. If the file that the user requested is a revision of another file, AttachStor merges the original file stored in the AttachStor database with the changes found in the Storfile, thereby presenting the complete, revised file to the client.
The user's desktop/laptop running the client has its own local cache of received attachments. If the user receives an attachment that is a duplicate of another attachment, the AttachStor client tells the AttachStor server that it already has the file, and the second attachment doesn't need to be downloaded again. Even better, if the user receives a later revision of an attachment that's already in the local cache, the AttachStor server knows that and sends the user a Storfile that contains only the bit-level differences between the two versions. Then, when the user opens the attachment, the AttachStor client automatically merges the differences with the original in the client cache.
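The decision the server makes in that exchange can be sketched in a few lines. The wire protocol between the AttachStor client and server isn't described here, so everything in this example is hypothetical: attachments are identified by simple string IDs, the client's cache is a set of those IDs, and `base_of` maps a revision to the earlier version it was derived from.

```python
def plan_transfer(wanted: str, client_cache: set, base_of: dict) -> str:
    """Decide what the server has to send for the attachment `wanted`."""
    if wanted in client_cache:
        return "nothing"   # duplicate: the client already holds this exact file
    if base_of.get(wanted) in client_cache:
        return "delta"     # client holds the earlier version; send only the differences
    return "full"          # nothing usable in the cache; send the whole attachment


cache = {"deck-v1"}                 # what the laptop has already received
revisions = {"deck-v2": "deck-v1"}  # deck-v2 is a revision of deck-v1

print(plan_transfer("deck-v1", cache, revisions))   # nothing
print(plan_transfer("deck-v2", cache, revisions))   # delta
print(plan_transfer("report", cache, revisions))    # full
```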
Imagine being able to send a 20MB PowerPoint file back and forth between remote users who are connected to the Internet via dial-up. If users are doing a POP3 or IMAP session, all they're downloading in their initial session is the message with the Storfile attached, and the Storfile is small.
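A back-of-the-envelope calculation shows why that matters. It assumes a nominal 56Kbps modem, which is not a figure from the article, and it ignores protocol overhead and the size of the message itself. The function name is made up for the example.

```python
def transfer_seconds(size_bytes: int, link_bits_per_second: int) -> float:
    """Idealized transfer time: payload bits divided by link speed, no overhead."""
    return size_bytes * 8 / link_bits_per_second


DIALUP_BPS = 56_000  # assumed nominal modem speed; real throughput is usually lower

full_file = transfer_seconds(20 * 1024 * 1024, DIALUP_BPS)  # the 20MB PowerPoint file
storfile = transfer_seconds(1024, DIALUP_BPS)               # the 1K Storfile

print(f"{full_file / 60:.0f} minutes")  # 50 minutes
print(f"{storfile:.2f} seconds")        # 0.15 seconds
```

Under these assumptions, the initial session moves a fraction of a second's worth of data in place of a download lasting most of an hour.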
This was first published in February 2004