Data volumes at most organizations have been growing steadily, in excess of 50% per year, for several years now. This phenomenon is not exclusive to large organizations; it also affects SMBs, which have experienced and continue to experience significant data growth. In fact, SMBs sometimes see higher percentage growth because an acquisition can instantly double the amount of data they have to manage.
With lower-cost storage having somewhat eased the pain, attention has shifted to data backup. Based on retention policies, it's not uncommon to have four to five times more backup data than production data. Traditional backup methods have had a hard time keeping up with the constantly increasing amount of data and decreasing recovery time objectives (RTO). Even with lower backup media costs, requisitions for more tapes and library expansions are under more scrutiny than before.
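To see how retention alone can produce a multiple like that, the arithmetic can be sketched as follows. This is a rough estimate under assumed conditions, not a sizing tool: the function name, the full-plus-incremental scheme, the 10 TB production figure and the 2% daily change rate are all illustrative assumptions.

```python
def backup_footprint_tb(production_tb, fulls_retained,
                        incrementals_retained, daily_change_rate):
    """Estimate retained backup capacity for a full-plus-incremental scheme."""
    # Each retained full backup is roughly one copy of production data.
    full_tb = fulls_retained * production_tb
    # Each retained incremental holds roughly one day's worth of changed data.
    incremental_tb = incrementals_retained * production_tb * daily_change_rate
    return full_tb + incremental_tb


# Hypothetical figures: 10 TB of production data, four weekly fulls and
# 24 daily incrementals on hand, 2% of the data changing per day.
production_tb = 10
total_tb = backup_footprint_tb(production_tb, fulls_retained=4,
                               incrementals_retained=24,
                               daily_change_rate=0.02)
ratio = total_tb / production_tb
print(f"{total_tb:.1f} TB of backup data, {ratio:.1f}x production")
```

With these assumed figures the estimate comes to about 44.8 TB, or roughly 4.5 times the production data, before any longer-term monthly or yearly copies are counted.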
Many businesses have found they can't afford to let data grow forever and must start addressing the issue at the source. The following options should be considered when attempting to reduce the amount of data to back up and the amount of space required for backed up data:
Understanding what data your organization generates, uses and stores is the first step in data management. It helps identify how much data there is or will be, where it is stored, how long it will be useful to the business and whether it should be backed up at all. However, this task requires both time and resources, and this is where many organizations fail. This is especially true for smaller organizations that simply do not have the resources to dedicate to it.
There is often a mistaken belief that smaller organizations don't have that much data, so they can defer categorization until capacity grows and it really becomes a problem. The truth is, SMBs with a reasonable amount of data have a golden opportunity to start getting a handle on organizing data before it becomes a problem. As with anything else, procrastination always ends up costing more.
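Even without dedicated tools, a first pass at this kind of inventory can be scripted. The sketch below walks a directory tree and totals file sizes by how long ago each file was last modified. The function name and the age buckets are illustrative assumptions, and last-modified time is only a rough proxy for business value.

```python
import os
import time


def inventory_by_age(root, now=None):
    """Total the bytes under `root`, grouped by time since last modification."""
    now = time.time() if now is None else now
    # Bucket boundaries are illustrative assumptions, not a recommendation.
    buckets = {"under 90 days": 0, "90 days to 1 year": 0, "over 1 year": 0}
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            path = os.path.join(dirpath, name)
            try:
                info = os.stat(path)
            except OSError:
                continue  # skip files that vanish or cannot be read
            age_days = (now - info.st_mtime) / 86400
            if age_days < 90:
                buckets["under 90 days"] += info.st_size
            elif age_days < 365:
                buckets["90 days to 1 year"] += info.st_size
            else:
                buckets["over 1 year"] += info.st_size
    return buckets
```

Calling `inventory_by_age` on a file share returns a dictionary of byte counts per bucket. A large "over 1 year" total is a hint that some of that data is a candidate for archiving or deletion rather than for daily backup.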
Yes, the infamous information lifecycle management (ILM) is still alive, although no one wants to call it that anymore -- and no one is sure whose responsibility it is. Although many vendors were quick to associate the acronym with hardware and software solutions, ILM is really about making corporate decisions regarding where data is stored and how long it should be kept.
Any data deleted or archived under this kind of management means less data on your production storage arrays, less data to back up and less data to restore. Essentially, ILM is a business problem that can only be solved collaboratively with the help of IT. Decisions regarding data retention can only be made once you have gained an understanding of what you have.
Once again, there is a golden opportunity for SMBs to address this before it becomes unmanageable. Many large organizations dealing with this issue wish they had started down that path when they had a much smaller data footprint.
How long backup data is kept has a significant impact on the backup environment. For example, organizations must seriously question how useful 30- or 60-day-old email messages really are. Remember the distinction between archive and backup: backups protect against data loss and should not be used for long-term retention. In addition, archived data no longer needs to be backed up daily or weekly, nor does it need to be restored after a system failure.
Archives for email and file servers
AXS-One Inc. AXS-Link, CommVault Data Migrator, EMC Corp. EmailXtender, IBM Corp. System Storage Archive Manager and Symantec Corp. Enterprise Vault are some of the products that allow organizations to archive application data and manage retention. Once again, data archived at the source is no longer backed up daily, thus reducing both production and backup storage utilization.
Hierarchical storage management (HSM) capable products such as IBM TSM, Symantec Storage Migrator and EMC DiskXtender can also help reduce backup and restore pains by migrating less frequently used data off production storage and leaving a pointer in place. Backup products integrated with these solutions will only back up or restore the pointers, thus minimizing the amount of data backed up and the duration of the backup.
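The pointer idea can be illustrated with a toy sketch. This is not how any of the products named above are implemented: the `.stub` suffix, the function names and the plain-text pointer file are hypothetical, and the sketch ignores name collisions in the archive directory, permissions and transparent recall.

```python
import os
import shutil

STUB_SUFFIX = ".stub"  # hypothetical naming convention for pointer files


def migrate_to_archive(path, archive_dir):
    """Move a file to the archive tier and leave a small pointer in its place."""
    os.makedirs(archive_dir, exist_ok=True)
    target = os.path.join(archive_dir, os.path.basename(path))
    shutil.move(path, target)
    stub_path = path + STUB_SUFFIX
    with open(stub_path, "w", encoding="utf-8") as stub:
        stub.write(target)  # the pointer records where the data now lives
    return stub_path


def recall(stub_path):
    """Bring a migrated file back to its original location using the pointer."""
    with open(stub_path, encoding="utf-8") as stub:
        target = stub.read()
    original = stub_path[: -len(STUB_SUFFIX)]
    shutil.move(target, original)
    os.remove(stub_path)
    return original
```

After `migrate_to_archive` runs, the production directory holds only a pointer file a few dozen bytes long, so a backup of that directory copies the pointer rather than the full file.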
Data deduplication has been one of the most refreshing advancements in the backup storage arena. Products from companies such as Avamar (now owned by EMC), Data Domain, Diligent, NetApp NearStore and Sepaton Inc. (and the list keeps growing) can be implemented as network targets or virtual tape libraries (VTL) and, in most cases, can be fully integrated with an existing backup solution.
These products can dramatically reduce the amount of storage required for backup data by storing only unique data segments and keeping pointers for duplicate segments. Most of these vendors now offer lower-capacity solutions that are suitably sized and priced for SMBs.
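The principle of storing unique segments and keeping pointers for duplicates can be shown in a minimal sketch. It assumes fixed-size 4 KB segments identified by SHA-256 digests and an in-memory store. Those choices, along with the class and method names, are simplifications made for illustration, and commercial products may segment and index data quite differently.

```python
import hashlib

CHUNK_SIZE = 4096  # assumed fixed segment size; products may segment differently


class DedupStore:
    """Keeps one copy of each unique segment, keyed by its SHA-256 digest."""

    def __init__(self):
        self.segments = {}  # digest -> segment bytes

    def backup(self, data):
        """Store `data` and return the list of digests (pointers) describing it."""
        pointers = []
        for offset in range(0, len(data), CHUNK_SIZE):
            segment = data[offset:offset + CHUNK_SIZE]
            digest = hashlib.sha256(segment).hexdigest()
            # Only segments not seen before consume new storage.
            self.segments.setdefault(digest, segment)
            pointers.append(digest)
        return pointers

    def restore(self, pointers):
        """Rebuild the original data by following the pointers."""
        return b"".join(self.segments[d] for d in pointers)

    def stored_bytes(self):
        """Return the total size of the unique segments held in the store."""
        return sum(len(s) for s in self.segments.values())
```

If two successive 12 KB backups differ in only one 4 KB segment, the store ends up holding four unique segments (16 KB) instead of six (24 KB), and each backup can still be restored in full from its list of pointers.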
As much as data management may seem to be mostly an enterprise problem, SMBs should not ignore the importance of making data management part of their corporate and IT cultures early on. Unless your business has no plans to grow, chances are it will eventually be an enterprise with enterprise-class problems.
About this author: Pierre Dorion is the data center practice director and a senior consultant with Long View Systems Inc. in Phoenix, Ariz., specializing in the areas of business continuity and disaster recovery planning services and corporate data protection.