If I had to pick one word that comes to mind when talking about cheap storage, it would have to be procrastination. Procrastination, or in actuality avoidance, is exactly what a lot of organizations are opting for when faced with the problem of data growth and the absence of data retention policies.
This is a common situation for small and midsized businesses (SMBs), which often deal with resource shortages and, sometimes, the false perception that data growth is still manageable. The declining cost of storage has allowed companies to address the problem by throwing more hardware at it, much like giving a spoiled kid more candy to appease repeated tantrums. But eventually, there is no candy sweet enough, and concrete action is required.
Declining costs have also allowed organizations to put off making decisions regarding data lifecycle management and data categorization, resulting in some of the following issues:
- Large file servers that take all weekend to back up and can take days, or sometimes more than a week, to restore from tape.
- Backup windows that overlap business hours.
- Increased cost to meet growing data availability requirements (e.g., data backups and replication).
- Backup data storage capacity requirements that are multiples of the production data.
- Uncertain records discovery and retrieval capabilities in the event of litigation.
- The growing need to purchase data discovery and management software.
- Server room space shortage to accommodate more storage devices.
- Significant increases in power and cooling requirements.
It becomes clear that declining storage costs can quickly be overshadowed by other rising costs. This is easy to see when you consider that overall IT spending doesn't follow the same curve as the cost of storage capacity. Costs associated with data availability, replication, security and the like do not track the cost of basic storage; they tend to grow exponentially.
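To make the "multiples of the production data" issue concrete, here is a rough back-of-the-envelope sketch. Every figure in it is a hypothetical assumption chosen for illustration (2 TB of production data, four retained weekly full backups, 24 retained daily incrementals, a 5% daily change rate), and it ignores compression and deduplication. The function name `backup_capacity_tb` is likewise invented for this example.

```python
def backup_capacity_tb(production_tb, full_copies, incrementals, daily_change_rate):
    """Estimate backup capacity for a simple full + incremental scheme.

    Assumes each full backup is the size of production data and each
    incremental is production size times the daily change rate.
    """
    fulls = production_tb * full_copies
    incs = production_tb * daily_change_rate * incrementals
    return fulls + incs


# Hypothetical figures, for illustration only.
production = 2.0  # TB of production data
total = backup_capacity_tb(
    production,
    full_copies=4,           # weekly fulls kept for four weeks
    incrementals=24,         # daily incrementals kept alongside them
    daily_change_rate=0.05,  # 5% of the data changes each day
)
print(f"{total:.1f} TB of backup storage for {production:.1f} TB "
      f"of production data ({total / production:.1f}x)")
```

With these assumed figures, the estimate works out to 10.4 TB of backup storage, or roughly 5.2 times the production data, before any replication is taken into account.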
Reducing storage costs
Without trying to trivialize the effort, the first step is to start identifying what data the organization has in storage. This is an inventory effort in which all areas of the business must participate. The storage department cannot identify data much beyond reporting on file types, size, ownership, the systems or applications that access the data and the last access date. What the data is actually used for must be communicated by the end users.
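As a minimal sketch of the storage-side part of that inventory, the Python below walks a directory tree and records file type, size, owner ID and last access time, then totals them by file type. The function names, the one-year "stale" threshold and the idea of grouping by extension are assumptions made for illustration, not a prescribed method. Note also that many file systems are mounted with options that make last access times only approximate, so treat the stale figures as a starting point for conversations with data owners.

```python
import os
import time
from collections import defaultdict
from pathlib import Path


def inventory(root):
    """Walk `root` and return one record per file that can be examined."""
    records = []
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            path = Path(dirpath) / name
            try:
                st = path.stat()
            except OSError:
                continue  # unreadable or vanished file; skip it
            records.append({
                "path": str(path),
                "ext": path.suffix.lower() or "(none)",
                "size": st.st_size,
                "uid": st.st_uid,      # numeric owner ID (0 on Windows)
                "atime": st.st_atime,  # last access time, seconds since epoch
            })
    return records


def summarize_by_type(records, stale_days=365, now=None):
    """Total file count and bytes per extension, plus bytes not accessed
    within `stale_days` (a hypothetical threshold)."""
    now = time.time() if now is None else now
    cutoff = now - stale_days * 86400
    summary = defaultdict(lambda: {"files": 0, "bytes": 0, "stale_bytes": 0})
    for r in records:
        s = summary[r["ext"]]
        s["files"] += 1
        s["bytes"] += r["size"]
        if r["atime"] < cutoff:
            s["stale_bytes"] += r["size"]
    return dict(summary)
```

A report like this only says what exists and when it was last touched. It cannot say what the data is for, which is why the end users still have to weigh in.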
Once the data is inventoried (and this is no small task), it is up to the data owners and users to indicate how critical that data is to their daily activities, or whether it is still needed at all. If the data is no longer used and can be disposed of, it should be deleted. If the data is no longer accessed but must be retained, it should be taken out of the costly daily data management loop (mirroring, backups, monitoring, virtualization and so on).
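The decision rule in the preceding paragraph can be written down as a tiny sketch. The `DataSet` fields and the `Action` names are hypothetical. In practice, the two yes/no answers come from the data owners and from any applicable retention requirements, not from software.

```python
from dataclasses import dataclass
from enum import Enum


class Action(Enum):
    KEEP_ACTIVE = "keep in the daily data management loop"
    ARCHIVE = "retain, but remove from the daily data management loop"
    DELETE = "delete"


@dataclass
class DataSet:
    name: str
    still_used: bool   # answered by the data owners and users
    must_retain: bool  # e.g., a legal or business retention requirement


def decide(ds):
    """Apply the rule: data in use stays active, unused data that must be
    retained is archived, and unused data with no requirement is deleted."""
    if ds.still_used:
        return Action.KEEP_ACTIVE
    if ds.must_retain:
        return Action.ARCHIVE
    return Action.DELETE
```

For example, `decide(DataSet("2003 invoices", still_used=False, must_retain=True))` returns `Action.ARCHIVE`.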
This is where it becomes necessary to make end users aware of the financial implications of their decision to keep data. It is too easy to avoid making decisions by taking a "you never know when you will need it" approach and keeping everything.
Many SMBs tend not to act because they believe their data set is still manageable. That may be true for now, but with new business records created daily, how long will your storage environment remain manageable?
There are a number of products available that provide data archiving or hierarchical storage management capabilities by moving data to tape or other media, but they all have one thing in common: Software will not make the decisions for you. It will help you apply policies, but it will not define them.
It must be noted that this approach is not meant to systematically reduce the amount of data stored, as there might be laws regulating its retention, but rather to reduce the amount of data that is subject to costly storage-related processing. Companies can't prevent the creation of new data or transactional records, but gains will come from making decisions about data that is no longer needed or used on a regular basis.
Of course, this shouldn't be considered a one-time exercise. To keep the effort from being wasted, the newly established policies or retention guidelines for existing data must also be applied to new data as it is created; otherwise, the same exercise will have to be repeated down the road. And, as much as the term has been overused over the last few years, this is also the foundation for information lifecycle management.
About the author: Pierre Dorion is the Data Center Practice Director and a Senior Consultant with Long View Systems Inc. in Phoenix, AZ, specializing in the areas of Business Continuity and Disaster Recovery Planning Services and Corporate Data Protection. Over the past 10 years, he has focused primarily on the development of Recovery Strategies, IT Resilience and Recoverability as well as data protection and availability engagements at the Data Center level.