Data virtualization, the practice of working with virtual instances of data rather than additional physical copies, is perhaps the most interesting means of gaining efficiency across the entire spectrum of IT operations. People often overlook that data is the reason the infrastructure exists in the first place. Solving inefficiencies directly at the data layer will therefore often yield the greatest return on effort and investment. Deduplication, thin provisioning and snapshots are examples of techniques that dramatically reduce or eliminate the physical costs associated with data while continuing to provide unfettered "virtual" access.
Consider data deduplication. Data is copied time and time again, often for perfectly valid reasons, pushing the capacity limits of every infrastructure and taxing operating processes. That duplication compounds the burden on the backup, recovery and disaster recovery (DR) processes we conduct: by default, most backup jobs regularly create full images of data that may already have been duplicated dozens of times. Eliminating that duplicate data pays off throughout an entire organization.
For example, by backing up to a disk target with deduplication, IT gets an immediate primary benefit of speed, as disk enables a tremendous performance gain, along with a dramatic reduction in the physical capacity consumed.
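The mechanics behind this are easy to sketch. The Python below is a minimal illustration, not how any particular product works: the fixed 4 KB block size and SHA-256 fingerprinting are assumptions (commercial deduplication typically uses variable-size chunking). Each unique block is stored once; repeat backups of mostly unchanged data add almost nothing to the physical store.

```python
import hashlib

BLOCK_SIZE = 4096  # assumed fixed-size chunking for illustration


def deduplicate(data: bytes, store: dict) -> list:
    """Split data into blocks, keep each unique block once, return a recipe of hashes."""
    recipe = []
    for i in range(0, len(data), BLOCK_SIZE):
        block = data[i:i + BLOCK_SIZE]
        digest = hashlib.sha256(block).hexdigest()
        if digest not in store:      # only new, unique blocks consume physical space
            store[digest] = block
        recipe.append(digest)
    return recipe


def rehydrate(recipe: list, store: dict) -> bytes:
    """Reassemble the original data from its recipe: unfettered 'virtual' access."""
    return b"".join(store[d] for d in recipe)


store = {}
backup1 = b"A" * 8192 + b"B" * 4096   # first full backup image (12 KB)
backup2 = b"A" * 8192 + b"C" * 4096   # next full backup: mostly unchanged
r1 = deduplicate(backup1, store)
r2 = deduplicate(backup2, store)
# Two 12 KB "full" backups, yet only three unique 4 KB blocks are physically stored.
```

Both backups remain fully recoverable from their recipes, but the second full image costs only one new block of physical capacity.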
The retention requirements mandated for an email record by law or regulation are fairly easy to understand. It's a tougher task to determine the most effective and economical means of complying with them. The easiest way to comply is to apply the rule to the data, put it somewhere and never move it; that way, you can always find it when you need it. But that practice is often at odds with efficiency and optimization. During its early life, an email may require a level of availability it simply won't need after a certain point; for example, once it becomes a fixed (persistent) data object that will never change. It then makes more sense to house that email on the most cost-effective infrastructure platform attainable, which most likely isn't the originating platform. Just because retention and immutability are mandated doesn't mean that object is relegated to inefficient treatment forever.
The same benefits are realized and magnified as we apply this logic to "non-mandated" data. Every data object, regardless of form, will eventually become a persistent, non-changing asset that will be infrequently accessed. Data in this stage, which represents the overwhelming majority of corporate data being managed, has radically different attribute requirements from when it was active and dynamic. Generally, it's safe to say that whether considered as data within an archive or simply data in a lower tier of infrastructure, once it stops changing and being heavily accessed, we can apply the same efficiency logic to it. By deduplicating this data, we can more easily and efficiently protect it, access it, secure it and store it without requiring many of the superhuman efforts our IT staffs are currently forced to provide.
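The logic above amounts to a simple age-based policy. The sketch below is illustrative only: the 90-day threshold and the tier names are assumptions, not anything mandated by the article, and a real implementation would also weigh access frequency and compliance rules.

```python
from datetime import datetime, timedelta

# Hypothetical policy: after 90 days without modification, treat an object as
# persistent and house it on a cheaper, deduplicated archive tier.
PERSISTENCE_AGE = timedelta(days=90)


def choose_tier(last_modified: datetime, now: datetime) -> str:
    """Pick a storage tier for a data object using a simple age-based rule."""
    if now - last_modified >= PERSISTENCE_AGE:
        return "archive"   # fixed, rarely accessed: cost-effective platform
    return "primary"       # active, changing: fast (and pricier) platform


now = datetime(2009, 4, 1)
recent = choose_tier(datetime(2009, 3, 25), now)   # still active -> "primary"
stale = choose_tier(datetime(2008, 11, 1), now)    # persistent -> "archive"
```

The point is not the specific threshold but that the decision can be automated: once an object stops changing, nothing about its mandated retention requires it to stay on the originating platform.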
Less physical data living in and moving over our plumbing means we can have more virtual instances of that data in even more areas of our business, which, in turn, drives even more value. This is a positive cycle that feeds off a basic truth: Less is better than more.
BIO: Steve Duplessie is founder and senior analyst at Enterprise Strategy Group. See his blog at http://esgblogs.typepad.com/steves_it_rants/.
This was first published in April 2009