All-flash arrays are quickly working their way into data centers of all sizes. Their primary appeal is the performance they promise, yet they were not widely adopted until deduplication and compression started to become standard features. All-flash array vendors can't stop there.
While many hope for an all-flash data center, the reality is that hard drive-based storage systems will exist for a long time in most data centers. This means all-flash vendors should be aggressively looking to provide seamless data movement as their next critical feature.
We don't need another ILM
As all-flash arrays become more prevalent in the data center, the role of hard disk arrays will change. They will become increasingly capacity-focused and used for protection copies or even inactive data. Their features will evolve to focus on data retention, data integrity and data discovery. The challenge is how storage managers will migrate data to and from these systems.
Almost a decade ago, there was a lot of focus around a set of technologies that would deliver what was known as information lifecycle management (ILM). It failed for a variety of reasons:
- The price difference between high-capacity disk and high-performance disk was relatively small.
- ILM tools were primarily a bolt-on process, and integration with operating systems and applications was a challenge.
- IT staffs were stretched too thin to effectively implement and manage these environments, since data owners were largely unable or unwilling to keep data classification current.
While IT organizations are still understaffed, there is a significant price difference between flash storage and high-capacity hard disk drives. And this difference is increasing -- flash is not closing the gap with high-capacity drives. As a result, there is a price motivation to implement an intelligent tiering strategy. Archive products have also evolved significantly in the last 10 years, and vendors are beginning to integrate these data movement capabilities into their systems. This level of automation makes data movement more seamless and manageable for IT staffs.
Data movement issues
The problem with most current data movement technologies is that movement typically happens within a single system, most often a hybrid array that mixes flash with hard disk. These systems automatically move data between the flash and hard disk tiers. I've even seen systems that move data between different types of flash storage, and I believe there will be more.
But data movement must occur between systems, not just within the same system. This allows the primary storage system, which will be all-flash, to be performance-focused without concern about capacity. Similarly, the hard drive system can be designed for capacity, cost reduction and sophisticated data retention capabilities.
Forms of cross-system data movement
This cross-system data movement can come in three forms:
Virtual machine (VM) migration across storage systems. This is the easiest form and it is available today. Its advantage is that the movement of the VM is storage-system agnostic. The all-flash array can be from a new startup, while the hard disk system could be a legacy storage array or a purpose-built capacity array.
The problem with this option is that the capability, for now, is very hypervisor-specific. The data being moved and the storage systems involved would have to first be virtualized and then all be under the control of one hypervisor. Since most studies show companies employing mixed hypervisor environments and a continuation of dedicated servers for some applications, this approach for data movement may not scale.
Storage system level. With this capability, the data from storage system A can be seamlessly migrated to storage system B. This approach provides both application and hypervisor independence, but typically requires that systems A and B come from the same vendor. Also, only a few vendors have a capability like this, although many software-defined storage products should be able to provide vendor-agnostic volume movement.
These types of cross-system data movement also typically have to migrate data at the LUN or volume level. For many environments, this lack of sub-volume granularity is not as big a problem as you might think. However, it can be a problem for certain environments, such as large unstructured data pools.
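To see why sub-volume granularity matters for mixed pools, here is a minimal Python sketch. It assumes a volume can move to the capacity tier only when every file on it is cold, and it compares that with moving cold files individually. The `FileInfo` type, the volume names, the sizes and the 90-day idle threshold are all hypothetical values chosen for illustration, not measurements from any real system.

```python
from dataclasses import dataclass


@dataclass
class FileInfo:
    volume: str      # hypothetical volume name
    size_gb: float   # hypothetical size in GB
    days_idle: int   # days since the file was last used


def movable_by_volume(files, idle_days):
    """GB that can leave flash if a volume moves only when ALL its files are cold."""
    by_volume = {}
    for f in files:
        by_volume.setdefault(f.volume, []).append(f)
    return sum(
        sum(f.size_gb for f in members)
        for members in by_volume.values()
        if all(f.days_idle >= idle_days for f in members)
    )


def movable_by_file(files, idle_days):
    """GB that can leave flash if each cold file can move on its own."""
    return sum(f.size_gb for f in files if f.days_idle >= idle_days)


# Illustrative inventory: one active volume, one fully cold volume,
# and one shared volume where a small active file sits next to cold data.
files = [
    FileInfo("vol_db", 400.0, 0),
    FileInfo("vol_db", 100.0, 2),
    FileInfo("vol_archive", 800.0, 200),
    FileInfo("vol_archive", 200.0, 365),
    FileInfo("vol_share", 50.0, 1),
    FileInfo("vol_share", 950.0, 400),
]

print(movable_by_volume(files, 90))  # only vol_archive qualifies
print(movable_by_file(files, 90))    # cold files on vol_share qualify too
```

In this made-up inventory, the single active file on `vol_share` keeps 950 GB of cold data on flash under volume-level movement. That is the situation large unstructured data pools tend to create.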
Unstructured data. File-level movement is important for organizations that need to keep their file systems on flash storage for a period of time. The increased granularity does create some overhead and management issues, so these organizations could be better off looking for a NAS product that can integrate flash, disk and potentially tape or cloud. Again, if the data movement is built into the file system, the results are generally more seamless and create less IT operational overhead.
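As a rough illustration of file-level movement, the Python sketch below walks a directory tree and lists files that have not been read or modified within a given number of days. Those files are the candidates a tiering process might move from flash to a capacity tier. The function name `find_cold_files` and the idle threshold are assumptions for this example, not part of any vendor's product. Access times can also be unreliable on file systems mounted with options such as `noatime`, so treat the result as a starting point only.

```python
import os
import time


def find_cold_files(root, idle_days, now=None):
    """Return (path, size_in_bytes) for files idle longer than idle_days."""
    now = time.time() if now is None else now
    cutoff = now - idle_days * 86400  # seconds in a day
    cold = []
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            path = os.path.join(dirpath, name)
            try:
                info = os.stat(path)
            except OSError:
                continue  # file vanished or is unreadable; skip it
            # Treat the later of last access and last modification as "last used".
            last_used = max(info.st_atime, info.st_mtime)
            if last_used < cutoff:
                cold.append((path, info.st_size))
    return cold
```

A scan like this only identifies candidates. Actually relocating the files, and leaving them reachable at their original paths, is the part that benefits from being built into the file system or NAS product.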
The all-flash challenge
The challenge for all-flash vendors is to deliver one of these forms of data movement as soon as possible. But this would require two things:
- They would have to admit that hard disks will exist in the data center.
- They would need to develop and implement the feature. For vendors that sell only all-flash systems, this probably means developing code that will allow platform-agnostic data movement.
It may seem counterproductive for all-flash vendors to develop this technology, as they would rather you just buy more flash capacity. However, I think vendors will find that all-flash array adoption will increase if they provide customers with a seamless way to move data. And users will be able to more fully utilize their flash arrays by moving data in and out of the system at will.
About the author:
George Crump is president of Storage Switzerland, an IT analyst firm focused on storage and virtualization.