Those technologies were/are:
- Object model data management
- Fabric-based intelligence
- Packet/block level virtualization
The other driving factor for these initiatives is the ability to put the sales slogans we all heard in 2004 around "data lifecycle management" into practice. Sure, there was plenty of cool software around for policy-based management, but how do you actually migrate data online to different storage tiers without the host being aware that it happened? The devil is in the details, and actually moving data around between different classes of heterogeneous storage devices is not a trivial matter. There are a multitude of operating systems, applications, filter drivers, firmware versions, etc. that need to be thought about. Then there are timing differences between unlike disks, response times from various arrays with differing cache sizes, SLA requirements of applications being voided by device mixtures in storage pools (like ATA, SATA and FC disks in the same pool), and a whole host of other issues that are too numerous to list here.
The reason my timing was off in my 2004 predictions is that actually doing this stuff is hard. The good news is that in 2005, we should see more real products coming out, not just from startup companies, but from all the major vendors, that tackle these issues head on.
On the object model data management front, the software is already here, and it's getting much better than it was at the beginning of 2004. The SNIA standards for SMI-S are now beginning to take form, and you are seeing the results in products from almost all the software vendors in the storage management space. 2005 will bring better integration of products, and the ability to actually classify devices by their capabilities (thus solving the problem of disparate devices in the same pool). The problems are being solved by stronger partnerships between software and hardware companies to write the required CIM (Common Information Model) compliant databases needed to discover, manage and control their equipment. The operating system software vendors are also working hard to get their file systems to support more robust metadata about the information stored within the file system. With more information about the information we are storing, we are able to create better policies that affect that data.
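To make the idea of classifying devices by capability a little more concrete, here is a minimal sketch in Python. It is purely illustrative and is not an SMI-S or CIM interface: the `Device` and `Policy` classes, the `eligible_devices` function, the device names and the latency figures are all made up for the example. It only shows how a policy can be matched against measured device properties instead of against a specific box.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Device:
    """A storage device described by its capabilities (hypothetical fields)."""
    name: str
    disk_type: str      # e.g. "FC", "SATA", "ATA"
    latency_ms: float   # assumed average response time from a benchmark
    redundant: bool     # whether the device protects data internally


@dataclass(frozen=True)
class Policy:
    """What a class of data requires from the storage it lands on."""
    name: str
    max_latency_ms: float
    needs_redundancy: bool


def eligible_devices(policy, devices):
    """Return the devices that satisfy the policy, fastest first."""
    matches = [
        d for d in devices
        if d.latency_ms <= policy.max_latency_ms
        and (d.redundant or not policy.needs_redundancy)
    ]
    return sorted(matches, key=lambda d: d.latency_ms)


# Illustrative inventory; the numbers are invented, not measured.
DEVICES = [
    Device("array-a", "FC", 5.0, True),
    Device("array-b", "SATA", 12.0, True),
    Device("shelf-c", "ATA", 20.0, False),
]

oltp = Policy("oltp", max_latency_ms=8.0, needs_redundancy=True)
archive = Policy("archive", max_latency_ms=50.0, needs_redundancy=False)

if __name__ == "__main__":
    for policy in (oltp, archive):
        names = [d.name for d in eligible_devices(policy, DEVICES)]
        print(policy.name, "->", names)
```

The point of the sketch is that the policy never names a device. Mixing ATA, SATA and FC in one pool stops being a problem once each device carries a description of what it can actually deliver.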
On the fabric-based intelligence front, my predictions were almost on target. I was right in the sense that we started to see products from the switch vendors, in partnership with the storage and software vendors, that brought virtual pooling of storage devices, fabric-level replication, storage appliance-based virtualization and even a couple of global file system (GFS) solutions. Where I was wrong is that these solutions were not just fabric-based, but also controller-based. Heck, I even work for the company that introduced the "Universal Storage Platform."
In 2005, I expect we will finally see a convergence of all this intelligence. In my opinion, there is a place for intelligence at the host level, the fabric level and the storage level. Let's face it, some things are done better closer to the place that needs them the most. RAID is done better at the storage hardware level, and global file systems are done better at the host level or just below it. Storage pooling can be done at all three levels. Some companies are betting the ranch on "fabric" (core) based intelligence, and others on "controller" (edge) based intelligence. I tend to like the solutions that provide the least latency for my data path, are open and are simple to implement. The idea is to let the end user choose which method applies best to their situation, and ultimately the market will decide which is the better approach.
On the packet/block level virtualization front, I think I hit the mark with my 2004 predictions. Blade-based core SAN switches really made an impact in 2004. We saw products from the switch vendors that offered SAN interconnection protocols such as FCIP and iFCP as intelligent blades within the switch itself. iSCSI was also introduced as a native switch protocol, and in the higher-end storage arrays and high-end NAS appliances, we saw the inclusion of not only FCP, but iSCSI, CIFS, NFS, and even native IP for WAN connections for data replication. Some vendors also included native DAFS (Direct Access File System) protocol capabilities.
All of these advancements have been making possible what I have been calling "programmable storage," but I think the real buzz in 2005 will be based on what the programmable storage paradigm makes possible, and that is "throw away storage."
We have all heard the term "utility computing." In fact, you can go to any of the major server vendors today and just buy "compute cycles" based on what you normally use to run your business applications month by month. This is just like the way we buy electricity. Are electrical power plants complex and expensive and a real pain when things go wrong? Sure. (Take the east coast blackout last summer for instance!) Do you need to know how it all works when you plug in an appliance? Nope. You most likely only care when you get your electric bill, and even then, only if the bill was too high.
This is the same concept we are getting to in the storage industry. I'm not talking about outsourcing storage to some SSP (storage service provider), although that could be an option. I'm talking about a storage infrastructure that is getting very complex, but is managed through intelligent software and devices in a way that allows you to take advantage of the economics of the electronics industry, which is one of the only places where as things get better, they also get cheaper.
The idea is to use inexpensive storage in conjunction with very reliable storage as backup, and layer on top of that the intelligence required to make data available to any host that needs it from any location (perhaps using distributed and virtualized file systems), to provide a cost-effective storage utility. If something breaks somewhere, just throw it out and replace it with a new one, just like a light bulb. The data is already automatically backed up to several places. Once the device is replaced, the infrastructure heals itself, automatically. User policies control everything. The really expensive storage is also a part of this environment, but is only used by the data that requires it by SLA or performance. Legacy mainframe application data can be an example here.
In 2005, you will hear much more buzz about DAFS, InfiniBand, RDMA, exotic file systems, parallel NFS and storage grids. These are some of the technologies that will help glue the infrastructure together.
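For readers who like to see the mechanics, the "throw it out and replace it" idea can be sketched in a few lines of Python. This is a toy model under stated assumptions, not how any shipping product works: the `StoragePool` class, its method names and the device names are hypothetical, only block placement is tracked (no real data moves), and the policy is reduced to a single number, the copies each block must have.

```python
class StoragePool:
    """Toy pool that keeps a fixed number of copies of every block."""

    def __init__(self, devices, copies=2):
        if copies > len(devices):
            raise ValueError("need at least as many devices as copies")
        self.copies = copies
        # Maps device name -> set of block ids stored on that device.
        self.placement = {d: set() for d in devices}

    def _least_loaded(self, exclude):
        """Pick the emptiest device not in `exclude` (ties broken by name)."""
        candidates = [d for d in self.placement if d not in exclude]
        return min(candidates, key=lambda d: (len(self.placement[d]), d))

    def write(self, block_id):
        """Place `copies` replicas of a block on distinct devices."""
        holders = set()
        while len(holders) < self.copies:
            target = self._least_loaded(holders)
            self.placement[target].add(block_id)
            holders.add(target)

    def holders(self, block_id):
        """Return the set of devices that currently hold the block."""
        return {d for d, blocks in self.placement.items() if block_id in blocks}

    def replace(self, failed, new_device):
        """Discard a failed device, add a new one, and restore lost copies."""
        lost = self.placement.pop(failed)
        self.placement[new_device] = set()
        for block_id in sorted(lost):
            survivors = self.holders(block_id)
            if not survivors:
                raise RuntimeError(f"no surviving copy of {block_id}")
            # Re-create the missing replica on a device that lacks the block.
            target = self._least_loaded(survivors)
            self.placement[target].add(block_id)


if __name__ == "__main__":
    pool = StoragePool(["d1", "d2", "d3"], copies=2)
    for block in ("b0", "b1", "b2"):
        pool.write(block)
    pool.replace("d2", "d4")  # d2 "burns out" like a light bulb
    print({b: sorted(pool.holders(b)) for b in ("b0", "b1", "b2")})
```

The cheap devices are disposable because no single one of them is the only home for a block. The copy count stands in for the user policy, and the healing step runs without anyone deciding where the data should go.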
What I said in summary in 2004 still applies today:
"The combination of intelligent fabrics, file systems, storage arrays and application-aware management software will let you create 'programmable storage'. XML metadata tags that describe data can be stored with it. Policies can be created to manage different data types. Storage can be classified by properties, such as performance or reliability. Methods can then be invoked through the management software (hardware-based snapshots will be a method in arrays).
"The advent of programmable storage will enable a complete paradigm shift in how IT departments handle data resources. The hard work (which is policy definition and creation) needs to be done by you, the data creator. So start defining your policies today. Begin by benchmarking your storage arrays to get the performance metrics that can be included in your policies for data placement. You will need to take a closer look at your current business processes, to make sure they tie in seamlessly with your data management policies. The end result is a completely automated data center that can be monitored via a single console, and managed by your applications' needs rather than the abilities of specific components."
Happy New Year!
About the author: Christopher Poelker is a storage architect at Hitachi Data Systems. Prior to Hitachi, Chris was a lead storage architect/senior systems architect for Compaq Computer, Inc., in New York. While at Compaq, Chris built the sales/service engagement model for Compaq StorageWorks, and trained most of the company's VARs, channels and Compaq ES/PS contacts on StorageWorks. Chris' certifications include: MCSE, MCT (Microsoft Trainer), MASE (Compaq Master ASE Storage Architect), and A+ certified (PC Technician).
This was first published in December 2004