I hear a lot about storage virtualization. There are three kinds: host-based, storage-based and switch-based (in-band and out-of-band) virtualization techniques.
Can you tell me the real differences between the three approaches? What kind of host-based products are there? Which disk arrays do you consider storage-based virtualization products, and do you find in-band a better approach than out-of-band?
I'm looking forward to your answer.
Great question. Block storage virtualization can occur anywhere in the I/O path after the file system has resolved the data into block locations. This can happen in the host (volume managers), in the subsystem (RAID controllers) or in a network device of some sort (SAN virtualization).
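To make "resolving block locations" concrete, here is a minimal sketch of what any of these layers does at its core: translating a logical block address on a virtual volume into a device and a physical block. It assumes a simple concatenated volume; the names `Extent`, `VirtualVolume`, `resolve`, `diskA` and `diskB` are hypothetical and not taken from any product.

```python
from bisect import bisect_right
from dataclasses import dataclass


@dataclass(frozen=True)
class Extent:
    device: str       # hypothetical device name, e.g. "diskA"
    start_block: int  # first physical block of this extent on the device
    length: int       # number of blocks in the extent


class VirtualVolume:
    """A concatenated virtual volume built from physical extents (illustrative only)."""

    def __init__(self, extents):
        self.extents = list(extents)
        self.offsets = []  # logical block at which each extent begins
        total = 0
        for extent in self.extents:
            self.offsets.append(total)
            total += extent.length
        self.size = total  # total logical blocks presented to the host

    def resolve(self, logical_block):
        """Map a logical block number to (device, physical block)."""
        if not 0 <= logical_block < self.size:
            raise IndexError("logical block outside the virtual volume")
        # Find the last extent whose starting offset is <= logical_block.
        index = bisect_right(self.offsets, logical_block) - 1
        extent = self.extents[index]
        return extent.device, extent.start_block + (logical_block - self.offsets[index])


volume = VirtualVolume([Extent("diskA", 100, 50), Extent("diskB", 0, 200)])
print(volume.resolve(49))  # last block that lives on diskA
print(volume.resolve(50))  # first block that lives on diskB
```

Whether this lookup runs in a host volume manager, a RAID controller or a SAN device, the mapping idea is the same; what differs is where the table lives and who manages it.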
Here's my take: Volume managers and RAID controllers (even mirroring controllers) have been around a long time. They work and I tell anybody who will listen that these are perfectly good virtualization implementations.
Volume management licenses tend to be expensive and limited to a single host. RAID controllers come in a wide variety of sizes and prices, but they do not extend beyond the storage capacity of a single subsystem.
Hence, the excitement about SAN virtualization. I'm a fan of out-of-band virtualization and not a big fan of in-band virtualization. Out-of-band virtualization places the virtualization "lens" in the host, where it has the most leverage over the entire network. However, the management and deployment of the virtualization software is controlled by a SAN entity. It's like having system-resident volume managers, but without the host-based licensing and with centralized deployment and control in the SAN.
In-band virtualization can work with some number of host systems and some number of subsystems. The capacity is not clear to me, but it's certainly not infinite. The big problem with in-band virtualization is the addition of an extra hop between servers and storage. Even in a switch, the I/O must be processed by an internal function that has its own network address. The I/O terminates, is processed and is then retransmitted. This extra step introduces latency and is one more thing that could fail. There is no way that in-band virtualization does not add latency. Some would say that caching can overcome this, but caching is application dependent and there is no way of knowing what the efficiency of the virtualization engine's cache would be.
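The caching argument can be put into rough arithmetic. The sketch below is a deliberately simple model, not a measurement: it assumes every I/O pays a fixed extra-hop cost, cache hits are served at cache speed, and misses go on to the back-end storage. The function names and all of the numbers are hypothetical.

```python
def in_band_latency_ms(backend_ms, hop_ms, cache_ms, hit_ratio):
    """Average per-I/O latency through an in-band engine (simplified model)."""
    if not 0.0 <= hit_ratio <= 1.0:
        raise ValueError("hit_ratio must be between 0 and 1")
    # Every I/O pays the hop; hits cost cache_ms, misses cost backend_ms.
    return hop_ms + hit_ratio * cache_ms + (1.0 - hit_ratio) * backend_ms


def break_even_hit_ratio(backend_ms, hop_ms, cache_ms):
    """Hit ratio at which the in-band average equals going direct to storage."""
    if backend_ms <= cache_ms:
        raise ValueError("cache must be faster than back-end storage")
    # Solve hop + h*cache + (1 - h)*backend = backend for h.
    return hop_ms / (backend_ms - cache_ms)


# Assumed figures: 5 ms back-end access, 0.5 ms extra hop, 0.1 ms cache hit.
print(in_band_latency_ms(5.0, 0.5, 0.1, 0.0))  # no cache hits: hop cost is pure overhead
print(break_even_hit_ratio(5.0, 0.5, 0.1))     # hit ratio needed just to break even
```

In this model every cache miss is slower than going direct, and the average only improves once the hit ratio passes the break-even point. Since the hit ratio depends on the application's access pattern, it cannot be known in advance, which is the point made above.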
By the way, the arguments about a security advantage with in-band are bogus. There is no difference. All storage is exposed on the SAN the exact same way, and there is no way to force a renegade system or hacker to go through a virtualization engine. Eventually there will be stronger security technologies in SANs, but they will not come as a result of in-band processes. They will be established between end points in the SAN.
I believe a good strategy is to use virtualization in the host through volume management or out-of-band virtualization and to also use it in storage subsystems for redundancy and performance reasons. This delivers excellent host storage scalability with the least amount of confusion.
Editor's note: Do you agree with this expert's response? If you have more to share, post it in one of our discussion forums.
This was first published in June 2004