This is the last installment of a four-part series on block-based storage virtualization technology. In the first story, we explained why IT departments would want to implement storage virtualization. In the second, we discussed how it's implemented at the server level. In the third, we explained how it's implemented in the storage array. In this story, we examine network-based storage virtualization appliances.
Because most storage virtualization use cases involve some measure of storage consolidation, putting the storage virtualization engine onto the network makes sense: the storage systems being consolidated are typically connected to the network as well. As a primarily software solution, network-based implementations can run on dedicated commodity hardware or be embedded into switches. This removes the processing overhead from the host CPU and eliminates the need to purchase a new array just to get heterogeneous storage virtualization.
A network-based storage virtualization appliance can be either in-band or out-of-band, depending on whether the virtualization engine sits in the data path. Out-of-band solutions typically run on the server or on an appliance in the network and process only the control traffic, routing I/O requests to the proper physical locations, but don't handle data traffic. This can mean lower latency than in-band storage virtualization, and because data doesn't pass through the virtualization engine, a failure of the engine is also less disruptive.
In-band solutions intercept I/O requests from hosts, map them to physical storage locations and regenerate those requests to the storage systems on the back end. Because the virtualization engine handles both control traffic and data, it needs enough processing power and internal bandwidth to avoid adding too much latency to the I/O path for host servers.
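To make that virtual-to-physical mapping concrete, here is a minimal Python sketch. The extent size, array names and mapping table are all hypothetical, chosen only for illustration; this is not any vendor's implementation, just the general idea of splitting one host request against a virtual volume into requests to back-end arrays.

```python
# Illustrative sketch of in-band I/O mapping: a virtual volume is carved
# into fixed-size extents, each backed by an extent on some physical array.

EXTENT_SIZE = 1024  # blocks per extent (assumed granularity)

# Virtual extent index -> (back-end array, physical extent index) -- all hypothetical
extent_map = {
    0: ("array_a", 7),
    1: ("array_b", 3),
    2: ("array_a", 8),
}

def map_io(virtual_lba, length):
    """Split a host request into one back-end request per extent it touches."""
    requests = []
    lba, remaining = virtual_lba, length
    while remaining > 0:
        extent = lba // EXTENT_SIZE
        offset = lba % EXTENT_SIZE
        chunk = min(remaining, EXTENT_SIZE - offset)  # stop at extent boundary
        backend, phys_extent = extent_map[extent]
        requests.append((backend, phys_extent * EXTENT_SIZE + offset, chunk))
        lba += chunk
        remaining -= chunk
    return requests

# A 1,536-block request starting at virtual LBA 512 spans two extents
# and so becomes two back-end requests, one per array:
print(map_io(512, 1536))
# → [('array_a', 7680, 512), ('array_b', 3072, 1024)]
```

The same table lookup is what an out-of-band engine performs; the difference is that it returns the mapping to the host-side agent rather than forwarding the data itself.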
At one point, about 10 years ago, the debate over which method was better -- in-band or out-of-band -- was a lively one. This was around the time the information (or data) lifecycle management (ILM/DLM) concept was very popular. Storage virtualization was touted as the technology that would allow tape to be integrated into primary storage as a cost-saving measure. But as has been the case with other new ideas, the relentless pressure of ever-cheaper disk storage killed ILM as a viable primary use case for storage virtualization.
The network-based implementation of this technology survived largely by being bundled with other applications, such as data migration, data protection and disaster recovery (DR). But with the abundance of CPU power and powerful commodity server hardware, the biggest disadvantages of in-band storage virtualization have been resolved. Today, the in-band storage virtualization appliance is arguably the most popular implementation of storage virtualization technology. Let's take a look at the vendors and products in this market.
DataCore’s SANsymphony is a software solution that runs on commodity x86 servers and supports storage devices from most major storage manufacturers via Fibre Channel, Fibre Channel over Ethernet (FCoE) or iSCSI. Multiple storage nodes can be clustered to scale capacity and to provide high availability. SANsymphony connects storage to hosts via FC or iSCSI and provides a full array of storage services, including disk pooling (consolidation), synchronous mirroring, remote replication, continuous data protection, thin provisioning, snapshots, tiered storage and file sharing.
FalconStor's Network Storage Server (NSS) is a 2U appliance that connects to heterogeneous storage systems via iSCSI, FC or InfiniBand. Capacity expansion and high availability are provided by connecting multiple controller modules. Like DataCore's SANsymphony, the NSS provides a range of storage services for physical and virtual environments, including synchronous mirroring, thin provisioning, WAN-optimized replication, snapshots, clones and automatic DR (physical to virtual and virtual to virtual).
IBM's SAN Volume Controller (SVC) is an in-band virtualization controller that connects to heterogeneous storage systems via iSCSI or FC. As many as eight SVC nodes can be clustered to provide high availability and to scale bandwidth and capacity. The system can support up to 32 PB of external storage. Each SVC node supports four internal solid-state drives (SSDs) as a cache, and the system features replication between storage systems for DR or data migration, as well as a mirroring function between local or remote SVC units.
When to use and how to choose
As mentioned above, most network-based storage virtualization appliances are in-band and sold either as hardware appliances or as software that can be installed on commodity servers. This keeps costs down compared with array-based solutions, which require the purchase of a storage array. They're ideal for consolidating a mixed storage environment (provided the existing assets are compatible with the virtualization engine) and can offer maximum flexibility, especially as a software solution.
For a medium-sized company looking for a SAN solution with the flexibility to support multiple existing arrays, for example, a storage virtualization appliance can be an excellent choice. Another use case is adding new functionality to an existing storage infrastructure, such as remote replication for DR or storage tiering. Similarly, a network-based virtualization solution can upgrade the feature set of an existing storage infrastructure, improving management efficiency and reducing cost per terabyte with features such as thin provisioning.
Eric Slack is a senior analyst with Storage Switzerland.