Centralize virtualization at the switch

There are a number of ways you can virtualize your storage, but because a switch-based virtualization engine works out-of-band, there's no need for server agents, making it the most scalable and highest performing of all virtualization architectures.

This article can also be found in the Premium Editorial Download: Storage magazine: Using two midrange backup apps at once.

Storage virtualization can reside in the fabric switch, an appliance or in the array's controller. Each architecture has its pros and cons.

Fabric-based virtualization products haven't been adopted as quickly as inline virtualization appliances like IBM Corp.'s SAN Volume Controller (SVC), but they're one of the most promising storage virtualization solutions. "Fabric-based virtualization is the best technical approach to storage virtualization but it's not taking off," says Jim DeCaires, storage product marketing manager at Fujitsu.

Switch-based virtualization brings many benefits to the SAN fabric. Because the switch-based virtualization engine is out-of-band (out of the data path), there's no need for server agents, and it's the most scalable and highest performing of all virtualization architectures.

Storage virtualization presents physical arrays, from single or multiple vendors, as a single storage pool with the following benefit: storage can be managed as if it were on a single array, from provisioning to advanced features like replication, snapshots and mirroring between the arrays in the pool. To accomplish this, storage virtualization products map virtual volumes to physical devices; whenever a virtualized storage resource is accessed, the virtualization layer translates and redirects storage requests to the designated physical storage according to a mapping table.
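
The mapping-table lookup described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the extent layout, the array names ("array-A", "array-B") and the class names are hypothetical and don't reflect any vendor's actual data structures.

```python
from bisect import bisect_right
from dataclasses import dataclass


@dataclass(frozen=True)
class Extent:
    virtual_start: int   # first virtual block covered by this extent
    length: int          # number of blocks in the extent
    array: str           # hypothetical physical array name
    lun: int             # hypothetical LUN on that array
    physical_start: int  # first physical block on that LUN


class MappingTable:
    """Translates a virtual block address into (array, LUN, physical block)."""

    def __init__(self, extents):
        self.extents = sorted(extents, key=lambda e: e.virtual_start)
        self.starts = [e.virtual_start for e in self.extents]

    def translate(self, virtual_block):
        # Find the last extent that starts at or before the requested block.
        i = bisect_right(self.starts, virtual_block) - 1
        if i < 0:
            raise KeyError(f"block {virtual_block} is not mapped")
        extent = self.extents[i]
        offset = virtual_block - extent.virtual_start
        if offset >= extent.length:
            raise KeyError(f"block {virtual_block} is not mapped")
        return (extent.array, extent.lun, extent.physical_start + offset)


# One virtual volume spread across two arrays (made-up layout).
table = MappingTable([
    Extent(virtual_start=0, length=1000, array="array-A", lun=0, physical_start=5000),
    Extent(virtual_start=1000, length=500, array="array-B", lun=3, physical_start=0),
])
print(table.translate(1200))
```

The server only ever sees virtual block addresses; which array actually holds a block is decided entirely by the table.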

With three primary storage virtualization architectures--in-band appliances, storage controller-based and fabric-based--the location of where virtualization should occur has been hotly debated. Each approach has its pros and cons.


In-band appliances
These products, like switch-based virtualization products, perform virtualization within the network. They're located between arrays and servers, and all storage traffic needs to pass through them. While fabric-based virtualization uses wire-speed switching to map and forward storage frames, in-band virtualization appliances require terminating incoming I/Os and initiating new I/Os to the actual storage target based on the information in the mapping table.

"The process of terminating, re-initiating and verifying I/Os adds significant latency to I/O processing," says Brian Garrett, technical director of Milford, MA-based Enterprise Strategy Group's ESG Lab. To compensate for the overhead and performance penalty of having to spawn new I/Os, products like IBM's SVC depend on cache, which adds the complexity of ensuring data integrity and data consistency in the cache, a problem switch-based virtualization products don't have.

IBM SVC is the most prominent product in this category and, through scalable cluster configurations and plenty of cache, it has greatly reduced the performance and scalability concerns that have plagued in-band virtualization products in the past. Relatively low cost, simplicity and a rich feature set have greatly contributed to in-band virtualization being the most widely deployed virtualization architecture today.

"In-band virtualization products like IBM's SVC or DataCore Software Corp.'s SANsymphony have the lowest entry cost; unlike fabric-based virtualization products, they don't require expensive intelligent switches," explains Greg Schulz, founder and senior analyst at StorageIO Group, a technology analyst and consulting firm in Stillwater, MN. Because products like IBM SVC work with any switch, in-band virtualization appliances have another advantage over fabric-based products like EMC Corp.'s Invista, which only runs on supported switch platforms.

Storage controller-based virtualization
This architecture, championed by Hitachi Data Systems and used in its Universal Storage Platform V (USP V) storage systems, performs virtualization within the storage controller of the array. A non-Hitachi array can be virtualized by simply plugging it into a Fibre Channel (FC) port on the USP V. To third-party arrays, the USP V presents itself as a Windows server; once the third-party array is discovered by the USP V, it appears to other servers as a Hitachi array. For companies that have standardized on Hitachi storage and already own USP arrays, the effort to enable virtualization is minuscule and, unlike switch-based virtualization, relatively inexpensive. "About 50% of our USP V customers purchase a virtualization license, and with 9,200 USP V units sold in the past three and a half years, we have a significantly higher number of virtualization deployments than all fabric-based installations combined," claims Claus Mikkelsen, chief scientist at Hitachi Data Systems.

For users who are using or have standardized on array-based virtualization, vendor lock-in is high, even more so than for fabric-based virtualization. "You wouldn't buy a Hitachi USP V for virtualization if you're an EMC or NetApp shop; but USP V virtualization would be on top of your list if you had standardized on Hitachi storage," says Schulz.

Having the array and virtualization software from a single vendor has the huge benefit of a single point of support. In stark contrast, fabric-based virtualization products, namely those from EMC and Incipient Inc., require the orchestration of three different vendors (array, switch and virtualization software vendors), which clearly carries the risk of finger pointing if problems arise.

Fabric-based virtualization
Fabric-based products map virtual storage to physical storage within the network, more specifically within an FC switch or director. Unlike network-based virtualization appliances like IBM's SVC, switch-based virtualization is typically implemented as a split-path architecture where the data and control paths are separate. With the virtualization logic running outside of the data path, I/Os pass directly through the switch without the speed bump of in-band solutions like IBM's SVC. The control-path software typically runs on CPUs within the switch, and it gets involved only when I/Os need to be redirected, at which point it instructs the switch where to route storage requests.
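
A rough way to picture the split-path idea is a fast forwarding table that is consulted on every I/O, with a separate controller that is called only on exceptions. The sketch below is a deliberate simplification; the class names, the volume and target strings, and the invalidate-on-change behavior are assumptions made for illustration, not a description of how any shipping product is built.

```python
class ControlPath:
    """Slow path: owns the authoritative map and is consulted only on exceptions."""

    def __init__(self, authoritative_map):
        self.authoritative_map = authoritative_map
        self.exceptions_handled = 0

    def resolve(self, virtual_volume):
        self.exceptions_handled += 1
        return self.authoritative_map[virtual_volume]


class DataPath:
    """Fast path: forwards on a table hit and asks the control path on a miss."""

    def __init__(self, control_path):
        self.control_path = control_path
        self.forwarding_table = {}

    def invalidate(self, virtual_volume):
        # Called when a mapping changes, for example after a migration.
        self.forwarding_table.pop(virtual_volume, None)

    def forward(self, virtual_volume):
        target = self.forwarding_table.get(virtual_volume)
        if target is None:
            # Exception case: the control path tells the fast path where to route.
            target = self.control_path.resolve(virtual_volume)
            self.forwarding_table[virtual_volume] = target
        return target


control = ControlPath({"vol1": "array-A:lun0"})
data_path = DataPath(control)

# Ten I/Os to the same volume involve the control path only once.
targets = [data_path.forward("vol1") for _ in range(10)]

# The volume moves; the control path is involved once more, then steps aside.
control.authoritative_map["vol1"] = "array-B:lun7"
data_path.invalidate("vol1")
new_target = data_path.forward("vol1")
```

In this toy model the ordinary case is a single dictionary lookup, which stands in for the wire-speed forwarding the switch hardware performs.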

"In a split-path virtualization architecture, 90+% of the requests pass through the switch at wire speed; only if something special like migrating of data needs to be performed [does] the control-path controller get involved," explains StorageIO Group's Schulz. The separation of the data path and control path, combined with the low latency of switching for translating and forwarding virtualized storage requests, makes fabric-based virtualization the best performing and most scalable virtualization architecture today.

On the downside, switch-based virtualization has the highest level of vendor lock-in of all virtualization approaches. Because the switch is used as the platform to run the virtualization software, it becomes very difficult for users to change switch vendors if they choose to do so. Furthermore, as intelligent switches turn into multitasking platforms, concurrent storage services from the switch vendor and third parties make supporting these switches more challenging.

As long as there are no problems, it's a great concept. But if there are problems with the virtualization software or any of the third-party storage services, the concerted effort of all involved parties may be required. Besides the relatively high cost of intelligent switches, increased complexity and more challenging technical support are among the contributing factors for the cautious adoption of fabric-based virtualization. "In general, storage managers like to keep things simple and tend to go with more self-contained, easier-to-manage solutions like LSI [Corp.'s] StoreAge or even IBM SVC," says Nelson Nahum, who was CTO at StoreAge before it was acquired by LSI.

Without question, the low latency of fabric-based virtualization is a big plus, but doing without cache has a downside: virtualization solutions with cache, like IBM's SVC and Hitachi's USP V, use that cache to increase performance of the back-end storage. As a result, virtualization products with cache encourage the use of lower cost, lower performing storage tiers, with the cache boosting access performance. While the low latency of switch-based virtualization products is great for accessing fast arrays, their lack of cache actually turns into a disadvantage for accessing lower performance arrays. "In switch-based virtualization, back-end disk performance shows unmasked," says StorageIO's Schulz.
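
The masking effect of cache can be made concrete with a simple weighted average. The model and the numbers below are assumptions chosen for illustration (a read cache with a fixed hit ratio and made-up latencies); they aren't measurements of any product.

```python
def effective_latency_ms(hit_ratio, cache_latency_ms, backend_latency_ms):
    """Average read latency when a fraction of reads is served from cache."""
    if not 0.0 <= hit_ratio <= 1.0:
        raise ValueError("hit_ratio must be between 0 and 1")
    return hit_ratio * cache_latency_ms + (1.0 - hit_ratio) * backend_latency_ms


# Hypothetical slow tier: 10 ms per back-end read, 0.1 ms per cache hit.
with_cache = effective_latency_ms(0.8, 0.1, 10.0)
without_cache = effective_latency_ms(0.0, 0.1, 10.0)
print(f"with cache: {with_cache:.2f} ms, without cache: {without_cache:.2f} ms")
```

With no cache in the path, the hit ratio is effectively zero and the server sees the full back-end latency, which is the "unmasked" behavior Schulz describes.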

A second and more profound implication of the stateless nature of switch-based virtualization is the more challenging support of virtualization applications that require information beyond the mapping information. Features like remote replication and thin provisioning require memory to maintain certain state information. For instance, for a 2TB thin-provisioned volume using 100GB of physical storage, information about which 100GB are actually used needs to be maintained. While products like IBM's SVC and Hitachi's USP V maintain this information in memory along with the cache, switch-based virtualization products don't have the luxury of cache memory and their only option is maintaining this information on the SAN. "There's no complete solution for remote snapshots, remote mirroring and thin provisioning in switch-based virtualization products today because they're very difficult to implement without cache," says Fujitsu's DeCaires.
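
To see why thin provisioning needs state beyond the mapping table, consider a minimal allocate-on-first-write sketch of the 2TB example above. The 1GB chunk size, the pool size and the class name are assumptions for illustration only.

```python
class ThinVolume:
    """Allocates physical chunks only when a virtual chunk is first written."""

    def __init__(self, virtual_size_gb, pool_free_gb, chunk_gb=1):
        self.virtual_size_gb = virtual_size_gb
        self.pool_free_gb = pool_free_gb
        self.chunk_gb = chunk_gb
        # This allocation map is the state that must be kept somewhere:
        # virtual chunk index -> physical chunk index.
        self.allocated = {}
        self._next_physical = 0

    def write(self, virtual_gb_offset):
        if not 0 <= virtual_gb_offset < self.virtual_size_gb:
            raise ValueError("offset outside the virtual volume")
        chunk = virtual_gb_offset // self.chunk_gb
        if chunk not in self.allocated:
            if self.pool_free_gb < self.chunk_gb:
                raise RuntimeError("physical pool exhausted")
            self.allocated[chunk] = self._next_physical
            self._next_physical += 1
            self.pool_free_gb -= self.chunk_gb
        return self.allocated[chunk]

    def physical_used_gb(self):
        return len(self.allocated) * self.chunk_gb


# A 2TB (2,048GB) virtual volume backed by a hypothetical 500GB pool.
volume = ThinVolume(virtual_size_gb=2048, pool_free_gb=500)
for offset in range(0, 2000, 20):   # 100 scattered 1GB regions get written
    volume.write(offset)
volume.write(0)                     # rewriting an allocated chunk costs nothing
print(volume.physical_used_gb(), "GB of physical storage in use")
```

Every write has to consult and possibly update the allocation map, so the map must be both fast to reach and durable; that is the information cache-based products keep in memory and switch-based products would have to keep on the SAN.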


Virtualization approaches compared

Click here for a comparison of virtualization approaches (PDF).

Switch-based platforms
Fabric-based virtualization products are offered by the following vendors.

EMC Invista: Invista is the most prominent fabric-based virtualization product. While other vendors support only a single switch platform, Invista runs on Cisco Systems Inc.'s MDS, as well as Brocade switches and directors. On Cisco switches, Invista requires and runs on the Cisco MDS 9000 Storage Services Module (SSM), which provides 32 FC ports with embedded ASICs that perform the mapping and wire-speed switching of virtualized storage requests.

On Brocade, Invista runs on the 7600 Application Platform, available as a switch blade and a standalone appliance, with 16 FC ports with embedded ASICs. For reads and writes, the ASICs in the Cisco and Brocade modules look up the virtualization mapping information from the mapping table in memory and forward frames through the applicable FC port to the target at wire speed without involving the CPUs on the intelligent switch module. The control path of Invista consists of virtualization software running on the Cisco SSM or Brocade 7600, as well as the Data Path Controller (DPC) appliance. The virtualization software on the switch communicates with the DPC to receive information such as virtual disk configuration and directions for copy functions. Like all fabric-based virtualization products, Invista passes commands from the external DPC appliance to the intelligent fabric using the Fabric Application Interface Standard (FAIS) protocol.

EMC RecoverPoint: RecoverPoint is another fabric-based virtualization product that complements Invista for those customers who need remote replication or continuous data protection (CDP). In 2006, EMC acquired Kashya and subsequently released Kashya's product as RecoverPoint. While Invista attempts to address a range of virtualization tasks, RecoverPoint's sole focus is on remote replication and remote site incremental snapshots via the underlying CDP engine. Invista's lack of remote replication prior to RecoverPoint is an example of the challenges fabric-based virtualization vendors face in adding features that require state information beyond the virtualization mapping table.

Fujitsu Eternus VS900: Similar to the Incipient iNSP, the Fujitsu Eternus VS900 doesn't depend on external control path appliances. An external management server is used only to upload and change configurations as well as for monitoring, but it isn't required to communicate with the virtualization software on the switch during normal operation. The Eternus VS900 also continues to operate properly even if the management server is unavailable.

The Eternus VS900 currently works only on Brocade switches. "It was developed as a collaborative effort between Brocade and Fujitsu, similar to what EMC has done with Cisco," explains Fujitsu's DeCaires. Like Invista and Incipient iNSP, the Eternus VS900 currently lacks advanced storage features like remote replication and thin provisioning.

Incipient Network Storage Platform (iNSP): Incipient iNSP is very similar to EMC's Invista with a few distinct differences. First, Incipient only supports Cisco MDS switches and directors. Data path processing is identical to that of EMC, except Incipient calls it FastPath Processor. The most significant difference from EMC's product is that all virtualization software runs within the Cisco SSM and isn't split to have dependent code running on external appliances. This eliminates dependencies outside of the switch, making it an overall less-complex solution.

LSI StoreAge Storage Virtualization Manager (SVM): EMC, Fujitsu and Incipient virtualization products all run on intelligent FC switches. As a result, these products are expensive to deploy; bind virtualization to a switch vendor, which creates vendor lock-in; and increase the complexity of the SAN. LSI acknowledges the benefits of fabric-based virtualization, but realized early on that these disadvantages would hamper acceptance. Through two acquisitions--Storage Virtualization Manager (SVM) from StoreAge, which is the control-path virtualization software; and the LSI 8400 data-path fabric hardware from QLogic (which acquired it from Troika)--LSI can offer a virtualization solution that combines the simplicity of in-band appliances with benefits of fabric-based virtualization.

The LSI 8400 provides the data path and control path but, unlike Cisco and Brocade switches, it only provides the switching features for virtualization. This makes the LSI 8400 more cost-effective and complements existing FC switches, rather than replacing them. "The 8400 is a virtualization appliance with switching capabilities, but it's not a switch with the huge benefit that we can connect to any switch," explains LSI's Nahum.

From an implementation perspective, the LSI 8400 gets connected to an existing FC switch and the 16 switch ports become part of two zones--one contains initiator ports, while a second contains target ports. When a server accesses a virtualized volume, traffic is forwarded to the designated target port on the 8400 through a standard FC switch. The 8400 then performs the virtualization lookup and forwards frames to the appropriate storage device through one of its initiator ports. As the LSI 8400 connects through other FC switches, it adds two hops, because a standard FC switch forwards traffic to an LSI target port and then receives frames from an LSI initiator port; however, the added latency is negligible.


A sampling of virtualization products

Click here for a sampling of virtualization products (PDF).

Virtualization applications
Storage virtualization is employed to solve specific business problems, such as simplified management, cost reduction or a need for nondisruptive data migration between arrays. "No one buys storage virtualization for virtualization per se, but to fill a very specific need; this is very different from server virtualization," says Robert Infantino, Incipient's senior VP of marketing and alliances.

Data migration
Data migration is by far the leading reason users deploy storage virtualization. As the virtualization layer controls the virtual to physical mapping, a virtualization product is able to forward storage requests to the correct physical device even while data is migrated and spread between a source and destination device. All virtualization products, regardless of their underlying architecture, support data migration. Performance of data migration services, however, will vary among the different virtualization architectures; Hitachi claims to have a performance advantage because its virtualization resides within the storage controller.
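
How a virtualization layer keeps requests flowing to the right device during a migration can be sketched with a simple copy watermark. This is an assumed, simplified scheme (sequential copy, no handling of in-flight writes) with hypothetical device names; real products are considerably more involved.

```python
class MigratingVolume:
    """Routes each block to the source or destination based on copy progress."""

    def __init__(self, source, destination, total_blocks):
        self.source = source
        self.destination = destination
        self.total_blocks = total_blocks
        self.copied_upto = 0  # blocks in [0, copied_upto) already live on destination

    def copy_next(self, count):
        # Stand-in for the background copy: advance the watermark.
        self.copied_upto = min(self.total_blocks, self.copied_upto + count)

    def route(self, block):
        if not 0 <= block < self.total_blocks:
            raise IndexError("block outside the volume")
        return self.destination if block < self.copied_upto else self.source

    @property
    def done(self):
        return self.copied_upto == self.total_blocks


volume = MigratingVolume("old-array", "new-array", total_blocks=1000)
before = volume.route(10)                              # nothing copied yet
volume.copy_next(400)
during = (volume.route(10), volume.route(700))         # volume is spread across both
volume.copy_next(600)
after = volume.route(700)                              # migration complete
```

The server keeps addressing the same virtual volume throughout; only the routing decision inside the virtualization layer changes, which is what makes the migration nondisruptive.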

Provisioning and volume management
Provisioning of virtualized volumes is a core service in all virtualization products. Besides simplified storage management, centralized provisioning through the virtualization software enables higher storage utilization because storage is provisioned more granularly. Similarly, the volume management feature in virtualization products is instrumental in increasing storage utilization. By aggregating multiple physical disks to present them as a single large disk, and disaggregating large disks to present them as multiple smaller volumes, storage can be managed more effectively.

Thin provisioning
Thin provisioning is currently not available in fabric-based virtualization products. The lack of cache and the stateless nature of fabric-based virtualization make it more difficult to implement. Hitachi was the first vendor to offer thin provisioning in its USP V product, offering the first 10TB of thin-provisioned storage for free. IBM added thin provisioning in the recently released SVC 4.3 along with space-efficient snapshots and virtual disk mirroring. LSI has committed to releasing thin provisioning later this year and Incipient has it on its roadmap; EMC and Fujitsu have no intention at this point to offer thin provisioning in their virtualization products. "We currently support thin provisioning in our arrays and haven't decided if and when we will support it in Invista," says Doc D'Errico, VP of the infrastructure software group at EMC.

Snapshots
Snapshots or clones are supported by all virtualization vendors except Fujitsu. However, space-efficient snapshots (snapshots that require disk space only for the changes between snapshots) are currently supported only by IBM's SVC, Hitachi's USP V and LSI's SVM. Akin to thin provisioning, space-efficient snapshots are more difficult to support in fabric-based virtualization products. While EMC's Invista supports only full-copy clones, users have the option to deploy RecoverPoint in addition to Invista to take advantage of space-efficient snapshots. LSI's virtualization proves that the application challenges of fabric-based virtualization can be overcome; not only does LSI offer space-efficient snapshots, its SVM product supports consistency groups with snapshots, as well as copy and mirroring, which enables entire apps to be snapped or copied at once and then recovered within minutes.
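
One common way to build a space-efficient snapshot is copy-on-write: the snapshot stores a block only when that block is first overwritten on the live volume. The sketch below illustrates that general technique; it isn't a description of how IBM, Hitachi or LSI implement the feature, and the class names are hypothetical.

```python
class Volume:
    """Live volume that preserves old data for its snapshots before overwriting."""

    def __init__(self, blocks):
        self.blocks = list(blocks)
        self.snapshots = []

    def snapshot(self):
        snap = Snapshot(self)
        self.snapshots.append(snap)
        return snap

    def write(self, index, data):
        for snap in self.snapshots:
            snap.preserve(index, self.blocks[index])
        self.blocks[index] = data


class Snapshot:
    """Point-in-time view that consumes space only for blocks changed since."""

    def __init__(self, volume):
        self.volume = volume
        self.saved = {}  # block index -> data as it was when the snapshot was taken

    def preserve(self, index, old_data):
        # Keep only the first (original) version of each block.
        self.saved.setdefault(index, old_data)

    def read(self, index):
        return self.saved.get(index, self.volume.blocks[index])

    def space_used(self):
        return len(self.saved)


volume = Volume(["a", "b", "c", "d"])
snap = volume.snapshot()
volume.write(1, "B")
volume.write(1, "BB")   # a second overwrite of the same block costs no extra space
```

A full-copy clone of this four-block volume would occupy four blocks; the snapshot here occupies one, at the price of having to track which blocks have been preserved.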

The ability to scale
Performance and scalability are the two main benefits of fabric-based virtualization. Switch-based virtualization products can be scaled vertically if the switch vendor supports multiple intelligent fabric modules in a single switch and horizontally by deploying additional intelligent switches. For example, storage architects can populate a single Cisco MDS switch or director with multiple intelligent line cards, or intelligent line cards can be inserted into multiple switches. Scaling is also achieved by deploying newer generations of intelligent line cards. "Cisco is already shipping the MSM-18/4, which is Cisco's second-generation intelligent card; in the fourth quarter of this year, Cisco will ship its third-generation intelligent line cards," says Rajeev Bhardwaj, director of product management at Cisco's Data Center Business Group.

There's currently no perfect virtualization product and users need to carefully weigh which product best fits their environment's requirements. For companies that have standardized on Hitachi storage, Hitachi's array-based virtualization is likely at the top of their list. Companies with a Cisco- or Brocade-based SAN should consider Invista or Incipient; they'll also need to weigh the performance and scalability benefits against some of the challenges of these products such as their relatively high cost, complexity and feature constraints. LSI deserves consideration as it's currently the only vendor offering a virtualization product that combines the simplicity and rich feature set of an in-band appliance like IBM's SVC with the benefits of fabric-based virtualization.

This was first published in September 2008