|Thin provisioning is a clever virtualization technique that separates the virtual representation of the LUN from the physical fulfillment. This allows the virtual LUN to overprovision virtual storage while providing the physical storage as needed. This technology reduces the high cost of physical overprovisioning, the practice of buying more capacity than is needed. Physical storage overprovisioning is common because most applications and operating system file systems can't discover additional storage in a LUN dynamically. Thin provisioning solves the application storage discovery issue while eliminating the premium for overprovisioning.|
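The separation of virtual size from physical allocation can be sketched in a few lines. This is a minimal illustration only, not any vendor's implementation: the class name ThinLun, the block-level granularity and the zero-fill behavior on unwritten blocks are all assumptions made for the example.

```python
class ThinLun:
    """Toy thin-provisioned LUN (hypothetical names, not a vendor API)."""

    def __init__(self, virtual_blocks, pool_blocks):
        self.virtual_blocks = virtual_blocks  # size advertised to the host
        self.pool_free = pool_blocks          # physical blocks not yet allocated
        self.block_map = {}                   # virtual block -> stored data

    def _check(self, vblock):
        if not 0 <= vblock < self.virtual_blocks:
            raise IndexError("block outside the advertised LUN size")

    def write(self, vblock, data):
        self._check(vblock)
        # Physical space is consumed only on the first write to a block.
        if vblock not in self.block_map:
            if self.pool_free == 0:
                raise RuntimeError("physical pool exhausted; add capacity")
            self.pool_free -= 1
        self.block_map[vblock] = data

    def read(self, vblock):
        self._check(vblock)
        # Assumption: never-written blocks read back as zeros.
        return self.block_map.get(vblock, b"\x00")

    def allocated(self):
        return len(self.block_map)


# The host sees one million blocks, but only 100 physical blocks exist.
lun = ThinLun(virtual_blocks=1_000_000, pool_blocks=100)
lun.write(999_999, b"x")
print(lun.virtual_blocks, lun.allocated(), lun.pool_free)
```

The application sees the full advertised size from day one, so it never has to discover new capacity, while physical blocks are bought and added to the pool only as writes consume them.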
As storage environments become increasingly complex and overwhelming, many organizations are being pushed to the breaking point by the sheer volume of data to be stored and managed, as well as by the increased number of regulations on how data is to be stored, retrieved and protected. Traditional ways of managing storage are proving to be either too expensive or inadequate for the job.
Second-wave storage virtualization products address the cost and complexity related to six significant problems, and can usually be cost justified based on their ability to solve one or more of these problems (see Virtualization saves money):
- Managing the volume managers of multiple homogeneous or heterogeneous servers.
- Ongoing storage acquisition.
- Provisioning multiple homogeneous or heterogeneous storage arrays.
- Data protection for multiple homogeneous or heterogeneous storage arrays.
- Non-disruptive or minimally disruptive data migration.
- Providing a flexible foundation for information lifecycle management (ILM).
The first wave of storage fabric-based, block virtualization products included those from Compaq/Hewlett-Packard Co. (VersaStor), DataCore Software (SANsymphony), FalconStor Software (IPStor), StoreAge Ltd. (Storage Virtualization Manager) and StorageApps Inc. (SANLink). These products focused on simplifying storage infrastructures, management and replication while reducing the total cost of storage ownership; due to a number of technical and marketing missteps, however, they weren't able to provide a compelling value proposition for users.
|Virtualization saves money|
There are three reasons why the second wave of virtualization products is creating so much excitement. First, all of the tier-1 storage vendors (EMC Corp., Hitachi Data Systems and IBM Corp.) support the idea of virtualization and are providing products for the second wave, in direct contrast to their negativity toward first-wave products. The second impetus for the new wave comes from a new enabling block storage virtualization processing technology called Split Path Acceleration of Independent Data Streams (SPAID). SPAID eliminates most storage network block virtualization performance and scalability issues by splitting the control path (slow path) from the data path (fast path).
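The split between the two paths can be illustrated with a short sketch. It is a simplified model built on assumptions, not a description of any SPAID product: the class names, the mapping-table layout and the rule that a fast-path miss is handed to the slow path are all hypothetical.

```python
class ControlPath:
    """Slow path: owns the volume metadata and resolves mappings on demand."""

    def __init__(self, volume_layout):
        # (virtual_lun, extent) -> (array, physical_lun, extent); hypothetical layout
        self.volume_layout = volume_layout
        self.lookups = 0

    def resolve(self, key):
        self.lookups += 1
        return self.volume_layout[key]


class DataPath:
    """Fast path: forwards I/O from a cached table and punts misses upward."""

    def __init__(self, control):
        self.control = control
        self.table = {}

    def route(self, virtual_lun, extent):
        key = (virtual_lun, extent)
        target = self.table.get(key)
        if target is None:
            # Only the first access to an extent involves the slow path.
            target = self.control.resolve(key)
            self.table[key] = target
        return target


layout = {("vlun0", 0): ("arrayA", "lun7", 42)}
fast = DataPath(ControlPath(layout))
for _ in range(1000):
    fast.route("vlun0", 0)
print(fast.control.lookups)
```

In this model the control path is consulted once per extent rather than once per I/O, which is the intuition behind keeping the slow path out of the data stream.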
Most of the SPAID work has been done by a group of startups, including Aarohi Communications Inc., San Jose, CA; Astute Networks, San Diego; iVivity Inc., Norcross, GA; Troika Networks Inc., Westlake Village, CA; Maxxan Systems Inc., San Jose, CA; and Maranti Inc., San Diego. Brocade Communications Systems Inc. (via its Rhapsody acquisition), Cisco Systems Inc. and McData Corp. (working with Aarohi) have also developed SPAID products. Aristos Logic Corp., Foothill Ranch, CA; iStor Networks, Irvine, CA; and iVivity have developed system-on-a-chip (SOC) processors.
The third reason for the new wave is that this time, storage network block virtualization isn't an end unto itself. The primary lesson learned from first-wave products is that block storage virtualization is simply an enabling technology. It must be leveraged by storage apps to provide user value. That lesson has been assimilated and the second wave is focused around complete solutions that solve real, urgent user problems.
Tier-1 storage vendors change their tune
IBM was the first of the tier-1 storage vendors to offer a second-wave product with its SAN Volume Controller (SVC), previously code-named "Loadstone." SVC uses an in-band storage network block virtualization approach. It bundles volume management with local and remote mirroring, replication and snapshot in a Lintel (IBM xSeries) storage area network (SAN) appliance. It's also available as a blade for Cisco MDS director-class SAN switches.
SVC doesn't leverage the new SPAID architectures. Nor does it address concerns with previous in-band products, like performance, scalability and reliability. So how does SVC differ from earlier products from DataCore, FalconStor and StorageApps? In a word: emphasis. SVC isn't sold or positioned as a virtualization engine. Instead, it's sold as a SAN-based volume manager and data protection storage appliance. IBM makes it clear this product is primarily a small- to medium-sized business (SMB) to small- to medium-sized enterprise (SME) product with a heavy emphasis on the "M." Interestingly, FalconStor and DataCore products are being similarly positioned.
SVC won't be IBM's only offering in the second wave. With the recent release of TotalStorage DS8000 (see IBM's new arrays), IBM is using its Power5 chip with Hypervisor. Hypervisor allows IBM to virtualize the Power5 into multiple copies of its OS or another OS, which will allow IBM to run SVC directly on its storage array in the future. This should address many of the concerns about in-band performance, scalability and reliability, and allow the array to scale beyond its own back end. IBM is also rumored to be working on an out-of-band, second-wave offering in partnership with Cisco and Incipient Inc., Waltham, MA.
EMC's second-wave approach addresses performance, scalability and reliability by using a modified out-of-band technology in its yet-unreleased Storage Router product. Storage Router is designed to eliminate out-of-band, server-based agents by moving them into an intelligent switch (Brocade, Cisco or McData). This approach leverages SPAID architectures that split the writes at the switch. Storage Router also takes advantage of EMC's proven software for local and remote mirroring, replication, snapshot and data migration. Storage Router puts this highly regarded software into an appliance within the storage network fabric. The initial release of Storage Router is scheduled for the second quarter of 2005, but it probably won't have all of the planned functionality until a later release.
Hitachi's second-wave, storage network block virtualization offering is an optional feature of its high-end storage array, the TagmaStore Universal Storage Platform (see HDS reinvents high-end arrays). The virtualization is embedded into TagmaStore's controller architecture and is extended to other external storage systems (from Hitachi and other vendors) by connecting to them over Fibre Channel. The external storage systems see TagmaStore as just another server. TagmaStore assigns the external storage (LUNs) to its own host storage domain and logical address space. The server applications are connected directly to a cache image in TagmaStore.
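One way to picture an array presenting external LUNs inside its own logical address space is a simple address translation table. The sketch below is an assumption-laden illustration, not Hitachi's design: it assumes the LUNs are laid end to end in a single block range, and the class and LUN names are hypothetical.

```python
from bisect import bisect_right


class VirtualAddressSpace:
    """Toy logical address space built from internal and external LUNs."""

    def __init__(self):
        self.starts = []  # first virtual LBA of each attached LUN
        self.luns = []    # (name, size_in_blocks), in attach order
        self.size = 0

    def attach(self, name, blocks):
        # Assumption: each LUN is appended after the previous one.
        self.starts.append(self.size)
        self.luns.append((name, blocks))
        self.size += blocks

    def translate(self, lba):
        """Map a virtual LBA to (lun_name, lba_within_that_lun)."""
        if not 0 <= lba < self.size:
            raise IndexError("LBA outside the virtual address space")
        i = bisect_right(self.starts, lba) - 1
        name, _ = self.luns[i]
        return name, lba - self.starts[i]


space = VirtualAddressSpace()
space.attach("internal-0", 100)
space.attach("external-A", 50)  # e.g., a LUN on another vendor's array
print(space.translate(99), space.translate(100))
```

Servers address only the virtual range; whether a block lives on internal or external storage is resolved inside the controller.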
Once external storage is virtualized within the TagmaStore array, additional TagmaStore storage capabilities can be utilized with that storage (with an ensuing release planned for the first half of 2005). Capabilities include high-speed global cache, ShadowImage In-System Replication, TrueCopy, remote replication, volume migration, Universal Replicator and Data Retention Utility.
TagmaStore virtualization doesn't require any appliances, intelligent switches or switch-based application blades. It leverages the powerful TagmaStore controller architecture to provide the performance, scalability and reliability required. In pragmatic terms, TagmaStore relies on faster processors, more of them and more cache to overcome the limitations of in-band block virtualization. Whereas EMC is using a modified out-of-band approach to eliminate out-of-band limitations, Hitachi is going with a modified in-band approach to do the same thing.
|In-band vs. out-of-band virtualization|
SPAID was developed with the intent of eliminating control path performance bottlenecks. But there's little software that takes advantage of this new architecture. At the end of 2004, the only generally available software using SPAID was StoreAge's SVM. But many vendors are working on software that will leverage part or all of the SPAID architecture. This changeover is analogous to events in the high-performance computing market: Software had to be rewritten to accommodate the architectural change from single-threaded monolithic designs to massively parallel architectures.
Troika Networks was the first to develop a SPAID ASIC in its Accelera intelligent switch appliance, bundled with storage services from StoreAge. Those services include heterogeneous storage volume management, local and remote mirroring, replication, snapshot and data migration. Troika eliminated application server-based agents typically required for out-of-band storage network block virtualization. It worked with StoreAge and developed multipath fabric agents for the majority of operating systems.
Cisco has been working with EMC, IBM, Incipient and Veritas to provide complete SPAID solutions tied to its MDS director switches. Cisco's MDS combined with IBM's in-band SVC is available today, and has been installed at approximately 1,000 sites. Cisco has an active SPAID-based ISV program, and products, including EMC's Storage Router, should be available by mid-year.
Crossroads Systems Inc., Austin, TX, has been working hand-in-hand with iVivity to deliver a SPAID-enabled iSCSI-to-SCSI gateway appliance and board that should be released soon. The unique programmability of iVivity's iDisx ASIC gives Crossroads the flexibility and performance for tape drive virtualization, making tape drives easily sharable among many servers.
Maxxan provides complete solutions in both appliances and intelligent switches. All of its current virtualization products are in-band collaborations with FalconStor (volume management, local and remote mirroring, replication, snapshot, data migration and VTL), Microsoft (Windows Storage Server 2003) and Veritas (NetBackup). Maxxan has tackled in-band performance, scalability and reliability issues by architecting an I/O engine (as an appliance and as a blade in its director-class switch) that has high I/O, no single point of failure, high availability and can be placed in its intelligent switch. Maxxan is working with StoreAge and others to deliver complete SPAID-optimized solutions.
Maranti has developed SPAID-optimized software for its switch for volume management, local and remote mirroring, replication, snapshot and data migration. McData is working with Aarohi's FabricStream SPAID ASIC to deliver intelligent switches that will be part of EMC's Storage Router program this year. Brocade has also been working with EMC and its Fabric Application Platform using the Rhapsody SPAID ASIC to deliver an intelligent switch for the Storage Router program.
When to implement
The big questions are when, what and where to implement, and there is no single, easy answer to any of them. It depends on an organization's current pain, sense of urgency and risk tolerance. If storage infrastructure management is becoming intolerable, you should implement virtualization as soon as possible. What solution to implement should be based on the answers to the following questions:
- How well does the product meet the current needs of the organization?
- Will the vendor's product roadmap (scalability, performance, functionality, etc.) match the organization's perceived future needs?
- Is the product flexible enough to meet unforeseen changes in organizational needs?
- How does the product stack up for price performance, TCO and savings vs. competing options?
- What are the product's long-term OpEx costs?
- Will the organization get locked into the supplying vendor?
- How stable and mature is the product? (How strong are the customer references?)
- How will the product be supported?
- What is the organization's risk tolerance?
Where to implement is another key issue. A second-wave solution can be implemented in an intelligent switch, in an appliance on the fabric or in the array. The answers to the previous questions should point the way to the most appropriate implementation. In general, an in-band, second-wave solution will be less scalable and provide lower performance than an out-of-band equivalent. In-band solutions should be limited to small- to medium-sized implementations; out-of-band solutions can scale from small businesses to enterprises. But some products, like Hitachi's TagmaStore, challenge this conventional wisdom. Ultimately, where a solution is implemented is less important than how well it meets an organization's current and future needs.