Editor's Note: The following is a vendor-neutral white paper about virtualization from StorageTek. This white paper has recently been in high demand among readers in our own searchStorage Sound Off forums. We hope you find it useful.

Virtualization: One of the major trends in the storage industry - What are you getting for your money?

Introduction

As storage growth continues to exceed 100% per year, and as heterogeneity proliferates, the complexity of managing IT infrastructures increases exponentially. The promise of virtualization is that it will significantly improve storage manageability. But unless it also delivers on cost containment for IT, virtualization solves only part of the problem. This paper will define virtualization, contrast the kinds of implementations that are being announced on an almost daily basis, and provide a basis to compare and evaluate various offerings.

Storage growth, people, the economy, business and IT budgets

The first reaction of most CIOs to the prospect of a storage infrastructure 32 times bigger than it is today (roughly what five years of annual doubling would produce) is: How will I be able to manage it with the people I have today? Consider what CIOs face all at once: no significant improvement in storage management productivity, a worldwide shortfall of IT professionals, static or shrinking budgets, massive growth, the strategic role of information in both competitive differentiation and e-business applications, and the need to simultaneously improve the availability and scalability of the storage infrastructure. It must seem impossible to find a solution to all this chaos.

There is almost a nightmare element to the many variables that are coming together at one point in time. We could almost call this the Perfect (Storage) Storm. Ten years ago, in 1991, in one of the rarest meteorological events of the century, three separate weather systems were on a "perfectly" aligned collision course.
A Great Lakes storm system moving east, a Canadian cold front moving south, and Hurricane Grace moving northeast were all headed for the North Atlantic. Along the way, the storm would create monster seas, batter ships, and cause coastal flooding along the eastern U.S. seaboard. To IT professionals in 2001, this is the perfect storage storm.

Storage growth cannot be slowed down to match budgets, or the economy, or even the ability of human beings to deal with it more comfortably. Storage is not a faucet that can be turned off. Storage growth is driven by information flow, which in turn is driven by applications created to run the business and to maintain or improve competitive positioning. E-business applications - from supply-chain management to customer relationship management and everything in between - are vital new non-discretionary elements of the post-internet business world. Storage is a non-discretionary budget item that is now consuming more than 50% of server deployment costs.

So, what are the solutions to these problems? How do we survive the "perfect storage storm"?

New architectures and technologies

Many vendors, large and small, are talking about storage virtualization. The majority clearly position it as a means to simplify management of large, complex, heterogeneous storage environments, with the clear implication that virtualization will exist within a storage networking (typically SAN) environment. Right now, most of these announcements appear to generate more questions than they answer. What do they mean? What is being virtualized? Where is it being implemented? Is this virtualization or abstraction? Is this simple pooling of devices with a fancy name? How much of what is being announced is available today? Which server operating systems are supported today? It is reasonable to challenge many of the claims being made, but meanwhile, there is a need to clarify the "virtual" landscape.
One story perfectly illustrates the confusion surrounding the meaning of virtualization. At a recent technology conference, a lengthy panel discussion had representatives from many vendors stand up one after the other and describe their virtualization strategy and product (one vendor actually has two conflicting products). At the end of the last presentation, the discussion chairperson, himself the chief technologist at one of the vendors, asked the large audience of primarily end-users, "Is there anyone who now has a better understanding of virtualization and what it is?" Not one hand went up. After the laughter died down, it was clear that there is tremendous diversity in defining and implementing virtualization.

Virtualization

We believe that the scale of the problem is so large that the goal of all the efforts surrounding storage and storage manageability should be to plan for the elimination of human intervention in storage management. Virtualization is a step in that direction, and we believe that automated policy-based management algorithms and decision-making intelligence will join virtualization in the near future.

Virtualization implemented within the context of a SAN contributes several things to the goal of easing storage management workloads. It hides complexity by simplifying the server's view of what devices exist. It masks change by enabling physical storage devices to be removed, upgraded or changed without the need to tell the operating system, via device drivers or otherwise, that the storage world is now different. It can magnify an administrator's productivity by pooling large amounts of storage and allowing that storage to be allocated across many servers via a GUI or similar interface. It can aggregate small amounts of storage across multiple devices and make them appear as a single large disk.
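
As a rough illustration of the aggregation idea described above, the following Python sketch concatenates small extents from several physical devices into one logical block range and translates a logical block number into a device and physical block. It is a minimal sketch under stated assumptions, not a description of any vendor's implementation: the `Extent` and `VirtualDisk` names, the device names and the block counts are all hypothetical.

```python
from bisect import bisect_right
from dataclasses import dataclass
from itertools import accumulate


@dataclass(frozen=True)
class Extent:
    """A contiguous run of blocks on one physical device (hypothetical model)."""
    device: str
    start_block: int
    num_blocks: int


class VirtualDisk:
    """Presents several extents as one contiguous logical block range."""

    def __init__(self, extents):
        self.extents = list(extents)
        # ends[i] is the first logical block number after extent i.
        self.ends = list(accumulate(e.num_blocks for e in self.extents))

    @property
    def size(self):
        """Total number of logical blocks the server sees."""
        return self.ends[-1] if self.ends else 0

    def resolve(self, logical_block):
        """Map a logical block number to (device, physical block)."""
        if not 0 <= logical_block < self.size:
            raise IndexError(f"logical block {logical_block} out of range")
        i = bisect_right(self.ends, logical_block)
        begin = self.ends[i - 1] if i else 0
        extent = self.extents[i]
        return extent.device, extent.start_block + (logical_block - begin)


# Three small pieces of capacity on different (hypothetical) devices.
disk = VirtualDisk([
    Extent("array_a", start_block=1000, num_blocks=500),
    Extent("array_b", start_block=0, num_blocks=300),
    Extent("jbod_c", start_block=200, num_blocks=200),
])
```

The server addresses only logical blocks `0` through `disk.size - 1`. Replacing or relocating a physical device would, in this simplified model, mean changing the extent table rather than anything the server sees, which is the sense in which virtualization "masks change".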
Virtualization can also reduce cost in at least a couple of ways: by allowing aggregations of commodity storage components to be presented as something else entirely, and by eliminating the under-utilization of capacity.

It could be argued that some of these things aren't even virtualization, but abstraction, emulation or aggregation. However, the point is not to argue semantics but to stimulate a critical view of virtualization offerings so that intelligent choices can be made. This paper proposes some fundamental positions:
The what and the where (and the span-of-control) - pros and cons

The vast majority of recent storage virtualization architectures announced by vendors are designed to be implemented within the context of a storage network; therefore, the "where" is either the server, the network or the storage device. There is another element of virtualization in addition to the "what" and the "where". This is called "span-of-control". For example, if virtualization software is implemented in the server, then logical or virtual storage presentation is implemented there, but it is mapped to storage that exists beyond the server. Therefore, span-of-control extends beyond the platform where the virtualization is implemented.

Where a vendor implements virtualization is fairly predictable from that vendor's core competency. For example, it is likely that a server vendor will implement storage virtualization at the server level. It is equally likely that a software vendor will implement virtualization on a server platform. Typically in these implementations, the storage presentation services of virtualization are performed in the server and mapped to external storage. There is no control over external storage devices other than allocation.

There is an opportunity within a server-centric virtualization approach to transparently exploit the multiple performance and cost characteristics of a multi-level storage hierarchy. In fact, the industry has flirted with this concept for years, but it has often been rejected as too difficult and too people-intensive to implement. What if storage hierarchy virtualization were combined with policy services to mask the existence of a storage hierarchy from storage-intensive applications? This capability could also be implemented under a network-centric virtualization scheme.

Some questions to ask of vendors implementing virtualization in the server:
Network vendors will not necessarily implement virtualization only in a network device, but it is likely. For the purposes of this paper, a network device is a kind of hybrid storage domain manager, an intelligent router or an intelligent switch: a platform that is capable of executing the storage virtualization. Presentation services are performed in the network, and the logical devices are mapped to external storage devices. There is no control over external storage devices other than allocation.

In a number of ways, the network is the most logical place to implement storage virtualization. It is neither a server nor a storage device, so in existing between these two environments, it may be the most "open" implementation of virtualization. It is the implementation of storage virtualization most likely to support any server, any operating system, any application, any storage device type and any storage vendor. Perhaps the most compelling reason to locate storage virtualization in the network is that it would then exist within the natural data path for all I/O activity. Also, in "seeing" all the storage devices and device types, it is a practical foundation for policy-based management intelligence.

Questions to ask of vendors implementing virtualization in the network:
The third alternative for the "where" of storage virtualization is in the storage itself. This is an interesting implementation. If virtualization is done here, and the vendor is a storage vendor, then the challenge is to avoid limiting the storage devices to just those supplied by that vendor. The storage vendor implementing storage virtualization might form a strategic alliance with a server vendor, a software vendor or a network vendor to avoid creating a proprietary lock-in.

But what makes this an interesting implementation is not necessarily the "what", or the "where" at all, but the "span-of-control". When storage virtualization is implemented at the device level, there is an opportunity to have both the logical (virtual) environment and the physical devices within a common span-of-control. Exploitation of this span-of-control - meaning management control of both the logical presentation services and the physical resources needed to satisfy the storage demand - could lead to capacity and operational efficiencies unavailable to virtualization implementations where the physical storage devices are external to the virtualization engine's span-of-control.

In fact, there are today two types of implementations of device-level virtualization where logical devices and physical devices exist within the span-of-control of the virtualization engine. These are virtual disk and virtual tape. Because the span-of-control encompasses both the logical (virtual) devices and the physical devices, the benefits are very large efficiencies in capacity utilization in the case of virtual disk, and very large efficiencies in tape media utilization in the case of virtual tape.

The continuing poor utilization of storage resources in enterprise-class disk environments today is matched by inefficient overhead caused by historical and new practices.
Typically, only 80% of capacity is actually allocated to files and databases. That leaves 20% of capacity never allocated and reserved for growth. An additional 20% to 30% is wasted by being allocated to files that never grow to fill that capacity. That means that between 40% and 50% of available disk capacity may never be utilized. Point-in-time copies, used by many hardware and software vendors to minimize recovery times in the event of data loss, can double the amount of capacity needed to satisfy application requirements. Application development challenges IT organizations to provide whole files and databases to test against, again consuming capacity. Time-to-market for new applications is also affected by how often test files and databases can be reset after test failures. This poor utilization of capacity and the increasing amount of overhead are exactly the kind of infrastructure cost issues that can be addressed by virtualization implemented at the device level and leveraging a span-of-control that includes both logical devices and physical resources.

In tape today, virtualization is being introduced primarily to improve cartridge capacity utilization now that cartridges can cost $100. The unanticipated bigger benefits of tape virtualization have been application performance, and the ability to achieve 100% tape automation by leveraging existing libraries and drives to automate those cartridges that were still within a manual environment. This latter factor alone justifies the use of virtual tape in UNIX and NT environments.

Questions to ask vendors of virtualization solutions implemented within the storage device:
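
As a worked illustration of the disk capacity figures quoted earlier in this section, the short Python sketch below adds up the unused share of capacity and estimates how much raw capacity would have to be purchased to hold a given amount of real data. It is only a sketch under stated assumptions: the function names are hypothetical, the 10 TB data volume is an invented example figure, and treating a point-in-time copy as a simple doubling applied on top of the utilization loss is a simplification.

```python
def unused_percent(unallocated_pct, overallocated_pct):
    """Share of capacity never used: never allocated plus allocated-but-empty."""
    return unallocated_pct + overallocated_pct


def raw_capacity_needed(data_tb, unused_pct, copy_factor=1):
    """Raw capacity (TB) required so that data_tb of real data fits.

    copy_factor=2 models a full point-in-time copy doubling the requirement
    (an assumption made for this illustration).
    """
    usable_fraction = (100 - unused_pct) / 100
    return data_tb * copy_factor / usable_fraction


# Figures from the text: 20% never allocated, 20% to 30% allocated but empty.
low = unused_percent(20, 20)    # lower bound of the unused share
high = unused_percent(20, 30)   # upper bound of the unused share

# Hypothetical workload: 10 TB of real data.
for pct in (low, high):
    plain = raw_capacity_needed(10, pct)
    with_copy = raw_capacity_needed(10, pct, copy_factor=2)
    print(f"{pct}% unused: {plain:.1f} TB raw, {with_copy:.1f} TB with a full copy")
```

Under these assumptions, every point of capacity recovered lowers the raw capacity that must be bought, and the effect is multiplied when full copies are kept.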