Part 3: How to choose the right virtualization technology
Ed note: This is the third of a series of three articles on virtualization from First Data Corp's Sr. Information Systems Analyst, Jerome Wendt.
To answer which of the virtualization technologies is the right choice, one must understand a bit of the history of SANs in the data center environment. SANs today have their roots in the mainframe environment. There, storage had become unwieldy to manage because it was internal to the system. To solve this problem, someone had the bright idea to attach the storage externally. Attaching it externally solved part of the problem, but not all of it. The storage, while physically separate, was still logically associated with a single mainframe. Hence the development of ESCON directors. These are physical devices that sit in the data path and allow a storage array to be shared among multiple mainframes, or a single mainframe to access multiple storage arrays.
The open systems model has followed much of the same path to this point. However, a perplexing question has arisen for many. Why did a model that works fine for the most part in the mainframe world break down so quickly in the open systems world? The explanation is best given using terms borrowed from the relational database realm.
In the mainframe world, the relationship between mainframes and storage arrays worked because a 'one-to-many' relationship exists. The term 'one-to-many' originates in relational database design, where it expresses a relationship between one object and the many objects related to it, or more simply speaking, one operating system and the many operations it manages. In the mainframe world, the 'one' in the 'one-to-many' is the operating system, MVS. The 'many' in the 'one-to-many', at least in this example, is the many storage arrays, be they from EMC, HDS or IBM.
So, in the above example, even though many physical mainframes may exist, the 'one-to-many' relationship works reasonably well because there is a single logical OS on all of the mainframes managing the many storage arrays. This single OS is intelligent enough to manage the storage no matter where it resides in the mainframe SAN world.
Now enter the world of today's open systems SANs. Multiple operating systems (Sun, AIX, Novell, Windows NT/2000 and various flavors of Linux) must connect to multiple providers of storage (EMC, Compaq, Dell, Sun, Hitachi, IBM, Xiotech, StorageTek, Fujitsu). Toss in multiple providers of the directors in the middle (Brocade, McData, Inrange, Vixel) and then further compound the complexity with hardware from the different vendors on which these operating systems run. All in all, it translates into a management nightmare, a very real one for most organizations.
Now let's translate this scenario into database terms. This nightmare reflects a 'many-to-many' relationship: a situation where many operating systems have many types of storage. From the view of a relational database administrator, this is an unmanageable scenario. However, there is a relational database technique, called normalization, that will make this scenario manageable.
Normalization in the relational database environment requires the creation of another table so as to create the 'one-to-many' relationship described earlier. This new table converts the previously unmanageable data into a manageable format.
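The technique described above can be sketched with Python's built-in sqlite3 module. This is only an illustration under stated assumptions: the table names (host_os, storage_array, attachment), the column names and the sample rows are hypothetical and do not come from any real SAN or product. The attachment table plays the role of the new table, turning one 'many-to-many' relationship into two 'one-to-many' relationships.

```python
import sqlite3

# Hypothetical schema for illustration only: table names, column names
# and sample rows are assumptions, not taken from any real environment.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE host_os (
        id   INTEGER PRIMARY KEY,
        name TEXT NOT NULL UNIQUE
    );
    CREATE TABLE storage_array (
        id     INTEGER PRIMARY KEY,
        vendor TEXT NOT NULL UNIQUE
    );
    -- The new table: each row links exactly one OS to one array.
    CREATE TABLE attachment (
        host_os_id       INTEGER NOT NULL REFERENCES host_os(id),
        storage_array_id INTEGER NOT NULL REFERENCES storage_array(id),
        PRIMARY KEY (host_os_id, storage_array_id)
    );
""")

conn.executemany("INSERT INTO host_os (id, name) VALUES (?, ?)",
                 [(1, "AIX"), (2, "Linux"), (3, "Windows 2000")])
conn.executemany("INSERT INTO storage_array (id, vendor) VALUES (?, ?)",
                 [(1, "EMC"), (2, "Hitachi"), (3, "IBM")])
conn.executemany("INSERT INTO attachment VALUES (?, ?)",
                 [(1, 1), (1, 3), (2, 1), (2, 2), (3, 2)])


def arrays_for_os(os_name):
    """Return the storage vendors attached to one operating system."""
    rows = conn.execute(
        """SELECT s.vendor
           FROM storage_array AS s
           JOIN attachment AS a ON a.storage_array_id = s.id
           JOIN host_os AS h ON h.id = a.host_os_id
           WHERE h.name = ?
           ORDER BY s.vendor""",
        (os_name,),
    )
    return [vendor for (vendor,) in rows]


def hosts_for_array(vendor):
    """Return the operating systems attached to one storage vendor."""
    rows = conn.execute(
        """SELECT h.name
           FROM host_os AS h
           JOIN attachment AS a ON a.host_os_id = h.id
           JOIN storage_array AS s ON s.id = a.storage_array_id
           WHERE s.vendor = ?
           ORDER BY h.name""",
        (vendor,),
    )
    return [name for (name,) in rows]


print(arrays_for_os("AIX"))    # ['EMC', 'IBM']
print(hosts_for_array("EMC"))  # ['AIX', 'Linux']
```

Every question about who is connected to what is answered through the one intermediate table, which is the property the argument relies on.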
This same technique must be applied to the SAN to make this environment manageable. By applying this technique, the result is not a new table in the SAN but a new network layer in the SAN. This new layer then converts the SAN from its present unmanageable state to a very manageable environment. This new network layer is the network based virtualization model.
Let's apply this to the Open Systems environment already described. You already have the many servers with Sun, AIX, NT/2000, Linux, Novell, and Apple in existence. Now introduce a new device in the data path at this new network layer so that all data traffic passes through it. Configure the device so it sees all of the servers and so all of the servers can discover it. You now have introduced a 'one-to-many' relationship on one half of the SAN.
On the storage end, you connect this new device to the many types of storage you may have, whether they are from EMC, Hitachi, IBM, Dell, or Compaq. Again, configure the device so it sees all of the storage and all of the storage can see it. Now the 'one-to-many' relationship is introduced on the other half of the SAN.
With the introduction of these two new 'one-to-many' relationships, you have now transformed the SAN into the manageable design described above, simplifying it in the process. So using this proven relational database technique, one should be able to see why the network based virtualization strategy is the only logical and sensible choice to make SANs manageable. It also helps to explain why major vendors are adopting this method as their long-term strategy.
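A rough way to see the simplification is to count the relationships that have to be managed. The sketch below is a simple counting model, not a measure of real interoperability effort: it assumes every server type would otherwise need to work with every storage type, and the function names and the "virtualization layer" label are made up for the illustration. The server and storage lists are the ones named in the example above.

```python
from itertools import product

# Server and storage types as listed in the example above.
hosts = ["Sun", "AIX", "NT/2000", "Linux", "Novell", "Apple"]
arrays = ["EMC", "Hitachi", "IBM", "Dell", "Compaq"]


def direct_pairings(hosts, arrays):
    """Many-to-many: every server type paired with every storage type."""
    return list(product(hosts, arrays))


def virtualized_pairings(hosts, arrays, layer="virtualization layer"):
    """Two one-to-many halves: servers to the layer, the layer to storage."""
    server_side = [(host, layer) for host in hosts]
    storage_side = [(layer, array) for array in arrays]
    return server_side + storage_side


print(len(direct_pairings(hosts, arrays)))       # 30
print(len(virtualized_pairings(hosts, arrays)))  # 11
```

Under this model the direct case grows with the product of the two lists, while the case with a device in the middle grows with their sum, so the gap widens as more server or storage types are added.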
Vendors who choose to support the other two main virtualization models in the Open Systems environment, either host based or array based, will have only limited success. Their solutions will be highly proprietary, expensive, difficult to administer despite the cost, and not easily scalable in an enterprise environment.
Now these other models might work. But in order for them to work, the control for purchasing and managing the components of the SAN will need to be held by a very few individuals. This will likely create a stranglehold on the organization. But this scenario does not accurately reflect most organizations and most organizations do not desire this much control in the hands of so few.
Hence, the network based virtualization model is the only logical choice. While this model may be disparaged at times in the popular press, this will be the model that emerges, if for no other reason than that it has to. In fact, major enterprise providers like IBM, Hitachi LTD, Fujitsu Softek and Veritas appear to have already come to this same conclusion.
Hopefully one now understands the term virtualization more clearly and can see from the arguments presented that, of the three models, the network based virtualization model is the only one that makes sense long term. It should, in theory, ease the management burden of the SAN while also opening up whole new ways of thinking about storage and storage networking in the Open Systems arena.
But perhaps more important to systems managers, it should allow them to use what they already have more effectively. It will also save money in the long term without requiring a great deal of new spending in the short term. It is the best long term strategic decision because it is an objective you can start to achieve now by spending less and purchasing the products you need more wisely. And, as a side benefit after all is said and done, you might just get to keep your job.
About the author:
Jerome Wendt is a Sr. Information Systems Analyst for First Data Corp. He is responsible for managing and resolving performance related issues. Jerome is also responsible for exploring new SAN and open systems storage related technologies to solve business and technical problems in the data center.
Click to read the first installment of "Normalizing a SAN".
Get definitions of virtualization in Part 2 of Jerome's tip.