Server consolidation approaches: Scaling up and scaling out
Recent data center trends have seen active consolidation of storage and servers. Both IDC and Gartner/Dataquest have reported that over 80% of Fortune 1000 data centers have consolidated their servers or plan to do so. The goal of consolidation is to reduce capital and operating expenses. Capital expenditure savings come from reduced hardware requirements and from enabling technology that lets hardware resources be shared, improving overall utilization. Operating expenditure savings come from improved business responsiveness and productivity, reduced complexity, centralized management (requiring fewer IT personnel), and easier overall access to information. Server consolidation has taken two distinct paths to meet these requirements: scaling up and scaling out.
Scaling up: High upfront costs and TCO
Scaling up is the concept of using larger, more powerful multi-processor servers and partitioning (or segmenting) them into multiple logical server images. Each image is typically dedicated to a single application. The computing architecture for these systems can be symmetric multi-processing (SMP), non-uniform memory access (NUMA), or even multi-processor mainframes. The I/O is usually a separate proprietary subsystem that is shared or partitioned among the images. The operating system (OS) is usually some form of UNIX or Linux. Examples of scaling up include the SUN E10000, SUN StarCat 15000, IBM zSeries, IBM pSeries 690, and HP 9000 N-Class.
This scaling up architecture offers significant value. One benefit is simplified, virtualized multi-server consolidation: the illusion of multiple server images is maintained while the complexity of managing these distinct images is greatly reduced. If any single application image requires more processing power, it is a simple matter to direct more processing power to that image on the fly. Another benefit comes from the I/O subsystem, which allows all the images to share the I/O or even to create virtual I/O links between the images. While this is a powerful advantage, it is available only to images within that particular server. Multiple scaled-up environments can be linked together via various fabrics to share storage and/or networks, but the costly I/O infrastructure that allows for efficient scaling up also acts as a drag on total cost of ownership (TCO) in a large-scale, multi-server environment. The main downsides to scaling up are the very high upfront costs and the TCO. These are generally expensive machines that range in price from several hundred thousand to millions of dollars. The high cost has put them out of reach for the vast majority of the market, and this untapped market still has critical server consolidation requirements.
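The on-the-fly reallocation of processing power between images can be pictured with a small toy model. This is only an illustrative sketch: the class name, the image names, and the CPU counts are all assumptions made up for the example, and no vendor's partitioning interface is being reproduced.

```python
class PartitionedServer:
    """Toy model of one scaled-up server split into logical images."""

    def __init__(self, total_cpus):
        self.total_cpus = total_cpus
        self.images = {}  # hypothetical image name -> CPUs currently assigned

    def free_cpus(self):
        """CPUs not yet assigned to any image."""
        return self.total_cpus - sum(self.images.values())

    def create_image(self, name, cpus):
        """Carve a new logical image out of the unassigned CPUs."""
        if name in self.images:
            raise ValueError(f"image {name!r} already exists")
        if cpus > self.free_cpus():
            raise ValueError("not enough free CPUs")
        self.images[name] = cpus

    def move_cpus(self, source, target, cpus):
        """Shift CPUs from one image to another; other images are untouched."""
        if cpus <= 0:
            raise ValueError("must move at least one CPU")
        if cpus >= self.images[source]:
            raise ValueError("source image must keep at least one CPU")
        self.images[source] -= cpus
        self.images[target] += cpus


# Assumed example: a 16-CPU server hosting three application images.
server = PartitionedServer(total_cpus=16)
server.create_image("database", 8)
server.create_image("web", 4)
server.create_image("batch", 4)

# The database image needs more power, so two CPUs are taken from batch.
server.move_cpus("batch", "database", 2)
print(server.images)
```

The point of the sketch is that capacity moves inside a single machine without any recabling, which is also why the advantage stops at the edge of that machine.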
Scaling out: Hidden complexity and cost
The alternative to scaling up is scaling out. Scaling out is the concept of combining smaller x86 or RISC servers to create more powerful systems. It is designed to provide many of the same benefits as scaling up at a much lower cost. Scaling out can be considered a "pay-as-you-grow" architecture. It uses low-cost yet powerful small servers running a Windows, Linux, or UNIX OS. By using small (1U or 2U) rack servers or even blade servers, server consolidation occurs physically. Clustering allows more processing power to be added to any given application as needed.
The upfront costs and TCO of the scaling out approach to server consolidation were supposed to be orders of magnitude lower than those of the scaling up approach; however, results have not matched expectations. Management and other hidden costs are, in fact, in line with those of the scaling up approach and often exceed them. So where is the problem? The remaining barrier to lower-cost server consolidation is clearly the dedicated I/O system found on every server. Because each server has dedicated I/O, it must be configured and connected to every LAN and SAN with which it needs to communicate throughout the enterprise. A server must also be connected to the IPC (inter-processor communications) network if it is part of a cluster.
Adding even a single server is a very disruptive event for all three networks. This disruptiveness creates a multi-fabric network environment whose complexity compounds as more servers are added. Greater network fabric complexity means more management. Business continuity and disaster recovery become increasingly difficult and time consuming. This increasing complexity becomes the hidden cost multiplier. The irony here is that scaling out, by itself, has limited scalability, eventually becoming unwieldy and expensive. More importantly, the server consolidation goals of reduced capital and operating expenses are still not met. Disaggregating I/O from the server is required.
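A rough way to see how dedicated I/O multiplies connections is to count links. The sketch below assumes one link per server per LAN, one per SAN, and one IPC link for clustered servers, with no redundant paths; the function name and the fabric counts are illustrative assumptions, not figures from any real deployment. It counts cables only and says nothing about the configuration work attached to each one.

```python
def dedicated_links(servers, lans, sans, clustered=True):
    """Links needed when every server carries its own I/O adapters.

    Assumption: each server needs one link per LAN, one per SAN, and,
    if it belongs to a cluster, one more to the IPC network.
    """
    per_server = lans + sans + (1 if clustered else 0)
    return servers * per_server


# Assumed environment: two LANs and two SANs, all servers clustered.
for n in (4, 16, 64):
    print(n, "servers ->", dedicated_links(n, lans=2, sans=2), "links")
```

Under these assumptions every added server brings five new links, one into each fabric, so each addition touches all of the networks at once.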
Scaling out with disaggregated, shared server I/O
InfiniBand, an industry-standard architecture with high bandwidth and low latency, makes it possible to transition easily from the dedicated I/O model for server architectures to one based on disaggregated -- or shared -- I/O. When scaling out is combined with disaggregated shared server I/O, all of the potential benefits of the scaling out architecture can be achieved at a significantly reduced cost. Through this approach, the hidden cost multipliers of standard scaling out configurations are eliminated. When a new server is added to this system, it needs only to be connected to a common shared server I/O platform. This is true for LAN, SAN, and IPC fabrics. Predictably, server additions within this environment are simplified and much less disruptive. The number of cables and connections to the LAN and SAN fabrics is, in fact, reduced by 50% or more, while the need for a separate IPC fabric is eliminated completely. The LAN, SAN, and I/O can all be scaled separately from one another, as needed. This makes management simpler while meeting the server consolidation goals of reducing both capital and operating costs.
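The difference between the two wiring models can be sketched with a simple link count. Everything here is an assumption chosen for illustration: one link per fabric per server in the dedicated model, a single link from each server to the shared I/O platform in the shared model, and a fixed number of uplinks from that platform into each LAN and SAN. Real designs add redundant paths, so the actual reduction depends entirely on the configuration.

```python
def dedicated_links(servers, lans, sans):
    # Assumption: one link per server per LAN and SAN, plus one IPC link.
    return servers * (lans + sans + 1)


def shared_io_links(servers, lans, sans, uplinks_per_fabric=2):
    # Assumption: one link from each server to the shared I/O platform,
    # which carries LAN, SAN, and IPC traffic; the platform then needs
    # a few uplinks into each LAN and SAN fabric.
    return servers + (lans + sans) * uplinks_per_fabric


# Assumed environment: 16 servers, two LANs, two SANs.
servers, lans, sans = 16, 2, 2
before = dedicated_links(servers, lans, sans)
after = shared_io_links(servers, lans, sans)
print(before, after, f"{1 - after / before:.0%} fewer links")
```

With these made-up inputs the sketch counts 80 links in the dedicated model and 24 in the shared one. Adding a server to the shared model adds exactly one link, regardless of how many fabrics sit behind the platform.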
Server disaggregation has been evolving for two decades. Clients were the first to be disaggregated, then storage. Disaggregation has led to productivity gains for administrators. Disaggregated shared server I/O extends the benefits already associated with disaggregated clients and storage to server I/O. It has been available for years on large, scaled-up servers such as mainframes and high-end multi-processor UNIX servers. Until now, it has not been available for the pay-as-you-grow, scaled-out rack or blade servers needed for true lower-cost server consolidation. InfiniBand-based shared I/O and clustering offerings can now provide a path to disaggregation.
About the author
Ira Kramer is the Director of Product Marketing for InfiniCon Systems.