This article can also be found in the Premium Editorial Download "Storage magazine: Top 15 Storage hardware and software Products of the Year 2006."
How long until the ERP app is up?
It's a familiar refrain: The business unit wants to know how long the application will be unavailable if it sustains a failure, and IT's standard answer is "It depends." Why? Recovery time depends on the type of clustering, the operating system, the time needed to return the database to a stable state and other factors. When an active-passive cluster sustains a failure, several steps are required before the passive server can activate the application.
Business function vs. ERP app
In the 1990s, I was part of a team that implemented a clustered SAP environment using the latest techniques available at the time. We were confident that if we sustained a server failure, we'd be able to keep the SAP application, along with the central instance, up and functional. Users might see the application pause, but it would be running again within minutes.
Several months into our implementation we sustained a server failure, and our SAP instance moved as planned from the production server to the failover server. Before the IT team could celebrate its success, however, the business unit reported that the company wasn't able to accept Electronic Data Interchange (EDI) orders. The EDI application wasn't communicating with the active SAP application--a significant problem because 85% of the company's orders were received electronically.
Though we had carefully protected the SAP application, the SAP central instance and the underlying Oracle database, we failed to protect the business function of taking and processing an order. Most business functions rely on several applications that must also be protected to increase the availability of what's important to the business.
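The gap described above can be sketched as a dependency check: a business function is only as protected as the least protected application it relies on. This is a minimal illustration, not a description of any clustering product. The class name, the function name and the component labels are hypothetical, chosen to mirror the order-taking example.

```python
from dataclasses import dataclass
from typing import List, Set, Tuple


@dataclass(frozen=True)
class BusinessFunction:
    """A business function and the components it relies on (hypothetical model)."""
    name: str
    depends_on: Tuple[str, ...]


def unprotected_dependencies(function: BusinessFunction, protected: Set[str]) -> List[str]:
    """Return the dependencies that the failover plan does not cover."""
    return [component for component in function.depends_on if component not in protected]


# Components named in the article; the grouping itself is an assumption.
order_intake = BusinessFunction(
    name="take and process an order",
    depends_on=(
        "SAP application",
        "SAP central instance",
        "Oracle database",
        "EDI application",
    ),
)

# What the original failover plan covered.
protected_by_cluster = {"SAP application", "SAP central instance", "Oracle database"}

gaps = unprotected_dependencies(order_intake, protected_by_cluster)
print(gaps)  # ['EDI application']
```

Running a check like this per business function, rather than per application, is one way to surface dependencies such as EDI before a failover does.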
For an application to provide data to users, three elements must work in harmony: the user request must traverse the network to the correct subnet; an application must be running at the IP address that answers the request; and the application must have access to the underlying data. Clustering software controls these aspects of responding to a user request.
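As a rough sketch of the three conditions above, the following models a user request path and reports which element is out of place. The field names are hypothetical, and the boolean values stand in for whatever health checks a real cluster would run.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class RequestPath:
    """State of the three elements a user request depends on (hypothetical model)."""
    network_reaches_subnet: bool   # request can traverse the network to the right subnet
    app_answers_at_ip: bool        # an application is running at the IP address
    app_can_reach_data: bool       # that application has access to the underlying data


def failed_elements(path: RequestPath) -> List[str]:
    """List the elements that currently prevent a request from being served."""
    checks = {
        "network route to the correct subnet": path.network_reaches_subnet,
        "application running at the IP address": path.app_answers_at_ip,
        "application access to the underlying data": path.app_can_reach_data,
    }
    return [name for name, ok in checks.items() if not ok]


# Example: the application is up on the failover server, but storage is not yet attached.
mid_failover = RequestPath(
    network_reaches_subnet=True,
    app_answers_at_ip=True,
    app_can_reach_data=False,
)
print(failed_elements(mid_failover))  # ['application access to the underlying data']
```

An empty list means all three elements are working together and the request can be answered.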
In many ERP applications, load balancing is built into the application architecture, which gives the application additional scalability. ERP applications, such as those from Oracle and SAP, are architected to support multiple "application servers." Any application server can respond to a user request. Because there are multiple servers at this layer and they don't hold the data themselves, these servers aren't single points of failure and therefore aren't clustered.
Each application server must communicate with a database server. The database server and underlying storage are considered single points of failure. Because clustering is all about availability, ERP clustering activities are focused on the database server (see "How long until the ERP app is up?"). The goal is to eliminate the single points of failure of the database server and its underlying storage.
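The reasoning above, that clustering effort goes where a single instance would take the whole system down, can be sketched with a simple count per tier. The tier names follow the article, but the instance counts are illustrative assumptions, and counting instances is a simplification of how single points of failure are actually identified.

```python
from typing import Dict, List


def single_points_of_failure(tiers: Dict[str, int]) -> List[str]:
    """Return the tiers served by fewer than two instances."""
    return [tier for tier, instances in tiers.items() if instances < 2]


# Illustrative counts only (assumption): several application servers,
# one database server and one set of underlying storage.
erp_tiers = {
    "application servers": 4,
    "database server": 1,
    "database storage": 1,
}

cluster_candidates = single_points_of_failure(erp_tiers)
print(cluster_candidates)  # ['database server', 'database storage']
```

In this toy model, the application server tier drops out of the list because losing one server still leaves others to answer requests.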
This was first published in February 2007