Enterprise resource planning apps often hold a company's most important information and have unique storage requirements. In this first installment of a three-part series, we look at ways to protect ERP data while improving performance.
Enterprise resource planning (ERP) applications are typically entrusted with a company's crown jewels--key information that can range from accounting to human resources records. But many storage administrators see ERP as a minefield where critical information, multiple upgrades and competing priorities among database administrators, application managers and system administrators can collide. The result is a volatile environment that can produce inconsistent performance and become a management nightmare.
Recent advances in storage offer more efficient ways to manage ERP applications. Replication, snapshots, clones and disk-to-disk-to-tape (D2D2T) backup are technologies that can be leveraged to help manage an ERP application. But before these are used, a basic storage infrastructure must be configured to support the ERP application. The configuration can then determine the overall performance of your ERP application.
Configuration parameters set within the database application, the database server OS and the storage array significantly affect overall ERP application performance. In most cases, the configuration parameters are established by administrators from different functional groups, each with its own goals. Without appropriate coordination, both the ability to protect ERP data and the overall performance of the ERP application for end users are compromised.
In this three-part series, we'll review how to effectively configure and manage the storage infrastructure to support a production ERP application. This article focuses on establishing an efficient storage infrastructure for ERP apps.
A typical mission-critical ERP environment includes development, training, QA and production servers. Within the production environment, the ERP application has a production database server and multiple production application servers. Three-tier applications like SAP are architected to scale CPU and memory through the addition of more application servers, but there's typically only one database infrastructure. The I/O access for the database server to the storage infrastructure can set the performance characteristics of the entire ERP application. Our performance discussion will concentrate on how to appropriately configure the storage array, database server OS and database application to maximize I/O from the database server to the storage housing the data.
Database file types and structure
Before diving into configuration parameters, let's review database architecture from a storage perspective. Most traditional database architectures consist of three basic types of files: executable and administrative space, log file space and tablespace. These files are located on mount points that correspond to logical volumes which, in turn, correspond to disk groups within the array. Together, the files ensure that transactions are fully committed while offering the ability to recover the database to a specific transaction or time if required.
From a storage administrator's perspective, the files represent different storage needs and required parameters to ensure maximum performance of the production application. Because of past limitations on physical disk I/O and size, databases were built to scale using 2GB disks with limited I/O. Database administrators were taught to architect the database using these now-antiquated techniques. In today's environments, physical disks are virtualized a couple of times before the operating system can even view the storage. By simplifying the files into three basic buckets, a storage administrator can gain a better understanding of how to architect the underlying physical disks into disk groups for the database server. Database architecture is much more complicated than the following three file types, but this approach is based on the storage administrator's perspective.
Executable and administrative space. This space doesn't require significant consideration by the storage or system administrator. It should be protected from disk failure through mirroring, and protected from corruption by backup to a secondary disk or tape. In an Oracle/SAP environment, the mount points are typically referred to as /usr/sap/<SAPID>, /sapmnt/<SAPID> and /usr/sap/trans.
Log file space. Unlike executable and administrative space, the space configured for log files is critical to the overall performance and security of a database app. Each update or transaction to the database is logged or journaled through these files. I/O to the mount points housing these files is critical to overall database performance. The storage array, OS and database app must work in concert to maximize I/O to this space.
For example, disk groups within the array should have enough spindles to maximize I/O without causing unnecessary waits for a small piece of data to be written to 20 disks. For the log file space, create two RAID 1/0 disk groups that have approximately four physical disks with a stripe size of 64KB or greater. The disks should be striped and mirrored to provide high I/O and protection. After the space for the log and archive files is allocated from this volume group, the remaining space can be used for tablespace(s).
Database tablespace. The actual row/column information is stored in multiple files managed by the database and app. In our SAP/Oracle example, these are mount points /oracle/<SAPID>/sapdata1, /oracle/<SAPID>/sapdata2 and so forth. Disk groups should be built from a minimum of six disks and be protected using RAID 5.
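To see what these layouts cost in raw capacity, the short Python sketch below computes usable space for the two disk group types described above. The 146GB drive size is an assumption chosen only for illustration, the function name is hypothetical, and the calculation ignores hot spares and any array overhead.

```python
def usable_capacity_gb(raid_level: str, disks: int, disk_size_gb: float) -> float:
    """Usable capacity of one disk group, ignoring hot spares and array overhead."""
    if raid_level == "1/0":
        # Striped mirrors: half of the disks hold mirror copies.
        if disks < 2 or disks % 2 != 0:
            raise ValueError("RAID 1/0 needs an even number of disks (at least 2)")
        return disks / 2 * disk_size_gb
    if raid_level == "5":
        # One disk's worth of capacity is consumed by parity.
        if disks < 3:
            raise ValueError("RAID 5 needs at least 3 disks")
        return (disks - 1) * disk_size_gb
    raise ValueError(f"unsupported RAID level: {raid_level}")


# Assumed 146GB drives; substitute the drive size in your own array.
print(usable_capacity_gb("1/0", 4, 146))  # four-disk log file group
print(usable_capacity_gb("5", 6, 146))    # six-disk tablespace group
```

With these assumed drives, the four-disk RAID 1/0 group yields 292GB of usable space and the six-disk RAID 5 group yields 730GB.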
Once the physical disks have been grouped into appropriate disk groups and presented to the server, system admins can handle logical configuration, formatting and mounting the space to the file system. Because the array is providing RAID 1/0 and RAID 5 protection for the log file space and tablespace respectively, the system admin shouldn't mirror or stripe these logical volumes. While volume managers have this capability, the array capabilities can be leveraged to provide this type of protection. The system admin should format the file-system space with a block size that matches the write size of the database--an 8KB block size in most cases. Even with 32GB or 64GB disk groups, this block size should be adequate.
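The relationship between the array stripe size and the file-system block size can be checked with simple arithmetic. The sketch below is illustrative only: the function name is hypothetical, and the rule that the stripe size should be a whole multiple of the block size is our assumption rather than a vendor requirement.

```python
def blocks_per_stripe(stripe_kb: int, fs_block_kb: int, db_write_kb: int) -> int:
    """Return how many file-system blocks fit in one array stripe.

    Raises ValueError if the file-system block size does not match the
    database write size, or if blocks would straddle a stripe boundary.
    """
    if fs_block_kb != db_write_kb:
        raise ValueError(
            f"file-system block ({fs_block_kb}KB) does not match "
            f"database write size ({db_write_kb}KB)"
        )
    if stripe_kb % fs_block_kb != 0:
        raise ValueError(
            f"stripe size ({stripe_kb}KB) is not a multiple of "
            f"the block size ({fs_block_kb}KB)"
        )
    return stripe_kb // fs_block_kb


# The values used in this article: 64KB stripe, 8KB blocks, 8KB database writes.
print(blocks_per_stripe(64, 8, 8))
```

With the values used in this article, eight 8KB blocks fit in each 64KB stripe with nothing left over.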
Path access to data
Another consideration at the file-system level is path access from the server to the storage array. The array will provide access to the same disk groups down multiple paths. The system administrator should leverage OS-based, load-balancing capabilities to remove several single points of failure. These capabilities may be built into the OS (e.g., MPIO or PV Links) or come from a third party (e.g., Symantec Corp.'s Veritas Dynamic MultiPathing or EMC Corp.'s PowerPath). Regardless, secondary path configuration must be completed at the OS level to enable access to disk storage through a secondary path.
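The idea behind load balancing with failover can be illustrated in a few lines of Python. This is a conceptual sketch only: real multipath drivers such as those named above run inside the OS I/O stack, and the class, method and path names here are hypothetical.

```python
class MultipathDevice:
    """Round-robin path selection that skips paths marked as failed."""

    def __init__(self, paths):
        if not paths:
            raise ValueError("at least one path is required")
        self.paths = list(paths)
        self.failed = set()
        self._next = 0  # index of the next path to try

    def mark_failed(self, path):
        self.failed.add(path)

    def mark_restored(self, path):
        self.failed.discard(path)

    def select_path(self):
        # Try each path at most once, starting where the last call left off.
        for _ in range(len(self.paths)):
            path = self.paths[self._next]
            self._next = (self._next + 1) % len(self.paths)
            if path not in self.failed:
                return path
        raise RuntimeError("all paths to the disk group have failed")


# Hypothetical path names: two HBAs reaching the same disk group.
device = MultipathDevice(["hba0", "hba1"])
print(device.select_path(), device.select_path())  # I/O alternates across paths
device.mark_failed("hba0")
print(device.select_path(), device.select_path())  # all I/O moves to the survivor
```

While both paths are healthy, I/O alternates between them; once one is marked failed, every request goes down the surviving path until the failed path is restored.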
RAID 1/0 and RAID 5 protect against disk failures. Multipath techniques protect against the failure of a host bus adapter card, fibre cable or array storage processor. All of these schemes protect against hardware failure, but they do nothing to protect against corruption at the database layer. In some cases, the storage administrator can bail out a database administrator who dropped the wrong table because modern storage arrays have the capability to protect the database application from corruption. With the array providing specific disk groups with characteristics that are tuned for the database log files, tablespaces and executables, and with the logical volumes created with the correct block size and configured for multipathing, we can turn our attention to protecting the ERP application.
The critical nature of an ERP application requires 24/7 availability and that all transactions be protected from loss. These two characteristics drive organizations to leverage the advanced capabilities of their infrastructure. Storage and database administrators have skills that, when applied together, can provide significant application availability and protection.
|DB2's new 'Viper' release offers more SAP support|
Up to 60% of SAP customers have deployed SAP applications on Oracle databases, according to Noel Yuhanna, a senior analyst at Forrester Research Inc., Cambridge, MA. Because large enterprise resource planning (ERP) systems are usually chosen to operate with highly scalable, enterprise-performance workhorse databases like Oracle, it isn't surprising that other database vendors are trying to make a dent in Oracle's SAP market share.
IBM Corp.'s new DB2 release (code-named "Viper"), which was introduced in the summer of 2006, has integrated features for SAP. One of the more important features of Viper is that it includes an XML storage engine in addition to a relational storage engine. Philip Howard, research director, data at Bloor Research in the U.K., says "... performance comparisons by early adopters of Viper indicate performance gains on queries of 100 times or more, development benefits of between four and 16 times ... the ability to add fields to a schema in a matter of minutes as opposed to days."
Howard says that in addition to the XML storage engine, the new release of Viper will include new data compression capabilities such as row compression. IBM believes its compression techniques will result in savings of approximately 35% to 80%. Because there are special SAP facilities included with the new Viper release, the savings could be on the higher end.
Of course, other database vendors contend that their databases are best suited for the midtier market. In an Accenture white paper called "Coming of Age: SAP on Microsoft," the claim is made that companies can reap a huge cost benefit by using SQL Server instead of other databases (such as Oracle and DB2) that are traditionally viewed as the back-end databases of choice for ERP systems. By using commodity, Intel-based servers running Windows and SQL Server, Accenture claims some companies can potentially save 20% to 70% on hardware costs. In fact, Accenture states that "at certain clients, we have identified up to 70% savings on database and operating system licensing and administration costs." It's also interesting to note that Accenture claims that the top shipping platform for SAP over the past three years has been Windows, and not a Unix-based operating system.
Protecting a database from data corruption requires an almost Einstein-like ability to manipulate time. Once a database has been corrupted, it must be reset to an earlier point in time. Like a storage array that can take a snapshot, or a file system that can journal every change, most database applications journal every transaction through a group of redo logs and archives. While each database brand has a unique method for achieving this goal, in general, every transaction is logged so that it can be reapplied or removed if required (see "DB2's new 'Viper' release offers more SAP support"). Marry this capability to the cloning capability in most storage arrays, and you have a highly available and highly recoverable application.
The steps to accomplish a point-in-time recovery are easy to document, but the execution of the task within a functioning data center can be a challenge. The following is the basic sequence of events:
- Stop all updates to the database
- Create a copy of the data on secondary media
- Allow updates to the database
Stop all updates to the database. "Stop" may be too strong a word to use when describing Step one. If we truly stop all updates, then end users see the ERP application as unavailable and that's typically not acceptable. Most databases have the ability to suspend updates to the primary tablespace--also referred to as putting the database in backup mode. During this time, the updates to the primary tablespace are suspended and the database application starts to create a secondary journal of updates that will be reapplied once the suspended mode is complete.
Create a copy of the data on secondary media. Copying ERP data can take days, hours, minutes or seconds depending on the process followed. Storage infrastructure can play a big role in reducing the time required to create a copy and the time required to hold the database in a suspended mode during the copy. For instance, most storage arrays today can create a clone or snapshot of disk groups. Cloning, sometimes referred to as a full snapshot, offers more protection and better performance than the typical copy-on-write snapshots, but cloning requires enough available storage to hold the entire copy of the database.
All of the disk groups used for the application should have a full clone created at all times. That way, when the database is in backup mode, the secondary clone copy can be split or fractured so that the database administrator has a full copy of a recoverable database on disk that can be used for recovery. Once the fracture is completed, the database suspend or backup mode can be released. With a full copy of the database in a recoverable mode, snapshots can be taken against the clone to drive additional secondary copies. The snapshot requires only a small amount of disk space and can be mounted to a backup server and copied to tape or other disk.
Our sequence of events is now as follows:
- Synchronize clone copy to the primary database
- Suspend updates to the database
- Fracture the clone and hold as gold copy
- Release the suspended database
- Create a snapshot of the clone for backup
- Back up the snapshot of the clone to tape
- Go to Step one
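The sequence above could be scripted along the following lines. This is a minimal sketch, not a vendor procedure: every step name is a hypothetical placeholder for a site-specific database, array or backup command, and the loop back to Step one is left to whatever scheduler invokes the function. The one design choice worth noting is that the database is always released from suspended mode, even if the fracture fails.

```python
from typing import Callable, Dict, List

# Hypothetical step names; each maps to a site-specific command or script.
STEPS = (
    "synchronize_clone",
    "suspend_database",
    "fracture_clone",
    "release_database",
    "snapshot_clone",
    "backup_snapshot",
)


def run_backup_cycle(actions: Dict[str, Callable[[], None]]) -> List[str]:
    """Run one backup cycle and return the names of the completed steps."""
    completed: List[str] = []

    def step(name: str) -> None:
        actions[name]()
        completed.append(name)

    step("synchronize_clone")
    step("suspend_database")
    try:
        step("fracture_clone")
    finally:
        # Never leave the production database suspended, even on failure.
        step("release_database")
    step("snapshot_clone")
    step("backup_snapshot")
    return completed


# Stand-in actions that only record the call order.
log: List[str] = []
demo_actions = {name: (lambda n=name: log.append(n)) for name in STEPS}
print(run_backup_cycle(demo_actions))
```

In a real environment, each entry in the dictionary would wrap the command that the database, array or backup application provides for that step.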
Using backup applications
The process described earlier typically occurs at least once every 24 hours and sometimes several times a day. The process needs to be fully automated, but that can be difficult. Commands need to be executed at the database level, the storage array level and within a backup application to properly execute an appropriate backup of the database.
Some business environments may not need 24/7 access to the ERP application. In those environments, the backup application can be used to control backups of the tablespaces, executables, log files and archives at a file-system level. In many cases, the database administrator will export the database file system to provide a secondary recoverable copy of the database. This exported copy doesn't offer point-in-time recovery and, as the database grows, the export can be time consuming.
In a larger environment where a company may need to "roll" the database forward to a specific time or transaction, additional database integration is required. When direct integration with the database application is required to suspend the database, enterprise backup apps (Symantec's Veritas NetBackup and EMC's NetWorker, for example) integrate with the database technologies (e.g., RMAN) at the ERP application level. Enterprise backup applications can also control the storage array to leverage the array's ability to create a separate clone of the production volumes. Another advantage of leveraging these types of enterprise backup and recovery applications is that recovery can be controlled from this point as well.
|Practicing ILM in an ERP environment|
When companies implement an information lifecycle management (ILM) philosophy, the first task is to determine the value of data over time. The data must be categorized by current value and value over time before it can be managed to the appropriate level of storage. A better question in an ERP environment might be "What or who can actually read the data?"
Databases lock data up in tablespace files that look like big files that are constantly in use. A system administrator can see oraclesapdata23, but not what's inside. Traditional file-system ILM strategies are useless for putting older data on cheaper disk. But ERP apps have their own form of ILM.
A data warehouse coupled with an ERP application could be considered a form of "application-based" ILM. The database administrator can determine when the data in the tablespace ages, as well as how that aged data can be moved from the online transaction processing system to the data warehouse system for long-term storage.
While traditional ILM technologies like EMC Corp.'s DiskXtender can't step in and manage the tablespace, these types of apps can be considered in the management of archive and log files. Database administrators establish log files and archives to ensure they can recover a database to a specific point in time. If the database finds that it can't create a log file or an archive file as required, all transactions stop and the ERP application fails. By coupling ILM technologies with the log files, you can protect the database and ensure that the appropriate level of disk capacity is maintained.
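Whatever tool ultimately moves the older archive files, the first step is knowing when the log space is running low. The sketch below shows one way to check, using Python's standard shutil.disk_usage call. The 20% threshold and the function names are assumptions for illustration; the mount point to monitor depends on your own layout.

```python
import shutil


def free_fraction(free_bytes: int, total_bytes: int) -> float:
    """Fraction of the file system that is still free."""
    if total_bytes <= 0:
        raise ValueError("total_bytes must be positive")
    return free_bytes / total_bytes


def log_space_low(mount_point: str, min_free: float = 0.20) -> bool:
    """True if free space on the log or archive mount point is below min_free.

    The 20% default is an assumed threshold, not a recommendation.
    """
    usage = shutil.disk_usage(mount_point)
    return free_fraction(usage.free, usage.total) < min_free


# Example check against the current directory; point this at the mount
# that holds your log and archive files instead.
if log_space_low("."):
    print("Log space is low: migrate or back up older archive files")
else:
    print("Log space is adequate")
```

A check like this could run from cron or a job scheduler and trigger the migration of aged archive files before the database runs out of room to write.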
Finally, third-party job scheduler applications are used in the most complicated environments--typically when the infrastructure used for backup and recovery spans multiple data centers. Jobs are developed within the scheduling applications that control the sequence of events to back up the app. The jobs can be monitored and can have different restart characteristics depending on where the sequence failed. An automated process that's controlled from a single application needs to be developed to provide this type of backup. In addition, it's important to think about applying ILM processes to some of the ERP data (see "Practicing ILM in an ERP environment").
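The restart behavior a job scheduler provides can be sketched as follows. This is a simplified illustration rather than the interface of any real scheduling product: the Step class, the restart_from field and the step names are all hypothetical. Each step declares where the job should resume if that step fails, so a failed tape copy, for example, can resume from a fresh snapshot without suspending the database again.

```python
from dataclasses import dataclass
from typing import Callable, List, Optional


@dataclass
class Step:
    name: str
    action: Callable[[], None]
    restart_from: str  # step at which to resume if this step fails


def run_job(steps: List[Step], start_at: Optional[str] = None) -> Optional[str]:
    """Run steps in order from start_at (or the beginning).

    Returns None on success, or the name of the step to restart from.
    """
    names = [s.name for s in steps]
    index = names.index(start_at) if start_at else 0
    for step in steps[index:]:
        try:
            step.action()
        except Exception as exc:
            print(f"step {step.name!r} failed: {exc}")
            return step.restart_from
    return None


# Stand-in actions; a real job would call database, array and backup commands.
job = [
    Step("snapshot_clone", lambda: print("snapshot taken"), "snapshot_clone"),
    Step("backup_snapshot", lambda: print("copied to tape"), "snapshot_clone"),
]
resume_point = run_job(job)
print("restart from:", resume_point)
```

A monitoring process would record the returned step name and pass it back as start_at when the job is restarted.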
There's no debate that ERP applications are tough to administer and protect. While ERP applications can cause significant stress to IT personnel, if database administrators, storage administrators and systems administrators all work as a team, the ERP application's performance and protection will improve significantly. Again, appropriate coordination is the key to establishing a backup copy of an ERP application that can be recovered. Leverage the capability of the database to suspend transactions to the primary tablespace, and use the clone capabilities within your array and the backup application to move the copy of the data to secondary media. But data corruption isn't the only potential loss that IT administrators must be concerned with. In a future article, we'll describe how to protect an ERP application through a site failure by leveraging local and remote replication, and clustering techniques.