Databases contain a firm's crown jewels, so organizations are averse to exposing these assets to relatively unknown products. But existing techniques can't keep up. Additional hardware doesn't address issues like extracting old or unneeded data from the database. Database tools can purge old data to resolve an immediate performance problem, but they don't automate the archiving process or create compliant data stores.
With app performance tied more closely to underlying database and storage architectures, firms need to identify what benefits they'll derive from a database archiving tool and what characteristics to look for. Database archiving vendors suggest that production databases ranging in size from 50GB to 200GB could benefit from database archiving software. The main benefits are better application performance, shorter upgrade windows and easier regulatory compliance.
Justifying database archiving
Database archiving vendors find that the justification for their products depends on the problems users want to solve. Although database archiving vendors often present their software as a means of lowering storage costs and starting the information lifecycle management (ILM) process, Adam Gwosdof, CTO at Applimation Inc., finds this message doesn't resonate well with users.
"Users are focused on applications and solving problems related to them," observes Gwosdof. "They could care less about ILM." Instead, he cites improved performance and compliance concerns as the primary motivators driving the implementation of database archiving software.
Lois Hughes, senior business application systems manager at Tektronix Inc., cites performance as the primary driver behind her implementation of OuterBay Technologies Inc.'s LiveArchive product. (OuterBay was acquired by Hewlett-Packard earlier this year. The LiveArchive product is now called Relocater and is part of HP's Reference Information Manager [RIM] for Databases product suite.) The two Oracle databases she managed were 35GB and 85GB. She saw initial space reductions of 48% and 56%, respectively, with corresponding performance improvements of 40% and 52% on the two Sun servers hosting the databases.
Jim Lee, vice president of business and product line management at Princeton Softech Inc., says there are signs that may indicate the need for archiving software. Regularly executed processes that take longer to run each time are a key indicator a database needs attention. Long-running processes may include:
- Daily, weekly or monthly batch jobs
- Monthly or quarterly closing statements
- Backups
- Disaster recovery tests
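One rough way to spot the pattern Lee describes is to log how long a recurring job takes and fit a trend line through the run times. The Python sketch below is illustrative only: the function names, the sample durations and the one-minute-per-run threshold are assumptions, not figures from any vendor.

```python
from statistics import mean

def runtime_slope(durations_min):
    """Least-squares slope: average growth in minutes from one run to the next."""
    n = len(durations_min)
    if n < 2:
        return 0.0
    xs = range(n)
    x_bar = mean(xs)
    y_bar = mean(durations_min)
    num = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, durations_min))
    den = sum((x - x_bar) ** 2 for x in xs)
    return num / den

def needs_attention(durations_min, threshold=1.0):
    """Flag a job whose run time grows faster than `threshold` minutes per run."""
    return runtime_slope(durations_min) > threshold

# Hypothetical durations (minutes) of six consecutive nightly batch runs.
nightly_batch = [42, 44, 47, 51, 56, 62]
print(needs_attention(nightly_batch))  # True: the job grows about 4 minutes per run
```

A steady upward slope across batch jobs, closings and backups is a stronger signal than a single slow run, which may have other causes.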
Another infrequent but problematic process that may indicate a need for database archiving software is the upgrade of a large database from one version release to another. Apps that contain hundreds of gigabytes, or even terabytes, of data can rarely be down for extended periods for admins to perform database upgrades. There's also often precious little time to roll back should the database upgrade go awry. Database archiving software lets admins slim down the database prior to the upgrade, thereby reducing the amount of time needed for the app update.
Removing data from the production database and archiving it to a different set of media also lets organizations meet new compliance requirements such as Sarbanes-Oxley, HIPAA and SEC 17a. Archiving the data to WORM disk media such as EMC Corp.'s Centera or Network Appliance Inc.'s SnapLock gives users relatively fast retrieval access (10 seconds to 60 seconds) to the data, and stores it in an unalterable format that meets most regulatory requirements. For longer term or offsite requirements, archiving software can be used to store the data on WORM tape or optical media; retrieval times will depend on the type of media and whether it's stored onsite or offsite.
Data archiving
The prospect of archiving database data can be intimidating at first. One of the misconceptions Marc André, MIS application manager at Chivas Brothers, Paisley, Scotland, encountered when he proposed implementing Princeton Softech's Active Archive software (renamed Optim) was that some users and administrators equated archiving data with deleting data. Some education was required to explain how database archiving products move data from the production database to the archive.
Prior to doing any data archiving, data-retention policies must be established. If it's an internally developed application, more time may be required to understand and document how the application works, and which records can be archived and when. On the other hand, some vendors such as Princeton Softech certify their products with existing apps such as JD Edwards, PeopleSoft and Siebel; HP certifies its RIM for Databases products with SAP.
App vendors test HP's RIM for Databases software to ensure that it works with their app's public APIs and archives the data in the way the app expects; how the archiving process works is then documented. This certification provides users with a level of assurance that archiving will work for these apps, which should get them up and running more quickly.
Tektronix's Hughes initially planned to spend two years developing an archiving solution for the firm's accounts receivable (AR) and customer fulfillment Oracle apps until she discovered HP's RIM for Databases Relocater software. It took two weeks to implement Relocater with the AR app and six weeks to implement it with the customer fulfillment app. "The AR installation was a plain-vanilla installation and went much more quickly than the customer fulfillment, which was slowed by customized policies we had added to the application," notes Hughes.
Users should take precautions and verify that they haven't internally customized the policies of a certified app beyond what it normally delivers. Customization can counteract the certified default policies designed for the app and database archiving software and void the certification. If the app was customized in some way, users will need to adjust the database archiving software's default policies to fit their environment.
The next step is for the database archiving software to copy the indexes, names and schemas of existing production tables, as well as the relationships among them, to the archive. For instance, prior to doing any archiving, Applimation Inc.'s Informia Archive examines the production database to confirm that all relationships between parent and child tables are intact, that the transaction is closed, and that the transaction identified for archival meets defined business policies.
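The parent-child check can be pictured as a query for child rows whose parent is missing. The sketch below uses Python's built-in sqlite3 module and a hypothetical orders/order_lines schema; it isn't how Informia Archive or any other product performs the check, only an illustration of the idea.

```python
import sqlite3

def orphaned_children(conn):
    """Return ids of order_lines rows whose parent order is missing."""
    rows = conn.execute(
        """
        SELECT c.id
        FROM order_lines AS c
        LEFT JOIN orders AS p ON p.id = c.order_id
        WHERE p.id IS NULL
        ORDER BY c.id
        """
    )
    return [r[0] for r in rows]

# Hypothetical sample data: line 12 points at an order that doesn't exist.
conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE orders (id INTEGER PRIMARY KEY, status TEXT);
    CREATE TABLE order_lines (id INTEGER PRIMARY KEY, order_id INTEGER, item TEXT);
    INSERT INTO orders VALUES (1, 'closed'), (2, 'open');
    INSERT INTO order_lines VALUES (10, 1, 'probe'), (11, 2, 'cable'), (12, 99, 'stray');
    """
)
print(orphaned_children(conn))  # [12]
```

An empty result suggests the relationships are intact; any ids returned would need to be resolved before the related transactions are archived.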
HP's RIM for Databases Relocater and Encapsulated Archive each employ HP's RIM for Databases logical unit of work methodology to do archiving. These tools execute the following steps as part of the archiving process:
- They evaluate data-retention policies and constraints to ensure that a business transaction can be archived/relocated.
- They copy the business transaction to the archive.
- They delete the business transaction from the production database.
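The three steps above amount to an evaluate-copy-delete sequence wrapped in a single transaction, so a failure partway through leaves nothing half-moved. The following Python sketch shows the general pattern with sqlite3. The table names, the "closed before a cutoff date" policy and the function names are assumptions made for illustration; this is not HP's implementation.

```python
import sqlite3

def create_tables(conn):
    """Hypothetical production tables plus same-shaped archive tables."""
    conn.executescript(
        """
        CREATE TABLE orders (id INTEGER PRIMARY KEY, status TEXT, closed_on TEXT);
        CREATE TABLE order_lines (id INTEGER PRIMARY KEY, order_id INTEGER, item TEXT);
        CREATE TABLE archive_orders (id INTEGER PRIMARY KEY, status TEXT, closed_on TEXT);
        CREATE TABLE archive_order_lines (id INTEGER PRIMARY KEY, order_id INTEGER, item TEXT);
        """
    )

def archive_order(conn, order_id, cutoff):
    """Move one order and its lines to the archive; return True if it was moved."""
    row = conn.execute(
        "SELECT status, closed_on FROM orders WHERE id = ?", (order_id,)
    ).fetchone()
    # Step 1: evaluate the retention policy (assumed: closed before the cutoff date).
    if row is None or row[0] != "closed" or row[1] is None or row[1] >= cutoff:
        return False
    # Steps 2 and 3 run in one transaction: commit on success, roll back on error.
    with conn:
        conn.execute(
            "INSERT INTO archive_orders SELECT * FROM orders WHERE id = ?", (order_id,)
        )
        conn.execute(
            "INSERT INTO archive_order_lines SELECT * FROM order_lines WHERE order_id = ?",
            (order_id,),
        )
        conn.execute("DELETE FROM order_lines WHERE order_id = ?", (order_id,))
        conn.execute("DELETE FROM orders WHERE id = ?", (order_id,))
    return True
```

Calling `archive_order(conn, 1, "2005-01-01")` moves order 1 only if it was closed before that date. Dates are compared as ISO-format strings, which sort chronologically.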
Another decision that must be made at the outset is whether to archive the data to a database or a flat file. The Applimation and Princeton Softech products support both database and flat-file formats; HP's RIM for Databases Relocater is used solely for databases, while its Encapsulated Archive product is targeted at flat files. The archive format decision will hinge on four factors: the need for a secondary instance of the database, native app access to the archived data, performance and the media to be used for the archived data.
If storing data in an unalterable format is the primary driver for database archiving, then the data must be stored in a flat-file format such as XML. You can store a database on WORM, but with each new set of archived data the entire database must be archived again, which is neither space nor time efficient.
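To make the flat-file option concrete, here is a minimal Python sketch that serializes a batch of archived rows as a standalone XML document, so each archiving run can be written once as its own file rather than rewriting an entire database. The element names and the `rows_to_xml` helper are hypothetical; none of the vendors' actual file layouts are described here. Column names are assumed to be valid XML element names.

```python
import xml.etree.ElementTree as ET

def rows_to_xml(table, columns, rows):
    """Serialize rows from one table into a self-describing XML string."""
    root = ET.Element("archive", {"table": table})
    for row in rows:
        record = ET.SubElement(root, "record")
        for column, value in zip(columns, row):
            # NULLs are written as empty elements in this sketch.
            ET.SubElement(record, column).text = "" if value is None else str(value)
    return ET.tostring(root, encoding="unicode")

# Hypothetical batch from a single archiving run.
xml_text = rows_to_xml(
    "orders",
    ["id", "status", "closed_on"],
    [(1, "closed", "2004-03-31"), (2, "closed", None)],
)
```

Because every file carries its own table and column names, it can be read years later without the original schema being online.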
When creating a secondary instance of the database or providing native application access to the archived data are the principal concerns, the archived data must be stored in a database. Data stored in this format will generally outperform data stored in a flat-file format, but databases can only be stored on standard storage arrays. However, users may choose to put either Fibre Channel or ATA disks in the array depending on the performance required.
Helen Cha, senior director of marketing at OuterBay before its acquisition by HP, cautions that the service-level agreements associated with archived data are very different from those for production databases. She suggests organizations consider how many users will have access to the archive and at what times. She also recommends reconsidering backup plans for database archives because archived data doesn't change; backups only need to be run when new data is added to the archive.
The initial archival run will generally cause the most consternation for users because it's new and likely to be the largest amount of data the archiving app will ever have to deal with. To limit the disruption to production, database archiving software includes utilities that employ best practices for database management. HP's RIM for Databases Relocater and Encapsulated Archive give users the option to suspend and resume the archiving process. Princeton Softech lets users designate a "cap" that terminates the archiving process after all necessary background processes are completed.
Users may also suspend the archiving process at any time due to performance concerns or an approaching period of heavy production database use. When an archiving operation is suspended, archiving software should roll back the transaction, leaving both the production and archive databases in a consistent state.
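The suspend-and-roll-back behavior can be sketched as batch processing in which each batch is its own transaction. In the hypothetical Python example below, a `should_suspend` callback stands in for an operator's suspend request; when it fires mid-batch, the in-flight batch is rolled back and the remaining ids are returned so the run can resume later. Table names, the batch size and the `Suspended` exception are all assumptions for illustration.

```python
import sqlite3

class Suspended(Exception):
    """Raised inside a batch when a suspend request arrives."""

def archive_in_batches(conn, ids, should_suspend, batch_size=2):
    """Move rows from prod_t to archive_t; return the ids still pending."""
    pending = list(ids)
    while pending:
        batch = pending[:batch_size]
        try:
            with conn:  # commits the batch, or rolls it back on an exception
                for row_id in batch:
                    if should_suspend():
                        raise Suspended
                    conn.execute(
                        "INSERT INTO archive_t SELECT * FROM prod_t WHERE id = ?",
                        (row_id,),
                    )
                    conn.execute("DELETE FROM prod_t WHERE id = ?", (row_id,))
        except Suspended:
            return pending  # the interrupted batch is still entirely in production
        pending = pending[batch_size:]
    return pending
```

Passing the returned list back into `archive_in_batches` resumes the run where it left off. Smaller batches give up some throughput in exchange for less work lost on each suspension.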
Transparent data access
Getting data into the archive is one thing; allowing the native app to seamlessly retrieve it is quite another. Even though each vendor offers this feature, the task is handled somewhat differently by each one. For instance, when Applimation's Informia Archive is deployed in conjunction with packaged business apps such as PeopleSoft, Oracle Applications or Siebel, the information is first relocated to an online archive database for continued, seamless access through the native app's screens and reports. Once the data in the online archive is moved to WORM or any flat-file format, it's inaccessible to the native app and can only be accessed using Applimation's search facility.
HP's RIM for Databases Relocater supports different archiving options for accessing current and archived data. Relocater may archive to the same database, but to a different table space or an entirely different database. Tektronix's Hughes chose to archive to an entirely different database, and HP's RIM for Databases lets users view the archived data in the same reports and forms they see in the production environment.
All of the vendors enable users to do a query that pulls back all pertinent information whether the data resides in the production database, an archive database or flat file. For instance, Princeton Softech's Optim offers this query capability in its core product, while HP's RIM for Databases leverages its Transparency Layer and Combined Reporting features to virtually aggregate the data across different data repositories.
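Conceptually, a combined query is a union over the production and archive copies of a table. The Python sketch below attaches a second in-memory SQLite database as the archive and runs one query across both. The invoices schema and the `find_invoices` function are hypothetical, and the sketch makes no claim about how Optim or HP's Transparency Layer work internally.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# A second, separate in-memory database plays the role of the archive.
conn.execute("ATTACH DATABASE ':memory:' AS archive")
conn.execute("CREATE TABLE main.invoices (id INTEGER PRIMARY KEY, customer TEXT, amount REAL)")
conn.execute("CREATE TABLE archive.invoices (id INTEGER PRIMARY KEY, customer TEXT, amount REAL)")
conn.execute("INSERT INTO main.invoices VALUES (3, 'acme', 75.0)")
conn.execute("INSERT INTO archive.invoices VALUES (1, 'acme', 120.0)")
conn.execute("INSERT INTO archive.invoices VALUES (2, 'globex', 40.0)")
conn.commit()

def find_invoices(conn, customer):
    """Return a customer's invoices from both stores, tagged with their source."""
    return conn.execute(
        """
        SELECT id, customer, amount, 'production' AS source
        FROM main.invoices WHERE customer = ?
        UNION ALL
        SELECT id, customer, amount, 'archive' AS source
        FROM archive.invoices WHERE customer = ?
        ORDER BY id
        """,
        (customer, customer),
    ).fetchall()

print(find_invoices(conn, "acme"))
# [(1, 'acme', 120.0, 'archive'), (3, 'acme', 75.0, 'production')]
```

The `source` column is optional; it's included here so a report can show where each row currently lives.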
HP's RIM for Databases Virtual Data Reload option lets archived data be updated securely without requiring a physical reload, provides built-in safety checks and creates an audit trail. To use it, the admin assigns authorized users and designates specific transactions as eligible for reinstatement. Users can then make specific changes to archived data using existing app interfaces. While users make these updates, HP's RIM for Databases validates the transactions so that only authorized users and programs can update the data for the period in question.
Database archiving software provides administrators, management and auditors with unfettered access to years of data while lowering costs and improving production performance. With the ability to store data in different formats on multiple types of media, database archiving software has matured and offers reliable features. Users can look to database archiving software as a viable method to make more intelligent decisions about managing the data in their databases.