Risk-to-benefit actuarial analysis often shows that the cost of meeting an organization's DR objectives for all applications and associated data exceeds what the budget allows. This leads to a budget-driven DR approach that may meet the DR requirements only for the organization's most vital application data. The rest of the data is either completely unprotected or relegated to a DR approach that falls short of organizational and regulatory requirements. This shortfall is known as the DR GAP (gravely abysmal protection).
The DR GAP is the difference between the level of DR required and the level of DR that can be afforded, or the difference between the actual level provided and the level required.
Use the ADCF to determine the value of the data to be protected. One method that works well is to assign the data to four categories: mission critical, essential, important and less critical (see "Establishing data's value," this page). For each category, you must set the recovery point objectives (RPOs) and recovery time objectives (RTOs). Next, each application and its associated data must be prioritized and assigned to a category. This isn't as daunting as it might appear. Ongoing surveys suggest that assignment should be based on application availability requirements, regulatory compliance and elimination of business risk.
Once you know the RTO and RPO for each application (see "Setting RPO and RTO benchmarks"), you can provide an operational framework for valuing or prioritizing the data (see "Correlating data to RPO and RTO").
The next step is to determine which available DR options satisfy each application/data classification's DR requirements and to then match their total cost of ownership (TCO) to the budget. You'll likely find that one size doesn't fit all and a mix of solutions will be required.
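The matching step can be sketched in a few lines of Python. Everything in this sketch is illustrative: the RPO/RTO targets, the option names, their capabilities and costs, and the budget are made-up placeholders, not figures from this article. It simply picks the cheapest option that meets each category's targets and reports how far the total exceeds the budget.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Requirement:
    rpo_hours: float  # maximum tolerable data loss
    rto_hours: float  # maximum tolerable downtime


@dataclass(frozen=True)
class Option:
    name: str
    rpo_hours: float  # worst-case data loss the option delivers
    rto_hours: float  # worst-case recovery time the option delivers
    tco: float        # lifecycle TCO of protecting one category this way


# Hypothetical targets per data category (placeholders only).
REQUIREMENTS = {
    "mission critical": Requirement(rpo_hours=0.0, rto_hours=1.0),
    "essential": Requirement(rpo_hours=1.0, rto_hours=4.0),
    "important": Requirement(rpo_hours=12.0, rto_hours=24.0),
    "less critical": Requirement(rpo_hours=24.0, rto_hours=72.0),
}

# Hypothetical DR options and costs (placeholders only).
OPTIONS = [
    Option("synchronous replication", 0.0, 0.5, 500_000),
    Option("asynchronous replication", 0.5, 4.0, 200_000),
    Option("disk-to-disk backup", 12.0, 12.0, 80_000),
    Option("tape backup", 24.0, 72.0, 30_000),
]


def cheapest_option(req, options):
    """Return the lowest-TCO option meeting both targets, or None."""
    qualifying = [
        o for o in options
        if o.rpo_hours <= req.rpo_hours and o.rto_hours <= req.rto_hours
    ]
    return min(qualifying, key=lambda o: o.tco) if qualifying else None


def plan(requirements, options):
    """Map each data category to its cheapest qualifying option."""
    return {cat: cheapest_option(req, options)
            for cat, req in requirements.items()}


def budget_shortfall(chosen, budget):
    """Amount by which the chosen options' total TCO exceeds the budget."""
    total = sum(o.tco for o in chosen.values() if o is not None)
    return max(0.0, total - budget)


chosen = plan(REQUIREMENTS, OPTIONS)
shortfall = budget_shortfall(chosen, budget=600_000)
```

A nonzero shortfall is the budget side of the DR GAP. A category that maps to `None` has no option that meets its targets at all, which is a gap of a different kind.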
DR options
The number of DR options is large and growing. Each technology has advantages and disadvantages, and solves different aspects of the total DR puzzle. A follow-up article in a future issue of Storage magazine will examine the merits of each DR technology in greater depth.
Matching the appropriate DR technology to the DR requirements entails calculating the TCO for each technology option. The TCO includes all capital expenditures (CapEx) as well as operating expenditures (OpEx) for the expected DR lifecycle (see "Simple method for calculating TCO").
When correlating the TCO of selected DR options to each data value, it's likely there will be more data classified as mission critical or essential than the DR budget allows. And in some cases, the application server and its mission-critical data are geographically dispersed and it's not financially practical to use the DR technologies that are typically applied. Current DR regulations may be another key factor in this mismatch.
In that case, it may be necessary to relegate some applications and their data to a lower level of DR. Unfortunately, relegating a portion of the data to the important or less-critical data pools may not be a viable option. It may make the organization non-compliant with current regulations, raising the specter of significant financial penalties. Another issue may arise if the organization lacks personnel skills at the required locations to provide adequate DR.
Eliminating the DR GAP
Bridging the DR GAP requires putting aside the belief that one product will solve all your problems. You need to consider a combination of DR technologies working together in a layered solution. It's the synergistic combination that enables the organization to meet its needs and requirements within its budget.
For example, you may need to use server-to-server replication to a centrally located disk array for remote application servers. From the central location, that data can then be backed up to tape and/or snapped to a DR disk array. This would require fewer backup server licenses and allow better disk array storage utilization.
Correlating data to RPO and RTO
In another example, a continuous snapshot appliance replicates primary data onto a low-cost serial ATA (SATA) RAID or a massive array of idle disks (MAID) array. Then the RAID or MAID array asynchronously replicates the data over long distances to a DR site.
A third example may include an appliance-based distributed backup system that replicates remote sites to a central appliance. At the central facility, the data is then written to tape, MAID, etc. This leverages the DR target platform used for disk-to-disk replication as well.
The possibilities are endless, and each organization will have different needs. The trick is putting together the right combination of technologies that meets the needs at the lowest possible price.
Simple method for calculating TCO
Upfront work: Gather data from human resources and accounting. You need to know:
- Average annual IT personnel salary, plus the cost of fringe benefits.
- Number of business hours per business year (typically 2,016).
- Hourly IT personnel cost: divide the average annual salary plus fringe benefits by the number of business hours per year.
- The organizational cost of money: This is important for net present value (NPV) calculations. NPV is a simple time value of money calculation that adds credibility to the TCO calculation. It discounts the value of future cash flows. If this information is difficult to obtain, prime rate plus three points will usually suffice.
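As a rough sketch, the upfront figures above reduce to two small calculations. The function names, the treatment of fringe benefits as a fraction of salary, and the sample numbers are assumptions made for illustration, not values from this article.

```python
def hourly_personnel_cost(annual_salary, fringe_rate, business_hours=2016):
    """Loaded hourly cost: salary plus fringe benefits over business hours.

    fringe_rate is assumed to be a fraction of salary (0.25 means 25%).
    """
    return annual_salary * (1 + fringe_rate) / business_hours


def fallback_cost_of_money(prime_rate):
    """Rule of thumb from the text: prime rate plus three points."""
    return prime_rate + 0.03


# Hypothetical inputs: an 80,640 salary with 25% fringe benefits.
hourly = hourly_personnel_cost(80_640, 0.25)
rate = fallback_cost_of_money(0.05)
```

With these sample inputs the loaded hourly cost works out to 50 per hour, and a 5% prime rate gives an 8% cost of money.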
Next, estimate the capital expenditures (CapEx) and operating expenditures (OpEx).
CapEx: This will be the easiest aspect of determining DR TCO. Hardware purchased or acquired with a capital lease is CapEx. Hardware acquired with an operating lease is OpEx. Software licenses are usually accounted for as OpEx. If hardware is acquired with a capital lease, the up-front CapEx is the NPV of the lease payments. This is calculated with the NPV formula.
$$\mathrm{NPV} = \sum_{t} \frac{C_t}{(1+r)^t}$$
Σ = sum over all time periods
C_t = cash flow in period t (in the case of a capital lease, the monthly payment)
t = time period (monthly, quarterly, annually, etc.)
r = cost of money or risk per time period (5% annually = 0.4167% monthly)
Each cash flow must be inserted into the formula and then summed. (This is built into Microsoft Excel spreadsheets and HP 12C calculators.) Next, you need to factor in the projected additional hardware purchases or additional leases using the NPV formula. Finally, the cost of any initial and ongoing structural and infrastructure improvements required must be determined using the NPV formula.
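For readers without a spreadsheet at hand, the NPV formula above can be sketched in Python. The function name, the choice to number the first payment as period 1, and the sample lease are assumptions for illustration. The monthly rate is obtained by dividing the annual rate by 12, as in the text.

```python
def npv(rate_per_period, cash_flows, first_period=1):
    """Discount each cash flow by (1 + r)^t and sum the results.

    cash_flows[0] is assumed to fall in period `first_period`; pass
    first_period=0 if the first payment is due immediately.
    """
    return sum(
        cash / (1 + rate_per_period) ** period
        for period, cash in enumerate(cash_flows, start=first_period)
    )


# Hypothetical capital lease: 36 monthly payments of 1,000 at 5% annually.
monthly_rate = 0.05 / 12
lease_npv = npv(monthly_rate, [1_000] * 36)
```

In this sample the 36,000 of nominal payments discounts to a little under 33,400, which would be the up-front CapEx figure for the lease.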
OpEx: Calculating OpEx starts with fixed costs such as monthly maintenance, software license fees, contract work and professional services. All future expenditures should be discounted using the NPV formula. Growth must also be taken into account.
Be careful to correctly analyze personnel time expenditures, such as research, planning, preparation, implementation, management, operations, change management and troubleshooting. Then, multiply those hours by the hourly cost of personnel (don't forget fringe benefits). If personnel receive overtime or bonus money for evenings, weekends and holidays, that must be taken into account. Once again, the NPV formula must be used for all cash flow expenditures.
TCO: Add up all of the NPVs for all of the expenditures for both CapEx and OpEx to determine the DR option's TCO.
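Putting the pieces together, a minimal TCO sketch might look like the following. It assumes an up-front purchase paid today (period 0, so not discounted), annual OpEx and personnel hours paid at the end of each year, and a single annual discount rate. The function names and sample figures are hypothetical.

```python
def present_value(amount, rate, period):
    """Value today of an amount paid `period` periods from now."""
    return amount / (1 + rate) ** period


def tco(upfront_capex, yearly_opex, yearly_personnel_hours,
        hourly_cost, annual_rate):
    """Sum of the NPVs of CapEx, fixed OpEx and personnel time."""
    if len(yearly_opex) != len(yearly_personnel_hours):
        raise ValueError("need one OpEx and one hours figure per year")
    total = upfront_capex  # paid today, so no discounting
    yearly = zip(yearly_opex, yearly_personnel_hours)
    for year, (opex, hours) in enumerate(yearly, start=1):
        total += present_value(opex + hours * hourly_cost, annual_rate, year)
    return total


# Hypothetical three-year option with growing OpEx and steady staff time.
option_tco = tco(
    upfront_capex=100_000,
    yearly_opex=[10_000, 11_000, 12_100],
    yearly_personnel_hours=[200, 200, 200],
    hourly_cost=50.0,
    annual_rate=0.08,
)
```

Running the same function over every candidate DR option gives comparable lifecycle figures to set against the budget.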
Choosing a DR partner
Choosing DR partners is as important as choosing the technologies they offer. Careful attention must be paid to the following:
- The DR partners will work together. No one needs finger-pointing when problems arise.
- Partners should support the final solution.
- The total DR solution data must always be in a recoverable and usable state regardless of where the failure or disaster occurs. Usable also means as up-to-date as possible based on the RPO.
- Database management systems can link to and recover from the DR data.
- The total solution must work with all current and planned organization applications, operating systems, storage, storage infrastructure, platforms, etc.
- With data storage growing exponentially (estimates are between 30% and 100% per year), the DR solution must scale with it. Assuming a 50% growth rate, DR for a terabyte of storage today will need to scale to more than 11TB in just six years.
- To maximize control and minimize multisite DR skills (to keep TCO low), it's indispensable to have a centrally located cross-system management console. Central consoles ought to provide an at-a-glance view of the state of all current, active DR configurations. The central management facility should allow initiation of any action that's required, regardless of the DR solution's location. This means no IT personnel are required at the primary application server, allowing for "lights out" DR.
- Minimizing the need for user involvement (again, to lower TCO) calls for increased automation. Automated recovery from common failures, including server reboots, application crashes and network failures, can significantly reduce or eliminate the need for human intervention.
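The growth arithmetic in the scalability item above is plain compounding, and it is easy to check for other rates. The function name below is an assumption for illustration.

```python
def projected_capacity_tb(current_tb, annual_growth, years):
    """Capacity after compounding `annual_growth` for `years` years."""
    return current_tb * (1 + annual_growth) ** years


# 1TB today at 50% annual growth, six years out.
six_year_tb = projected_capacity_tb(1.0, 0.50, 6)

# The same horizon at the low and high ends of the quoted range.
low_tb = projected_capacity_tb(1.0, 0.30, 6)
high_tb = projected_capacity_tb(1.0, 1.00, 6)
```

At 50% growth, 1TB becomes about 11.4TB in six years. At the 30% and 100% ends of the range it becomes roughly 4.8TB and 64TB, so the growth assumption deserves as much scrutiny as the technology choice.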
Meeting the DR requirements of the organization and regulators has become a challenge of sizable proportions. Matching DR requirements, data and IT skills to the budget too often leads to a large GAP, but this DR GAP can be mitigated, and even eliminated, by building a sound DR foundation. There are six steps to establishing this foundation:
- Classify each application and its data into four categories: mission critical, essential, important and less critical.
- Determine the required RPO and RTO for each class of data.
- Determine the available DR options per class of data.
- Establish each option's TCO for the expected life of the implementation.
- Objectively evaluate the skills required at every DR location.
- Match the data, DR options and skills to the budget to determine the breadth of the DR GAP.
Employing multiple DR technologies instead of trying to force one size to fit all will help to shrink or eliminate the DR GAP. This approach takes more time to plan, implement and iron out the kinks, but the benefits are too compelling to ignore.