The recovery point objective (RPO) and the recovery time objective (RTO) are two parameters closely associated with recovery.
They both influence the type of redundancy or backup infrastructure you will put together. The tighter the RPO and RTO, the more money you will spend on your infrastructure.
The RPO marks the maximum age of the backup files that an organization must recover to resume normal operations following an incident. The RPO is expressed as a length of time, from seconds to days, and it dictates the allowable data loss: How much data can you afford to lose? If you do a nightly backup at 7 p.m. and your system goes up in flames at 4 p.m. the following day, you lose everything that was changed since your last backup, which is 21 hours of work. Your recovery point in this particular context is the previous day's backup. If your company does online transaction processing, your RPO may be down to the latest transaction and the latest bits of information that came in. Again, that dictates the kind of data protection offering you want in place.
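The nightly-backup example can be worked through in a few lines of Python. This is only a sketch: the function names `data_loss_window` and `meets_rpo` and the calendar dates are made up for illustration, and only the 7 p.m. and 4 p.m. times come from the example above.

```python
from datetime import datetime, timedelta


def data_loss_window(last_backup: datetime, incident: datetime) -> timedelta:
    """Return how much recent work is lost when restoring from last_backup."""
    if incident < last_backup:
        raise ValueError("incident precedes the last backup")
    return incident - last_backup


def meets_rpo(last_backup: datetime, incident: datetime, rpo: timedelta) -> bool:
    """True if the data lost in this incident stays within the RPO."""
    return data_loss_window(last_backup, incident) <= rpo


# Nightly backup at 7 p.m., failure at 4 p.m. the following day.
# The dates are arbitrary; only the times of day come from the example.
last_backup = datetime(2024, 1, 1, 19, 0)
incident = datetime(2024, 1, 2, 16, 0)

print(data_loss_window(last_backup, incident))              # 21:00:00
print(meets_rpo(last_backup, incident, timedelta(hours=24)))  # True
print(meets_rpo(last_backup, incident, timedelta(hours=4)))   # False
```

With a 24-hour RPO the nightly schedule is adequate. With a four-hour RPO it is not, and the organization would need more frequent backups or replication.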
The RTO is the maximum amount of time an organization can be down following an incident before normal operations must be back online. This is often associated with your maximum allowable outage or maximum tolerable outage. Like the RPO, the RTO is expressed as a length of time, from seconds to days. For example, if the RTO is one hour, the organization needs to recover within an hour following an incident.
The RTO dictates what type of architecture you will put together, whether it's a high availability cluster for seamless failover or something more modest. If your RTO is zero -- your organization cannot withstand any downtime -- you may choose a completely redundant infrastructure with replicated data off-site and so on. If your RTO is 48 hours or 72 hours, then tape backup may work for that application.
The RPO and RTO are important because they determine what an organization must use for backup and disaster recovery (DR) platforms. An organization with a tight RPO and RTO will need more expensive and intense backup and DR than a business that can allow for longer recovery time and more data loss.
The RPO and RTO also reflect, in dollar figures, how much it would cost an organization to be down for a certain period of time. Naturally, no organization wants to be down for any amount of time or to lose any data. However, it must be realistic in drawing up its RPO and RTO.
How to calculate them
When an organization maps out its RPO and RTO, it needs to think not only in terms of time and money, but also in terms of reputation. How long could the organization be down, and how much data could it lose, before it starts to lose customers and weaken its status?
Because the RTO is the goal time for an organization to restore its service, establishing the RTO starts at the business level. The same goes for the RPO. Calculation of these figures requires information from the departments that make the organization run. Some systems will need to be back online quicker than others. Determining which systems are most critical is part of figuring out the RPO and RTO.
The organization will determine how much it costs to go down for periods of time and how much it costs to lose specific amounts of data. These costs include lost revenue, salaries, weakened stock price and the expense of the recovery. If an organization wants to be conservative in its calculation, it will consider the worst time for an incident to occur.
All of this data is then carefully considered to calculate the RPO and RTO.
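As a rough illustration of that cost exercise, the sketch below adds up a few of the cost components named above for outages of different lengths. Every figure, the class name `OutageCostModel` and the `peak_multiplier` parameter are hypothetical. The multiplier stands in for the conservative "worst time" assumption. Harder-to-quantify effects such as a weakened stock price are left out.

```python
from dataclasses import dataclass


@dataclass
class OutageCostModel:
    """Hypothetical, simplified downtime cost model (all inputs are assumptions)."""
    revenue_per_hour: float       # revenue lost for each hour of downtime
    salaries_per_hour: float      # wages paid to staff who cannot work
    recovery_expense: float       # one-time cost of performing the recovery
    peak_multiplier: float = 1.0  # >1.0 models an incident at the worst time

    def cost(self, outage_hours: float) -> float:
        """Estimated total cost of an outage lasting outage_hours."""
        hourly = self.revenue_per_hour * self.peak_multiplier + self.salaries_per_hour
        return hourly * outage_hours + self.recovery_expense


# Illustrative numbers only; a real analysis would use departmental input.
model = OutageCostModel(
    revenue_per_hour=10_000,
    salaries_per_hour=2_000,
    recovery_expense=15_000,
    peak_multiplier=2.0,
)

for hours in (1, 4, 24):
    print(f"{hours:>2} h outage: ${model.cost(hours):,.0f}")
```

Comparing these totals with the price of a faster recovery platform gives the business a concrete basis for choosing an RTO.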
As with an overall disaster recovery plan, the RPO and RTO must be kept current. An organization will experience changes to such elements as systems and employees, so it needs to adapt its objectives accordingly. For example, as a business grows, it may need a quicker recovery time and, as a result, an improved DR platform.
Testing can help determine a more accurate RTO. For example, if the RTO is set for two hours and recovery takes one hour during the test, the organization could tighten the RTO to one hour.
Fast recovery is usually expensive, and it becomes more so as data volumes continue to grow and threats such as cyberattacks become more sophisticated. As a result, it's important for an organization to realistically analyze how much it can afford to lose and how much it can afford to spend on recovery in a given period of time. Replication, for example, is great for DR but gets expensive depending on how much data a business is replicating. The same goes for cloud backup and recovery; while the cloud starts out inexpensive, it becomes much costlier as data sets grow.
Core similarities and differences
The RPO and RTO are independent, different numbers, but they can also inform each other. They are core elements of a business continuity and disaster recovery (BC/DR) plan and are used in the development of a service-level agreement.
The RPO, since it deals with data loss, helps inform the development of a backup strategy. The RTO, since it deals in time to recover, helps inform the development of a DR strategy.
The actual amounts can be dramatically different. Mission-critical applications may need to be back online within 15 minutes (RTO), but it's OK if the most recent data is from six hours ago (RPO). It's possible that an organization will miss that tight RTO but hit the looser RPO, which should lead to an analysis of the DR platform. If an organization hits its RTO but misses the RPO, it should take a look at its backup system.
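The hit-or-miss reasoning above can be written as a small check to run after a recovery test or a real incident. This is a sketch with a hypothetical function name, `review_targets`. Returning both areas when both objectives are missed is an assumption that follows from the two cases described, not something the text states.

```python
from datetime import timedelta


def review_targets(actual_downtime: timedelta, actual_data_loss: timedelta,
                   rto: timedelta, rpo: timedelta) -> list[str]:
    """List the areas to review, based on which objective was missed."""
    targets = []
    if actual_downtime > rto:
        targets.append("DR platform")    # missed RTO: recovery was too slow
    if actual_data_loss > rpo:
        targets.append("backup system")  # missed RPO: restored data was too old
    return targets


# Objectives from the example: 15-minute RTO, six-hour RPO.
rto = timedelta(minutes=15)
rpo = timedelta(hours=6)

# Hypothetical outcomes of three recovery tests.
print(review_targets(timedelta(minutes=40), timedelta(hours=2), rto, rpo))  # ['DR platform']
print(review_targets(timedelta(minutes=10), timedelta(hours=9), rto, rpo))  # ['backup system']
print(review_targets(timedelta(minutes=10), timedelta(hours=2), rto, rpo))  # []
```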
How they factor into continuity/recovery plans
The RPO and RTO are key pieces of a business impact analysis (BIA). The BIA, itself an important part of a DR plan, determines the potential effects of an interruption in service to an organization. Along with the risk assessment, a BIA is one of the first elements of comprehensive DR planning.
The RPO and RTO set the stage for determining BC/DR strategies and products. An organization whose RPO and RTO are measured in minutes will have a drastically different backup and recovery platform than one whose objectives are measured in hours.
Point 1: Recovery Point Objective. The maximum sustainable data loss based on backup schedules and data needs.
Point 2: Recovery Time Objective. The duration of time required to bring critical systems back online.
Point 3: Work Recovery Time. The duration of time needed to recover lost data (based on RPO) and to enter data resulting from work backlogs (manual data generated during system outage that must be entered).
Points 2 and 3: Maximum Tolerable Downtime. The duration of the RTO plus the WRT.
Point 4: Test, verify and resume normal operations.
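The relationship between these points comes down to simple addition: maximum tolerable downtime is the RTO plus the work recovery time. A minimal sketch, using made-up durations and a hypothetical helper name, `maximum_tolerable_downtime`:

```python
from datetime import timedelta


def maximum_tolerable_downtime(rto: timedelta, wrt: timedelta) -> timedelta:
    """MTD is the RTO plus the work recovery time (WRT)."""
    return rto + wrt


# Hypothetical values: four hours to bring systems back online,
# then two hours to restore lost data and key in the manual backlog.
rto = timedelta(hours=4)
wrt = timedelta(hours=2)

print(maximum_tolerable_downtime(rto, wrt))  # 6:00:00
```

If the business can tolerate only five hours of disruption in this scenario, either the RTO or the WRT has to shrink.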
Tolerance for any downtime and data loss continues to go down. Thankfully for businesses, backup and DR products and technologies have improved to help meet tight RPOs and RTOs. Virtualization, replication and cloud-based backup and recovery make the recovery process simpler and quicker. Data reduction techniques, such as compression and deduplication, help to combat growing data sets, though, in some cases, an organization may need the complete data set recovered. Storage tiering, which can classify and place data according to its importance and timeliness, also helps with growing volumes.
Improved technology and products come with added costs. It's up to the organization -- through its calculation of the RPO and RTO, as well as its overall continuity and recovery planning -- to determine which investment is worth the money.
Keep your RTOs updated