This article can also be found in the Premium Editorial Download "Storage magazine: The business case for solid state vs. disk storage."
One of the by-products of the OST interface is better backup and recovery performance. Maybe that will get more backup vendors to develop their own APIs.
EMC Corp. recently announced its Data Domain Global Deduplication Array (GDA), which optimizes data deduplication in large-scale environments by aggregating the storage capacity of two of its deduplication appliances to improve throughput and scale. One of the key enablers of GDA's ability to distribute deduplication processing is Symantec Corp.'s OpenStorage (OST) API technology.
Symantec OST is an API for NetBackup (Versions 6.5 and higher) and Backup Exec 2010. Partners leverage the API to write a software plug-in module that's installed on the backup media server to communicate with the storage device, creating tighter integration between the backup software and target storage. In short, it's an interface that speeds up backup for NetBackup customers. The only problem with OST is that it highlights the fact that other backup vendors don't offer a similar capability.
Originally, the Symantec OST API was published to provide Symantec customers with a common interface to third-party disk targets. It allows backup data to be stored on disk with whatever protocol the target device uses, such as Fibre Channel (FC) or TCP/IP. Symantec backup software sees OST-enabled appliances as disk and enables features such as intelligent
It also delivers optimized data duplication -- network-efficient replication and direct disk-to-tape (D2T) duplication that's monitored and cataloged by the backup software. Without Symantec OST, there are two options: let the storage device transfer data without the backup catalog being aware of the copies, or move data from device to media server to device so the backup catalog stays aware of the copy. The first option leaves the backup catalog out of the loop on the location of backup copies, which can create complexity and impede disaster recovery (DR) processes. The second increases LAN, WAN and SAN traffic, and forfeits the benefits of deduplication during network transfer. Clearly, duplication controlled by OST-enabled devices saves both time and bandwidth.
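To put rough numbers on the bandwidth argument, here is a back-of-the-envelope sketch in Python. It is only an illustration: the function name, the 10:1 deduplication ratio and the 1,000 GB backup size are assumptions, and the model assumes data routed through the media server is rehydrated to full size and crosses the network twice, while a device-to-device copy sends only deduplicated data.

```python
def duplication_traffic_gb(logical_gb, dedup_ratio):
    """Estimate network traffic (in GB) for the two duplication paths.

    Hypothetical model, not a vendor formula.
    """
    # Path A: device -> media server -> device.
    # Assumption: data is rehydrated, so the full logical size
    # crosses the network on each of the two hops.
    via_media_server = 2 * logical_gb

    # Path B: device-to-device duplication.
    # Assumption: only deduplicated data is sent between devices.
    device_to_device = logical_gb / dedup_ratio

    return via_media_server, device_to_device


# Example with assumed figures: a 1,000 GB backup and a 10:1 ratio.
routed, direct = duplication_traffic_gb(1000, 10)
print(f"Via media server: {routed} GB, device to device: {direct} GB")
```

Under those assumptions, the routed copy moves 20 times as much data as the device-to-device copy. Actual results depend on the deduplication ratio achieved and on how much data the target device already holds.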
Because the catalog is aware of all copies, recovery of data from an OST-optimized duplicate copy is the same as recovery from another duplicate. Through the backup application, the OST-optimized duplicate copy can be designated as the primary copy, and then a full or granular recovery can be initiated. The potential time savings when compared to recovery from a non-OST-optimized duplicate could be significant.
Vendor OST adoption
Many backup target device vendors have subscribed to the Symantec OST API, which isn't surprising given its benefits and Symantec's market share. Vendors that support OST in conjunction with NetBackup and/or Backup Exec include EMC, ExaGrid, FalconStor Software, GreenBytes, IBM, NEC, Quantum (the only vendor so far to support OST direct-to-tape with NetBackup) and Sepaton. It's also worth noting that Symantec supports its own deduplication implementation in NetBackup and Backup Exec with OST.
One of the by-products of the OST interface is a performance improvement in backup and recovery operations, with some vendors claiming upwards of a 100% increase in performance. EMC's OST option for its Data Domain appliances was aptly renamed "Boost," a testament to its performance advantage. In creating its OST plug-in, EMC enhanced communications and optimized the packaging and transfer of data between backup media server and storage device, thereby improving performance.
EMC became more innovative with Data Domain GDA, taking advantage of the OST API to distribute a portion of the deduplication processing to the backup media server, which EMC claims lowers media server CPU utilization. And because deduplication occurs earlier in the backup data path, the implementation eliminates some redundant data at the media server, and reduces the network load between media server and storage.
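The general idea of moving part of the deduplication work to the media server can be sketched in a few lines of Python. This is not EMC's implementation or the OST API: the class, the function names and the fixed 4 KB chunk size are hypothetical. The media server fingerprints each chunk, asks the storage device which fingerprints it lacks, and sends only those chunks.

```python
import hashlib

CHUNK_SIZE = 4096  # hypothetical fixed chunk size, chosen for illustration


class StorageDevice:
    """Stand-in for a deduplicating backup target (hypothetical)."""

    def __init__(self):
        self.chunks = {}  # fingerprint -> chunk bytes

    def missing(self, fingerprints):
        # Report which fingerprints the device has not stored yet.
        return [fp for fp in fingerprints if fp not in self.chunks]

    def store(self, new_chunks):
        self.chunks.update(new_chunks)


def backup(data, device):
    """Run on the media server; return the number of chunk bytes sent."""
    # Fingerprint every chunk; identical chunks collapse to one entry.
    chunks = {}
    for offset in range(0, len(data), CHUNK_SIZE):
        chunk = data[offset:offset + CHUNK_SIZE]
        chunks[hashlib.sha256(chunk).hexdigest()] = chunk

    # Ask the device what it lacks, then send only those chunks.
    needed = device.missing(list(chunks))
    payload = {fp: chunks[fp] for fp in needed}
    device.store(payload)
    return sum(len(chunk) for chunk in payload.values())


device = StorageDevice()
first = backup(b"A" * 8192 + b"B" * 4096, device)   # 12,288 bytes of input
second = backup(b"A" * 4096 + b"C" * 4096, device)  # 8,192 bytes of input
print(first, second)
```

In this toy run, the first backup sends 8,192 of its 12,288 bytes because the two "A" chunks are identical, and the second sends only the 4,096-byte "C" chunk. The byte counts ignore the fingerprint exchange itself, and real products typically use more sophisticated segmenting than fixed-size chunks.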
In a similar move, NEC leveraged the OST API to optimize load balancing. While one of the inherent benefits of OST is to enable disk pooling for better overall backup system load balancing, NEC took things a step further. Hydrastor, NEC's storage platform offering data deduplication, has a scale-out grid architecture employing one or more logical storage units (data movers) and storage. Through OST integration, the backup application can now automatically distribute jobs to the logical storage units of the Hydrastor grid.
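One simple way to picture automatic job distribution is a least-loaded policy: each new job goes to whichever logical storage unit has the least data queued. The Python sketch below illustrates that idea only. It is not NEC's or Symantec's algorithm, and the function name, unit names and job sizes are made up.

```python
import heapq


def distribute_jobs(job_sizes_gb, unit_names):
    """Assign each job to the unit with the least data queued so far."""
    # Min-heap of (queued GB, unit name); ties break on the name.
    heap = [(0, name) for name in unit_names]
    heapq.heapify(heap)

    assignment = {name: [] for name in unit_names}
    for job_id, size_gb in enumerate(job_sizes_gb):
        load, name = heapq.heappop(heap)  # least-loaded unit
        assignment[name].append(job_id)
        heapq.heappush(heap, (load + size_gb, name))
    return assignment


# Four jobs of assumed sizes spread across two hypothetical units.
print(distribute_jobs([50, 20, 30, 10], ["unit-a", "unit-b"]))
```

With these inputs, jobs 0 and 3 land on unit-a and jobs 1 and 2 on unit-b, leaving the two units with 60 GB and 50 GB queued respectively. A real scheduler would also have to weigh factors such as throughput, capacity and job priority.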
Disk and data deduplication in demand
Disk-based backup is becoming more pervasive in data protection strategies; ESG research finds that the number of organizations using only tape in backup operations dropped 27% between 2008 and 2010, with more organizations favoring a disk-to-disk (56% increase) or disk-to-disk-to-tape (42% increase) strategy. Data deduplication use grew more than 200% between 2008 and 2010.
When EMC launched Data Domain GDA, the company planned to move deduplication "upstream" via integration with EMC NetWorker. Unfortunately, the integration is likely to be hard-coded into NetWorker since NetWorker currently doesn't have an OST-equivalent API -- nor does any other backup vendor product.
It's also unlikely that Symantec will make OST an open standard that other backup vendors could use (similar to how NDMP is utilized by backup vendors to back up filers). Looking ahead, it's more likely we'll see other backup vendors attempt OST-like APIs.
Of course, Symantec charges a fee to test and certify its OST partners' solutions. So it could get expensive for a company like Quantum or Data Domain, for example, to certify its solutions with multiple backup products' APIs. In turn, end users are charged a premium fee for OST enablement -- from both the backup vendor and the target system vendor. In addition, an end user with multiple backup solutions (in this case Backup Exec and NetBackup) is likely to be required to pay license fees to Symantec for OST-enablement with each backup product.
BIO: Lauren Whitehouse is an analyst focusing on backup and recovery software, and replication solutions at Enterprise Strategy Group, Milford, Mass.
This was first published in June 2010