The role of WAFS
Wide area file services (WAFS), a generic term sometimes used interchangeably with wide area data services (WADS), wide area application services (WAAS) or wide area data management (WADM), supports storage consolidation by providing remote users with real-time, read-write access to corporate applications and data directly from the data center.
There is nothing new about remote applications, but high latency and limited WAN bandwidth have often made them inefficient. WAFS mitigates latency and bandwidth constraints by caching content from the data center at the remote office. Caching normally involves placing a copy of the central file on a local WAFS appliance, where remote users can access and edit it. Changes to the cached data are then periodically synchronized with the data center. The biggest challenge for caching is coordinating concurrent updates so that multiple conflicting versions of the same file do not arise.
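The caching and synchronization cycle described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the class names `DataCenter` and `RemoteCache` are hypothetical, and conflicts are detected with a simple version counter. Real WAFS products may coordinate updates differently, for example with file locking.

```python
from dataclasses import dataclass, field


@dataclass
class DataCenter:
    """Hypothetical central store: maps path -> (version, content)."""
    files: dict = field(default_factory=dict)

    def read(self, path):
        return self.files[path]

    def write(self, path, base_version, content):
        """Accept a write only if it was based on the current version."""
        current_version, _ = self.files[path]
        if base_version != current_version:
            return False  # another office updated the file first
        self.files[path] = (current_version + 1, content)
        return True


@dataclass
class RemoteCache:
    """Hypothetical remote-office cache with periodic write-back."""
    center: DataCenter
    cached: dict = field(default_factory=dict)  # path -> (base_version, content)
    dirty: set = field(default_factory=set)     # paths edited since last sync

    def open(self, path):
        if path not in self.cached:  # cache miss: fetch once over the WAN
            self.cached[path] = self.center.read(path)
        return self.cached[path][1]

    def edit(self, path, content):
        self.open(path)  # make sure a local copy exists
        base_version, _ = self.cached[path]
        self.cached[path] = (base_version, content)
        self.dirty.add(path)

    def synchronize(self):
        """Push local edits to the data center; return conflicting paths."""
        conflicts = []
        for path in sorted(self.dirty):
            base_version, content = self.cached[path]
            if self.center.write(path, base_version, content):
                self.cached[path] = self.center.read(path)
            else:
                conflicts.append(path)
        self.dirty = set(conflicts)
        return conflicts


center = DataCenter({"plan.doc": (1, "draft")})
office_a = RemoteCache(center)
office_b = RemoteCache(center)
office_a.edit("plan.doc", "draft + A")
office_b.edit("plan.doc", "draft + B")
print(office_a.synchronize())  # prints []
print(office_b.synchronize())  # prints ['plan.doc']
```

In this sketch the second office's edit is rejected rather than silently overwriting the first, which is the kind of coordination needed to avoid multiple versions of the same file.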
WAFS products can exist as hardware or software. When implemented as software, such as the WAFS software from Availl Inc., an agent is installed both at the data center and on a dedicated server at the remote site. With hardware, like Riverbed Technology Inc.'s Steelhead appliances, a dedicated appliance is installed at both ends of the connection. Although software implementations cost less up front, the ongoing maintenance and need for operating system (OS) support have made hardware deployments a bit more attractive -- especially when numerous remote offices are involved.
WAFS product selection should involve careful consideration of bandwidth and cache size. For example, Riverbed's Steelhead offers a WAN capacity of 2 Mbps and a disk (cache) size of up to 80 GB. By comparison, Riverbed's 6020 supports a WAN capacity of 310 Mbps and a cache size of up to 1.4 terabytes (TB). WAFS products selected for enterprise use may require RAID to protect cached data against disk failures. Also consider the protocols and applications that must be supported. For example, if remote sites require NFS access to centralized, consolidated storage, you need a WAFS platform that accelerates NFS or mitigates its latency, along with bandwidth optimization for large data files.
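To see why WAN capacity matters when sizing a product, a rough back-of-the-envelope calculation helps. The sketch below assumes a hypothetical 100 MB file (decimal megabytes), an otherwise idle link, and no protocol overhead, compression or caching, so real transfers will differ.

```python
def transfer_seconds(file_size_mb: float, link_mbps: float) -> float:
    """Idealized time to move a file over a link, ignoring overhead.

    file_size_mb is in megabytes, link_mbps in megabits per second,
    so the size is multiplied by 8 to convert bytes to bits.
    """
    return file_size_mb * 8 / link_mbps


# The 100 MB file size is an assumption chosen for illustration.
# The link speeds are the two WAN capacities quoted above.
for capacity_mbps in (2, 310):
    seconds = transfer_seconds(100, capacity_mbps)
    print(f"{capacity_mbps} Mbps: {seconds:.1f} s")
```

At 2 Mbps the uncached transfer takes 400 seconds, compared with under 3 seconds at 310 Mbps, which is why the cache hit rate matters most at small sites with slow links.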
WAFS relies on a WAN connection to exchange data between the data center and remote offices, and this makes WAN bandwidth a crucial consideration. Bandwidth is very expensive, so rather than leaving customers to shoulder the cost of a larger WAN "pipe," WAFS vendors typically integrate one or more WAN acceleration features into their products. Two common WAN acceleration features are compression and data deduplication.
Compression applies a mathematical algorithm to data and replaces redundant sections of a file with small tokens; the original data is then reconstructed from those tokens at the destination. Not all files compress the same way, so compression efficiency varies depending on the type of file being sent. For example, Word documents may compress very well, while image or audio data may compress very little -- if at all. Data deduplication (also called single-instance storage or intelligent compression) eliminates redundant files, blocks or even bytes. Rather than caching or sending several copies of the same data, the system keeps only one unique copy. This can significantly reduce bandwidth requirements. See the article Data deduplication explained for more details.
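Both techniques can be illustrated with Python's standard library. The sketch below is a simplified model, not any vendor's implementation: it uses `zlib` for compression and assumes fixed 4 KB blocks identified by SHA-256 digests for deduplication. The function names are hypothetical, and random bytes stand in for data that is already compressed, such as images or audio.

```python
import hashlib
import os
import zlib


def compression_ratio(data: bytes) -> float:
    """Compressed size divided by original size (lower is better)."""
    return len(zlib.compress(data)) / len(data)


def deduplicate(data: bytes, block_size: int = 4096):
    """Split data into fixed-size blocks and keep one copy of each unique block."""
    store = {}   # digest -> block contents, stored once
    recipe = []  # ordered digests needed to rebuild the original data
    for offset in range(0, len(data), block_size):
        block = data[offset:offset + block_size]
        digest = hashlib.sha256(block).hexdigest()
        store.setdefault(digest, block)
        recipe.append(digest)
    return store, recipe


def rebuild(store, recipe) -> bytes:
    """Reassemble the original data from the unique blocks."""
    return b"".join(store[digest] for digest in recipe)


# Repetitive text compresses well; random bytes barely compress at all.
text_like = b"quarterly sales report " * 1000
random_like = os.urandom(len(text_like))
print(f"text ratio:   {compression_ratio(text_like):.3f}")
print(f"random ratio: {compression_ratio(random_like):.3f}")

# Four blocks of data, but only two of them are unique.
block_a = b"\x01" * 4096
block_b = b"\x02" * 4096
data = block_a + block_b + block_a + block_a
store, recipe = deduplicate(data)
print(len(recipe), "blocks referenced,", len(store), "blocks stored")
assert rebuild(store, recipe) == data
```

Fixed-size blocks keep the example short. Products that deduplicate at the byte level generally need more elaborate chunking than this.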
WAN acceleration can also speed data transfers by reducing the number of handshakes involved in the exchange. For example, it may take hundreds of handshakes to successfully open a Word document, and each handshake takes a finite amount of time to acknowledge, so reducing handshakes can significantly reduce application latency -- making remote applications far more responsive.
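The effect of handshakes on responsiveness is simple arithmetic: the total wait is roughly the number of round trips multiplied by the WAN round-trip time. The figures below (300 round trips, an 80 ms round-trip time, and a tenfold reduction in handshakes) are illustrative assumptions, not measurements of any product.

```python
def handshake_delay_seconds(round_trips: int, rtt_ms: int) -> float:
    """Time spent waiting on sequential round trips, ignoring transfer time."""
    return round_trips * rtt_ms / 1000


RTT_MS = 80  # assumed WAN round-trip time in milliseconds

before = handshake_delay_seconds(300, RTT_MS)  # unoptimized file open
after = handshake_delay_seconds(30, RTT_MS)    # with fewer handshakes
print(before, after)  # prints 24.0 2.4
```

Under these assumptions, opening the document drops from 24 seconds of waiting to 2.4 seconds without any increase in bandwidth.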
Plan for WAN disruptions
How long can you keep a file at the remote office before it must be resynchronized with the data center? Any storage consolidation plan for remote offices should also include a contingency plan to address the WAN disruptions that will inevitably occur. In many cases, caching is sufficient to guard against short WAN disruptions, but it's important to consider the impact of longer disruptions due to system failures, disasters and other circumstances.
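One way to make that question concrete is to record when each cached file was last synchronized and flag anything that exceeds a chosen limit. The sketch below is a minimal illustration: the function name `stale_files`, the file names and the four-hour limit are all assumptions, and the right limit depends on your own recovery requirements.

```python
from datetime import datetime, timedelta


def stale_files(last_synced: dict, now: datetime, max_age: timedelta) -> list:
    """Return the paths that have gone longer than max_age without a sync."""
    return sorted(
        path for path, synced_at in last_synced.items()
        if now - synced_at > max_age
    )


# Hypothetical cache index: path -> time of last successful synchronization.
index = {
    "budget.xls": datetime(2008, 1, 15, 9, 0),
    "plan.doc": datetime(2008, 1, 15, 13, 30),
}
now = datetime(2008, 1, 15, 14, 0)
print(stale_files(index, now, timedelta(hours=4)))  # prints ['budget.xls']
```

A check like this could feed an alert during a long WAN outage, so that staff know which locally edited files are at risk if the remote office itself is lost.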
This was first published in January 2008