Much has been said about storage consolidation, but for a large, geographically distributed enterprise it has been easier to discuss than to achieve. The benefits are clear: if a corporation can bring most of its storage back to a central location, the cost of managing that storage goes down and its utilization goes up.
However, two key obstacles have stood in the way of this goal.
The first challenge on the path to storage consolidation was creating a storage pool large enough to meet the demands of a geographically distributed enterprise. Today, increases in disk storage density and new high-speed fiber technologies have allowed storage vendors to create products (EMC's Symmetrix, Hitachi's Thunderbolt and BlueArc's Silicon Server) that are at last able to deliver the speed and scalability this requires.
The second hurdle has dogged every enterprise since the early days of WAN deployment. However much empty fiber does or does not lie buried in the ground, the ability to move large quantities of data between geographically disparate sites remains elusive, caused in part by the cost of large data pipes and in part by the unavoidable latency of any long-distance data transfer. Anyone who has tried to access data stored at another of their enterprise's locations has probably experienced this frustration firsthand. The problem is especially disruptive for enterprises like Broadcom, which has a number of regional engineering offices that need to share and collaborate on design data between sites. Doing this kind of distributed design over a wide area today is frustrating at best, impossible at worst.
These kinds of bandwidth and latency issues are analogous to the early days of the Web. In order to speed up Web transfers, companies like Inktomi and Network Appliance created caching appliances that could hold copies of Web pages and were placed (geographically speaking) much closer to the end user. This allowed for faster page retrieval and an overall better browsing experience. Today, we take this kind of caching as a given in our Web browsing experience.
Applying technologies like these Web caches to enterprise file access has been a long-standing goal. However, enterprise file caching is a far harder problem than Web caching: enterprise file protocols like CIFS and NFS are much more complex than HTTP, and the data must not only be read, as Web pages are, but often written as well.
Despite these challenges, a new generation of products is emerging from companies like Tacit Networks and DiskSites that, for the first time, allows file caching to be as transparent as Web caching. These new file caches fetch files from a centrally located storage pool out to the remote offices where they are installed, allowing the users there to read, modify and create files with the same performance they would see if they were co-located with the storage pool. These caching devices also offer compression technologies that make it much quicker to move modified files out of the cache and back to the main storage pool.
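The read-through and compressed write-back behavior described above can be sketched in a few lines. This is a minimal, hypothetical illustration: the class and method names are invented for this article and do not correspond to any vendor's actual product or API, and a dictionary stands in for the central storage pool.

```python
import zlib

class RemoteFileCache:
    """Illustrative sketch of a remote-office file cache (not a real product API)."""

    def __init__(self, central_store):
        self.central_store = central_store  # stands in for the consolidated central pool
        self.local = {}                     # files currently cached at the remote office

    def read(self, name):
        # Read-through: fetch from the central pool on a miss, then serve
        # repeat reads at local speed.
        if name not in self.local:
            self.local[name] = self.central_store[name]
        return self.local[name]

    def write(self, name, data):
        # Writes land in the local cache first, so the user sees local-LAN
        # performance; the modified file is then flushed back to head office.
        self.local[name] = data
        self._flush(name)

    def _flush(self, name):
        # Compress before crossing the WAN, shortening the write-back
        # transfer, then store the file back in the central pool.
        payload = zlib.compress(self.local[name])
        self.central_store[name] = zlib.decompress(payload)
```

In a real appliance the flush would be asynchronous and the WAN link would sit between `_flush` and the central pool; the sketch only shows the logical data flow.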
Through the installation of a central cache manager (located with the main storage pool) and sophisticated distributed data management schemes, the file caches allow users at remote sites to modify files without any risk that users at other remote sites or at the head office will be modifying the same file at the same time. Logically, the file never leaves the main storage pool.
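One simple way to picture the coherence scheme is as a central manager that hands out exclusive write access per file, so only one site can modify a file at a time. Again, this is a hypothetical sketch with invented names, not a description of any vendor's actual protocol:

```python
class CentralCacheManager:
    """Illustrative sketch of head-office write-access arbitration."""

    def __init__(self):
        self.holders = {}  # file name -> site currently holding write access

    def acquire_write(self, site, name):
        # Grant write access only if no other site holds it; this is what
        # prevents two offices from modifying the same file concurrently.
        holder = self.holders.get(name)
        if holder is None or holder == site:
            self.holders[name] = site
            return True
        return False

    def release(self, site, name):
        # A site gives access back once its modified file is flushed home.
        if self.holders.get(name) == site:
            del self.holders[name]
```

A second site asking to write a file that is already checked out is simply refused until the first site's changes have been written back to the central pool.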
Some of the benefits of this technology are obvious, in terms of the speed and usability it brings to distributed file operations. Other, more subtle benefits may well change the way remote-office data storage is designed in the future. By allowing fast file access to head-office data, file caches effectively remove the need for smaller NAS boxes or file servers at remote offices, reducing the capital expenditure and management costs those devices require.
Another bugbear for system administrators managing remote-office data is backup. If no qualified IT staff are present at the remote site, how can that site's data be kept safe? Again, the file cache removes this issue by storing files back to the consolidated head-office storage pool, where they can be backed up as part of the normal data protection routines there.
All in all, it seems that the new breed of file cache appliances is likely to change the way smaller remote-office data storage is handled in the future, as well as enabling a new wave of distributed data sharing and cooperation by allowing more and more data to be stored in a centrally consolidated pool.
This, hand in hand with the new generation of Web services, should make the headache of distributed data management a lot easier to deal with.
Copyright 2002, Blue Arc Corporation.
This was first published in June 2002