Published: 13 Oct 2003
Two years ago, Barry Brazil got the chance of a lifetime: to build an entire enterprise computing infrastructure for a large organization from scratch.
Brazil, a consultant, had been hired by Reliant Resources Inc., the newly deregulated energy generation and trading arm of Reliant Energy, in Houston, TX, as its enterprise architect. Because Reliant Resources was to be spun out from its parent, it would need a new IT environment. Servers, networks and 80TB of storage to support 8,000 users would have to be built, tested and deployed in only 14 months.
As a result of Reliant Energy spinning off Reliant Resources as a separate new company, "we got to start with a blank slate," says Brazil. "It gave us the opportunity to really sit down and do this right." On the storage side, Brazil and his team quickly decided doing it right meant building a multitiered, centrally managed storage network that could be shared by all of the company's applications. It would provision gigabytes seamlessly and efficiently--the way a utility delivers megawatts.
When Reliant Resources finally flipped the switch on Brazil's creation in January of 2001, the company's storage architecture came up well short of the virtualized storage utility Brazil originally envisioned. Though Brazil's design relied heavily on Fibre Channel- and IP-attached networked storage devices, he abandoned the original dream of a single pool of tiered storage and instead divided physical storage and management into separate silos by application type. Brazil decided to wait for storage nirvana, settling for a five-year plan to get there. Only now is Reliant moving on to Phase 2, breaking down the silos and physically consolidating networked storage from across the company's range of applications.
Lack of management tools
"We decided up front that we would have to start out in siloed configurations based on applications, even though that meant we would end up with a lot of stranded capacity," says Brazil. "In the end, we didn't really have a choice. The tools simply weren't there to allow us to manage across different tiers of storage from different vendors."
Brazil isn't the only one who's been deferring the dream of fully consolidated, centrally managed networked storage utilities. While most enterprises--particularly larger companies--have found it relatively easy to cost-justify replacing direct-attached storage (DAS) with a storage area network (SAN) or network-attached storage (NAS) devices, most have done so only on a piecemeal, application-specific or departmental basis. Few have tried to manage those dispersed SAN islands as if they were a single resource. Even though most organizations report marginal benefits from this piecemeal approach--improved utilization rates and quicker provisioning, for example--the proliferation of SAN and NAS devices has increased complexity and driven up storage management costs.
"Over the past 18 months, there has been significant growth in siloed SANs," says Steve Kenniston, an analyst with the Milford, MA-based Enterprise Storage Group. "Every CIO has stood up and beat his or her chest to say what great ... utilization they are getting. Then someone inevitably asks, 'Well, are they all tied together, and are we getting these benefits [uniformly], and is it all managed centrally?' And the answer today is no."
Tying the silos together
Many organizations are now beginning to plot out a roadmap for consolidating their siloed SANs and managing them centrally. But getting there won't be easy. First, the software standards and tools organizations need to manage heterogeneous networked storage devices centrally are just beginning to appear, and those that are here don't work seamlessly across all heterogeneous gear. Second, some storage managers say the ROI cases used to justify Phase 1--siloed SAN and NAS deployments, which can nearly double usable storage capacity compared with direct-attached storage--won't necessarily work for Phase 2. And third, companies looking to manage storage in a consistent, centralized way will need to establish enterprisewide storage policies and practices, and convince independent business units to adopt them. At many organizations, comprehensive storage policies don't yet exist.
Why are organizations such as Reliant willing to buck those challenges and begin moving up the storage hierarchy to the second phase of networked storage? Because, say storage managers, there's plenty of room to improve on benefits that came with the first phase of networked storage deployments. Plus, consolidated, centrally managed SAN environments should deliver a new set of benefits. Storage managers, for example, expect to easily perform continuous migration of data to the storage medium that delivers exactly the right cost and performance characteristics regardless of where it is located in the enterprise. They also expect to use centrally managed storage utilities to fine-tune disaster recovery procedures.
Take Phase 2 of Reliant's five-year networked storage plan. By consolidating previously siloed SAN and NAS storage, Brazil says, the company expects to be able to improve enterprisewide storage utilization by at least 15 percentage points. Phase 1 of Reliant's SAN/NAS rollout enabled the company to get to approximately 60% utilization, Brazil says. That's well above what most companies see with DAS, which, analysts say, averages between 20% and 50% of capacity.
Adding up the savings
Phase 2, now underway, will bring that 60% up to at least 75%, Brazil predicts. What's that mean in dollars? With 250TB of usable storage deployed today, improving utilization by 15 percentage points will save the company 37.5TB, Brazil says. According to industry sources, Reliant is spending in the range of $200,000 per terabyte per year. This figure includes purchasing new storage, provisioning it, managing it and depreciating it. So for 250TB, Reliant is spending about $50 million a year. Recapturing 37.5TB saves the company $7.5 million a year, according to sources.
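The arithmetic behind those figures can be checked in a few lines. The sketch below uses only the numbers quoted above (250TB of usable storage, roughly $200,000 per terabyte per year and a 15-point utilization gain); the function and variable names are illustrative, and the per-terabyte cost is the industry-source estimate rather than a figure confirmed by Reliant.

```python
# Figures quoted in the article; the cost per terabyte is an
# industry-source estimate, not a number confirmed by Reliant.
USABLE_TB = 250
COST_PER_TB_PER_YEAR = 200_000   # dollars
UTILIZATION_GAIN_POINTS = 15     # e.g. 60% -> 75%


def annual_spend(usable_tb, cost_per_tb):
    """Total yearly storage spend in dollars."""
    return usable_tb * cost_per_tb


def recaptured_tb(usable_tb, gain_points):
    """Capacity freed up by a utilization gain given in percentage points."""
    return usable_tb * gain_points / 100


def annual_savings(usable_tb, gain_points, cost_per_tb):
    """Dollar value of the recaptured capacity at the same per-TB cost."""
    return recaptured_tb(usable_tb, gain_points) * cost_per_tb


print(f"Annual spend:  ${annual_spend(USABLE_TB, COST_PER_TB_PER_YEAR):,}")
print(f"Recaptured:    {recaptured_tb(USABLE_TB, UTILIZATION_GAIN_POINTS)} TB")
print(f"Annual saving: "
      f"${annual_savings(USABLE_TB, UTILIZATION_GAIN_POINTS, COST_PER_TB_PER_YEAR):,.0f}")
```

Note that the calculation treats every recaptured terabyte as worth the full blended cost, which is the same simplification the quoted estimate makes.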
In the first phase of its networked storage rollout, Reliant created two separately managed SAN/NAS silos, one for the company's Unix systems--running mostly transaction-oriented databases--and one for its Intel/Windows-based systems--storing mostly user files. Within each silo, Brazil's team created three graduated storage tiers. On both the Unix and Wintel sides, EMC Symmetrix SANs served as the high-performance tier. JBODs provided low-end, entry-level storage for both silos. In the middle tier, however, Reliant selected different storage products for each silo: Hewlett-Packard StorageWorks NAS devices for the Wintel side and Sun StorEdge T3 arrays for the Unix side. Although Reliant originally wanted to use Network Appliance (NetApp) NAS devices as the midtier storage technology for both silos, at the time, Brazil says, the company's software tools weren't up to managing both shared files and databases running on different operating systems.
Phase 2 of Reliant's storage plan--expected to be completed in the third quarter of this year--will break down the silos, physically consolidating Unix and Wintel storage on common platforms at each of the three tiers, and will allow Reliant to manage each tier with a single set of software tools. By creating larger storage pools at each tier that are shared by more applications, Brazil says, Reliant expects to improve efficiency and boost utilization. Most of the savings in Phase 2 will come from these efficiencies, he says. Reliant will stay with Symmetrix at the high end.
One of Reliant's Tier 1 applications is the customer care and service (CCS) utilities application from SAP AG, which Reliant's customer service agents use. The CCS gets Tier 1 status--and uses Symmetrix storage--because it generates high transaction volumes and needs to be up and running at all times, Brazil says. Service levels for the CCS application call for full disaster recovery and data restore within 24 hours. In addition to residing on Symmetrix storage, CCS data is replicated twice--once to a separate DMX frame using EMC's Symmetrix Remote Data Facility (SRDF) software and again using EMC's business continuance volume (BCV) feature. At the other end of the scale, infrequently accessed Microsoft Word files reside on the JBODs. For the middle tier, Reliant has put out an RFP in search of a single NAS vendor, and is once again considering NetApp.
Next up: central management
Phase 2 will bring Reliant large utilization benefits, but it doesn't represent the storage utility end point that Brazil is aiming for. While Phase 2 will consolidate networked storage at each tier, it won't consolidate management under a single set of storage management software tools.
That's because there's still no single, heterogeneous set of tools that can both aggregate storage resource information and reach down to manage individual storage devices as well as the proprietary tools that come from storage hardware vendors. For that, says Brazil, Reliant expects it will need to wait until 2005, when it launches Phase 3 of its networked storage consolidation plan.
Elevated storage utilization is only one target of companies beginning to push toward consolidated, centrally managed networked storage. Besides being able to use more of the storage space they have bought, organizations such as Deloitte Consulting also see an opportunity to save money by making sure that every file and volume of data is being stored at all times on the storage medium that delivers just the right performance and cost characteristics to match business and application service level requirements.
"We want to be able to dynamically allocate disk and constantly reassign data to different disk using hierarchical storage management software so that we know we are using the most expensive disk for the most important tasks," says Eric Eriksen, CTO, Deloitte Consulting. "And we want to be able to do that across all of our data centers around the globe."
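The kind of policy Eriksen describes can be pictured as a simple placement rule. The sketch below is a minimal illustration, not Deloitte's software or any vendor's HSM product: the tier names, the 30- and 180-day thresholds, the `FileRecord` type and the sample paths are all assumptions made up for the example. It compares where each file sits today with where an age-based rule says it belongs and lists the moves needed.

```python
from dataclasses import dataclass
from datetime import date

# Hypothetical tier names and age thresholds, chosen only for illustration.
HIGH, MID, LOW = "high-performance", "midtier", "low-cost"


@dataclass
class FileRecord:
    path: str
    last_access: date
    current_tier: str
    business_critical: bool = False


def target_tier(record, today):
    """Pick a tier from days since last access; critical data stays on top."""
    age_days = (today - record.last_access).days
    if record.business_critical or age_days <= 30:
        return HIGH
    if age_days <= 180:
        return MID
    return LOW


def plan_migrations(records, today):
    """Return (path, from_tier, to_tier) for every file on the wrong tier."""
    moves = []
    for record in records:
        wanted = target_tier(record, today)
        if wanted != record.current_tier:
            moves.append((record.path, record.current_tier, wanted))
    return moves


today = date(2003, 10, 13)
inventory = [
    FileRecord("/finance/ledger.db", date(2003, 10, 1), HIGH),
    FileRecord("/projects/q2-report.doc", date(2003, 6, 1), HIGH),
    FileRecord("/archive/2001-memo.doc", date(2002, 1, 1), MID),
]
for path, source, destination in plan_migrations(inventory, today):
    print(f"{path}: {source} -> {destination}")
```

A real HSM product would also weigh service-level requirements and the cost of moving data between sites, which is exactly the cross-vendor, cross-data-center capability Eriksen says is still missing.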
Deloitte operates five data centers, including its main corporate data center in Philadelphia. Historically, each of those data centers has been responsible for running regional applications and managing its own storage. In 2000, however, Eriksen embarked on a two-pronged consolidation strategy. Part 1 involved pulling operation of some enterprise applications and storage into the main Philadelphia data center. Part 2 linked Deloitte's three largest data centers with a high-speed hub-and-spoke network and deployed standard storage management tools and processes throughout.
"We've been on a two-and-a-half-year quest to stamp out individual solutions in the different data centers and to do everything--including managing storage--on a worldwide basis, making the global data center the master," says Eriksen.
So far, Eriksen has been partly successful. He's consolidated some applications into the Philadelphia data center, but he hasn't been as successful at finding HSM and other tools that can be used to efficiently migrate storage throughout his mixed-vendor, worldwide network. As a result, he says Deloitte still doesn't have global visibility--much less seamless management--of its storage resources.
Currently, Eriksen is using Veritas SANPoint Control software to monitor utilization and other aspects of his heterogeneous SAN environment, and to do some basic storage allocation and configuration. But, Eriksen says, SANPoint Control is far less useful than native device managers for low-level, device-specific management tasks such as LUN masking. So in most cases, Eriksen has been forced to license and use hardware-specific native tools as well.
"We'd like to be able to abstract everything to a common tool like SANPoint Control," Eriksen says, adding, "so far it has not proven to be able to do all the things we need." He thinks that as industry standards become more established, vendors will deliver a more robust set of management tools within two years.
Many managers see another benefit to consolidating networked storage: improved data availability and disaster recovery. Three years ago, when First American Trust Federal Savings Bank, Santa Ana, CA, deployed 350GB of NetApp NAS storage to replace direct-attached disks storing the company's Word, Excel and other user files, CTO Henry Jenkins decided almost as an afterthought to use NetApp's snapshot technology to back up files on a regularly scheduled basis.
"We figured if somebody screwed up an Excel or Word file, we could go back to the last hourly snapshot and maybe retrieve it for them," says Jenkins.
As the bank expanded the use of the NAS--first replacing DAS on Linux servers, and eventually using NAS to store SQL Server database volumes generated by critical CRM applications--First American Trust began to rely on networked storage to guarantee data availability.
On its Linux servers, First American clustered the NAS storage to protect against storage-related failure. And the company supported the database applications with a hot-spare setup because SQL Server couldn't be supported with NetApp's clustering implementation. Jenkins says that's because SQL Server clustering at that time required a quorum drive.
First American recently updated its disaster-recovery plan, placing NetApp's clustering and SnapMirror technologies in a central role. The company built a backup data center in San Diego--connected to the Santa Ana data center via a T1 line--with fully mirrored NAS storage that's updated on a regular basis using SnapMirror. Altogether, Jenkins says, 85% of First American Trust's enterprise data--1.25TB in all--has been consolidated onto the clustered, mirrored NAS setup.
Now, says Jenkins, First American is planning to migrate remaining legacy systems not supported by NetApp--such as its AS/400-based trust accounting system--to a Windows environment in part so it can be incorporated into the disaster-recovery setup.
"At first, availability and disaster recovery weren't a big part of justifying the storage consolidation," says Jenkins. "Now they're a very important part of the equation."
An important and often overlooked first step, say storage managers and analysts, is to create consistent storage operational practices and processes and make sure they get adopted across the enterprise. At many companies, storage and network administrators at different sites often use different volume and file naming conventions. In addition, rules aren't consistent regarding which types of data are assigned to what class of storage, when additional storage capacity is provisioned and when or how data is archived.
Says Deloitte's Eriksen: "We need not just standard tools, but also processes so that it all looks and feels the same."
Managers at the $27-billion defense contractor Lockheed Martin, headquartered in Bethesda, MD, have come to the same conclusion. At Lockheed Martin, as part of a three-year effort to roll out a centrally managed SAN-based storage infrastructure, Stephen Hightower, director of infrastructure services, has overseen the development of new storage operating and storage delivery models.
A common storage model
The common operating model defines a standard menu of storage services available to business managers and the various technologies that are to be used to deliver each level of service. As part of that definition, the number of primary storage vendors used by Hightower's group is being reduced from six to two. The service delivery model defines how Lockheed Martin's storage group will be organized--and where and how storage resources will be deployed--to deliver the service levels required by the business at the lowest cost.
"It looks at things like how do we consolidate storage regionally and manage it centrally," says Hightower.
Lockheed Martin is still developing both models. They are expected to be deployed in the company's primary data center in Sunnyvale, CA, by 2005 and will also be available to managers of data centers run by other Lockheed Martin business units.
Experienced consolidators say it's also critical to take a comprehensive inventory of the storage service level requirements of all key applications. That's because as management of SANs and other networked storage resources is consolidated, more applications will be competing for the most desirable storage.
"You need an objective way to decide what tier of storage each application should have based on its performance, availability, recovery needs and how fast its storage needs are growing," says Reliant's Brazil. That's what his team created in Phase 1. After developing a service level profile and storage cost estimate for each application, Brazil asked business owners which tier of storage they wanted and would be willing to pay for. Usually business managers agreed with his team's choices.
The service level definitions originally developed by his team at Reliant will now be used to decide which data goes where once Phase 2 goes into place.
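An "objective way" to pick a tier can be as simple as a rule applied to each application's service level profile. The sketch below is an illustration only: the field names, thresholds and sample figures are assumptions, not Reliant's actual criteria. It uses the four factors Brazil names (performance, availability, recovery needs and growth) to place an application in one of three tiers.

```python
from dataclasses import dataclass


@dataclass
class ServiceProfile:
    name: str
    transactions_per_sec: int    # performance need
    required_uptime_pct: float   # availability need
    recovery_hours: int          # maximum acceptable time to restore
    annual_growth_pct: float     # how fast storage needs are growing


def assign_tier(profile):
    """Map a profile to tier 1, 2 or 3 using hypothetical thresholds."""
    if (profile.transactions_per_sec >= 500
            or profile.required_uptime_pct >= 99.9
            or profile.recovery_hours <= 24):
        return 1
    if profile.required_uptime_pct >= 99.0 or profile.annual_growth_pct >= 50:
        return 2
    return 3


# Sample profiles with made-up numbers, loosely modeled on the applications
# mentioned in the article.
profiles = [
    ServiceProfile("customer care (CCS)", 1000, 99.9, 24, 30.0),
    ServiceProfile("departmental file shares", 5, 99.0, 72, 40.0),
    ServiceProfile("archived Word files", 1, 95.0, 168, 10.0),
]
for profile in profiles:
    print(f"{profile.name}: tier {assign_tier(profile)}")
```

In practice the rule's output would be a starting recommendation; as Brazil describes, business owners still choose the tier they are willing to pay for.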
The success of a networked storage consolidation project often rides on the types of applications and data you first select for consolidation. First American Trust, for example, started with Word, Excel and other user files which, if lost or compromised, would not have threatened the company. Those files represented about 15% of First American Trust's total storage.
And when he decided to expand use of the NetApp NAS devices, First American Trust's Jenkins began with a brand new Linux-based wire transfer application. "It was perfect because it didn't have any legacy storage attached to it, but it was something that we really didn't want to go down," says Jenkins. "So we were able to justify linking it into the NAS network by focusing on the clustering capabilities."
Only once he'd successfully consolidated user files and data from the wire transfer application onto NAS did Jenkins attempt the SQL Server database volumes. And again, Jenkins sold the networked storage consolidation project as a key piece of a new, companywide disaster-recovery project.
"Justifying storage consolidation is a whole different ball game compared to justifying initial SAN or NAS deployments," says Carl Greiner, infrastructure services senior vice president at Meta Group in Stamford, CT. "The smart thing to do is to tie it to a larger corporate initiative like disaster recovery."
Even for companies selecting the right projects and taking pains to establish standard storage processes and service level definitions in advance, getting to consolidated, centrally managed networked storage will be a huge challenge. And the biggest roadblock, say many managers, is the lack of robust, heterogeneous storage management software tools.