When it comes to data storage, there is no such thing as enough.
Just ask Marshall Gibbs, director of IT for Information Resources, Inc. (IRI). He manages the data warehouse and mining operation at IRI, a provider of consumer intelligence to companies in the pharmaceuticals and consumer packaging sectors. With 122TB of active data residing on the company's SAN, Gibbs is challenged to deliver more data, more quickly and reliably, than ever before. The stakes are high for both IRI and its customers. "These are multi-, multi-million-dollar business decisions they are making based on insights we provide," Gibbs explains. "It's very mission critical information to them. And they will tell you and Wall Street will tell you, if the data goes dark for a little while with one of these retailers, their stock tanks."
But the problem hasn't been deploying enough storage to handle fast-growing databases. It's been getting that data and analytics to customers quickly enough to deliver a competitive advantage.
"The Symmetrix [8530 units] have been extremely solid for us in terms of overall performance, but the fact is we couldn't get enough data off them fast enough," says Gibbs, adding "we've been forced to spread storage across multiple chassis to get enough throughput."
Instead of waiting for a successor to the fifth-generation Symmetrix, many storage managers have bought from Hitachi Data Systems (HDS) and IBM, causing Symmetrix users to ask whether the vaunted "Symm" was still the Rolls-Royce of storage. Would they get better performance from the HDS Lightning 9900 V, with its cross-bar architecture? Should they bring in the IBM Enterprise Storage Server for its better pricing?
A first look at the new Symmetrix suggests that EMC has dramatically improved performance and value. The buzz over the last year has been that "Symm 6" would be nothing new, but it is, in fact, a radical departure from any existing storage subsystem architectures.
King of the hill
This month, EMC moved to stop the bleeding, unveiling its sixth-generation Symmetrix under the moniker Symmetrix DMX. The new array has an interconnect architecture--called a direct matrix--that uses up to 128 point-to-point connections between cache memory and the front-end and back-end controllers, an approach that not only eliminates the bottlenecks of the previous Symm's bus architecture, but seems to outpace advanced switch-based solutions.
"From a hardware perspective, this is a major architectural change," says Robert Passmore, research director at research firm Gartner. "This is a major leapfrog over the competition in terms of scalability and performance, and it is much needed."
The specs tell part of the story. Symmetrix DMX is rated for 70.4GB/s of peak internal bandwidth, a huge leap over the Symmetrix 8000's 1.6GB/s rating. It also boosts cache throughput to 16GB/s from 6.4GB/s, and back-end I/O to 12.8GB/s from 5.1GB/s. Factor in higher drive parallelism through the use of up to 64 drive loops over 2Gb Fibre Channel connections, and EMC claims that Symmetrix DMX has data transfer rates five or six times higher than those of the 8000 family.
For now, the Symmetrix DMX is king of the performance hill. It vaults over the published specifications of the HDS Lightning 9900 V family, with four times the rated internal bandwidth of the high-end HDS Lightning 9980 V (15.9GB/s). The Symmetrix DMX supports up to 128GB of global memory and 32 concurrent cache regions, vs. the 9980 V's 64GB and four concurrent cache regions. On paper, these specs give Symmetrix DMX a big edge in environments where responsiveness is limited by bandwidth.
The specs also give EMC a leg up in the high-end storage array market.
"For the last couple of years, Hitachi has done an excellent job marketing themselves as a superb technology against an aging Symmetrix architecture," says Tony Prigmore, senior analyst for industry research firm Enterprise Storage Group. "This announcement shifts that."
Risk and reward
Building the new Symmetrix around a new architecture was a big risk for EMC. EMC Vice President Chuck Hollis says the company searched for two years for technologies to replace the old Symmetrix. In 2000, it committed to a matrix architecture rather than a switch architecture, setting EMC up for some hard times in 2002.
"We could have had a bus-switch based hybrid out a couple years ago," says Hollis. "We knew that 2002 would be a tough year in the high end. We knew that we'd be late in the market and that we'd have to play defense out there."
A May 2002 report from industry analyst AG Edwards warned that "EMC must aggressively increase the functionality, capacity and performance of its solutions to remain competitive." With the matrix architecture, EMC hopes to do exactly that.
The direct matrix architecture of the Symmetrix DMX does away with intervening controllers like those in a crossbar-switch architecture, enabling direct, point-to-point connections between cache memory and the front-end and back-end controllers. The result is 64GB/s of aggregate internal bandwidth--far more than the Symmetrix 8000 offers.
Here's the CliffsNotes version: The matrix backplane provides a dedicated, physical link between each cache controller and every disk and front-end controller.
"When [data] comes in over a bus, you have to ask questions about where it's coming from and where it's going," says Hollis. "On the matrix it's all path to path. If it's on Wire 7, there is only one place where it can be coming from and one place where it can be going."
The hardwired interconnects help eliminate bus or switch controller latency and enable multiple parallel transfers that approach wire speed, says Hollis. This architecture enables controller intelligence to be distributed around a globally shared cache, an approach that until recently would have been too expensive.
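Hollis's "Wire 7" point can be illustrated with a toy model. The sketch below is not EMC's design; the director and cache-board names and counts are hypothetical and chosen only to keep the table small. It assigns one dedicated link to every director/cache pair, so a link number alone identifies both endpoints and no arbitration step is needed.

```python
from itertools import product

def build_matrix(directors, cache_boards):
    """Give every (director, cache board) pair its own dedicated link ID."""
    pairs = product(directors, cache_boards)
    return {link_id: pair for link_id, pair in enumerate(pairs)}

# Hypothetical component names, for illustration only.
directors = ["FE0", "FE1", "BE0", "BE1"]   # front-end and back-end directors
cache_boards = ["C0", "C1"]

links = build_matrix(directors, cache_boards)

# On a shared bus, a transfer must carry (and be checked for) its source
# and destination. On a point-to-point matrix, the link ID implies both.
source, destination = links[7]
print(f"Wire 7 connects {source} to {destination}")
print(f"Total dedicated links: {len(links)}")
```

The number of links grows as directors times cache boards, which is the cost Passmore alludes to when he asks how all those drivers and receivers can be afforded.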
"The concept of the matrix has only in the last few years been technologically feasible," says Passmore. "What EMC has done is move the electronics from the internal parts of the switch and move them out to the edge. The first question you have is: How in the world do I afford all those drivers and receivers? And the answer goes back to silicon design and components."
Published Symmetrix DMX benchmarks simulating a data warehouse workload may set high expectations among customers. According to EMC, a high-end Symmetrix DMX2000 array delivers 20,000 IOPS with sustained response times of just over 3.5 ms. A comparable HDS Lightning 9980 V was benchmarked at about 6,000 IOPS with sustained response times just under 3.5 ms.
High-flying specs may grab headlines, but most IT managers are more interested in flexibility and cost questions. Symmetrix DMX offerings range from the low-end DMX800, which shares a modular cabinet design with EMC's midrange Clariion, to the Symmetrix DMX2000-P with 128GB global cache and 64 disk channels (see "Symmetrix DMX Model Offerings"). ESG analyst Prigmore singles out the Symmetrix DMX800 as a potent offering that should deliver performance and scalability on par with the Symmetrix 8830 at a lower entry price.
"We expect to see the DMX800 go into places where Symmetrix has not been in the past," Prigmore says. "We see the DMX800 as having the most potential."
The DMX800 fills the space just above EMC's Clariion line, making enterprise-class business continuity features economical outside of large data centers. A company could deploy Symmetrix DMX800 units in regional offices, for example.
At the heart of the DMX family are the Symmetrix DMX1000 and DMX2000 products. The DMX1000 supports up to 144 disk drives, 48 front-end ports, 64GB of global cache memory and 18.4TB of usable capacity. The dual-cabinet DMX2000 doubles each of these specifications. EMC expects many DMX units to be configured for parity RAID operation, which protects data while maximizing disk utilization. Optimized RAID operations in the new DMX models help these systems approach the performance of mirrored configurations, but yield 75% more usable capacity.
While pricing was not available at press time, EMC says the list price for a DMX1000 with 12TB of usable capacity should run about one-third less than the price of a high-performance Symmetrix 8830 sold last year. And who pays list these days for storage?
EMC plies the most demanding customers with the performance-minded DMX1000-P and DMX2000-P. These models share cabinet configurations with their non-P counterparts, but enable higher back-end bandwidth with double the number of drive channels. Where the DMX1000 and DMX2000 employ 16 and 32 2Gb Fibre Channel drive channels, respectively, the DMX1000-P and DMX2000-P are equipped with 32 and 64. High-performance models will typically be configured with mirrored disks for maximum responsiveness. With disk mirroring, usable capacity maxes out at 10.5TB or 21TB, 43% less than on the non-P versions.
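The capacity percentages quoted for parity RAID and mirroring can be checked with a few lines of arithmetic. The sketch below uses only the usable-capacity figures given in the text; the 36.8TB figure for the DMX2000 is derived from the statement that it doubles the DMX1000's specifications, and the dictionary layout and function name are illustrative choices.

```python
# Usable capacity in TB: parity RAID on the standard models versus
# mirrored disks on the -P models. 36.8 is derived (2 x 18.4).
USABLE_TB = {
    "DMX1000": {"parity": 18.4, "mirrored": 10.5},
    "DMX2000": {"parity": 36.8, "mirrored": 21.0},
}

def pct_change(new, old):
    """Percentage change going from old to new."""
    return (new - old) / old * 100

for model, cap in USABLE_TB.items():
    gain = pct_change(cap["parity"], cap["mirrored"])   # parity vs. mirrored
    loss = pct_change(cap["mirrored"], cap["parity"])   # mirrored vs. parity
    print(f"{model}: parity RAID gives {gain:.0f}% more usable capacity; "
          f"mirroring gives {abs(loss):.0f}% less")
```

Both figures describe the same gap from opposite directions: roughly 75% more capacity going from mirrored to parity, and roughly 43% less going the other way.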
Unfortunately for mainframe users, Symmetrix DMX doesn't yet support FICON. The Symmetrix 8000 family has included FICON Fibre Channel ports for some time, but DMX models won't gain FICON capability until the third quarter of 2003. The delay poses an early challenge for the promising new architecture. Says Hollis: "This was the most important feature I wish was in the box at launch that wasn't."
ESCON, however, will be available for DMX at launch. EMC will install Symmetrix 8000 units for DMX customers until FICON-capable systems are available. For installed DMX units, the company will provide FICON daughtercards. Whether users will be willing to accept a delay remains to be seen.
A change of heart
The Symmetrix DMX represents an important--and overdue--shift in EMC's market approach. Gartner's Passmore singles out EMC's practice of maintaining high list prices and using aggressive negotiating tactics.
"The [Symmetrix] 8000 series was kind of a one-size fits-all product line. Even though they had different models, the salesmen usually jumped in to sell the biggest box," Passmore explains. "They are a much more enlightened company today. They've been listening to the users and listening to the analysts, and the message has been to stop playing games."
Nowhere is the change more evident than in EMC's new disk drive pricing structure. Before, Symmetrix disks sold at an exorbitant premium over the off-the-shelf 10,000rpm SCSI disks used in the Clariion. Today, the Symmetrix DMX and Clariion lines use the same standardized disk drives.
"You could pay anything from $11,000 to $18,000 dollars for a [73GB] drive," says Passmore. "If you bought a small handful of drives for the Clariion today, you'd certainly be talking under $1,500."
The fact is, EMC had to do something about its pricing. Cash-strapped businesses have seen IT budgets slashed, and IBM remains a keen price competitor, Passmore says. EMC had already begun aggressively discounting its gear in advance of the DMX release, and that trend should continue. Gibbs says that IRI, for one, has welcomed the change.
"Across my organization, with some very experienced data center folks, they will tell you to a man that the way EMC is going to customers is changing, and significantly for the better."
But other challenges await EMC, Prigmore says. In particular, it must convince IT shops that it is more than a hardware provider, even as it sells a new class of hardware to customers.
"It is easier for the company to train and educate their clients on the new Symmetrix family than to orient them on new software," says Prigmore. "In order for the ship to turn, [EMC] must do both."
For more than a year, EMC has argued that the aging bus architecture in the Symmetrix 8000 line had plenty of headroom to deliver top-line performance. No surprise, the tune has changed at EMC. The question is: Where does Symmetrix DMX go from here?
"Even if technology innovation stopped today, we could improve performance," EMC vice president Chuck Hollis says.
Today each I/O director in the Symmetrix DMX employs eight dedicated matrix links, which Hollis says can be doubled to 16 links without changing the surrounding architecture. Total cache can also be quadrupled to 512GB, four times the largest cache currently specified for the DMX2000 line. Finally, the architecture is designed to enable drive counts well beyond the current 288-drive limit.
"Two-thousand and forty-eight disk drives--that's an exercise in sheet metal. How big do you want it?" asks Hollis. "We think the market will want that class of configuration over the next few years, as they learn that [performance is] disk drive limited."
EMC readily admits that the new architecture leaves plenty of room open for software optimizations. Hollis singles out pipelining that would enable cache cards to read data off one end while writing on the other.
"We're getting very good performance today, but as you learn about an architecture, there is more to do," says Hollis. "We could do a performance roadmap for the next 18 months just on software updates."
Stay the course
One thing EMC couldn't afford was to compromise its strong suite of software and tools. The Symmetrix DMX runs an updated version of EMC's Enginuity operating environment as a common platform for EMC software--including SRDF and TimeFinder--that runs on old or new Symmetrix equipment. The new Enginuity code, says Hollis, focuses on managing changes in the way the matrix architecture transacts data.
"The first thing is all the bus arbitration logic goes away," Hollis explains. "The second thing is that we had to teach [Enginuity] about increased parallelism. We used to have a small number of buses. Now we have eight channels per board. We went from 16 memory regions to 32."
Like any operating system, Enginuity schedules and manages the flow of data through Symmetrix. Enginuity contains, for example, the CRC error-checking code that ensures end-to-end data integrity, as well as performance-optimizing algorithms. EMC also tuned disk mirroring and parity RAID operations, and it claims the resulting parity configurations nearly match the performance of mirrored disks on the Symmetrix 8000.
Of course, EMC must ensure that the Symmetrix DMX line maintains the highest level of reliability. "[DMX] goes into a customer set that has no tolerance for defects and failures," says Passmore.
Steps EMC has taken to ensure reliability include the use of triple modular redundancy with majority voting (TMR-MV) to verify the proper operation of key components in the infrastructure. TMR-MV removes a single point of failure by requiring that at least two of three modules produce identical output. If one module is out of agreement, its output is rejected and the module is flagged as faulty. The three modules link to redundant voters, which route data to cache. If one voter goes down, the second remains online to route the output.
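The voting logic described above is simple enough to sketch in a few lines. This is a generic illustration of two-of-three majority voting with a redundant voter, not EMC's implementation; the function names and the way voter availability is represented are assumptions made for the example.

```python
from collections import Counter

def majority_vote(outputs):
    """Return (agreed_value, faulty_module_indices) for three module outputs."""
    if len(outputs) != 3:
        raise ValueError("TMR expects exactly three module outputs")
    value, count = Counter(outputs).most_common(1)[0]
    if count < 2:
        # All three disagree: no two-of-three majority exists.
        raise RuntimeError("no majority among module outputs")
    faulty = [i for i, out in enumerate(outputs) if out != value]
    return value, faulty

def route_to_cache(outputs, voters_online=(True, True)):
    """Vote using whichever redundant voter is still online (hypothetical model)."""
    if not any(voters_online):
        raise RuntimeError("both voters are offline")
    return majority_vote(outputs)

# Module 2 disagrees, so its output is rejected and it is flagged as faulty.
value, faulty = route_to_cache([0xA5, 0xA5, 0x5A], voters_online=(False, True))
print(f"routed value: {value:#x}, faulty modules: {faulty}")
```

In this example the first voter is marked offline and the second still delivers the majority result, mirroring the failover behavior described in the text.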
Gibbs says IRI has had no trouble ramping up the new equipment alongside the existing Symmetrix 8730 arrays. What's more, he says the new equipment is proving more flexible and easier to configure than the existing gear, singling out dynamic handling of drive geometries and disk volumes in the DMX product. A switch from a disk mirroring configuration to parity RAID, for example, was seamless.
"The nice thing about the [Symmetrix DMX] is that it is finally approaching the storage-on-demand model, where you can truly add storage and capability as needed," Gibbs says. "In the [old] architecture, once you have the configuration laid out, incremental change in the storage throws it all out of whack."
It's still early for the Symmetrix DMX, and ultimately, its success will be determined by its ability to solve user problems. But the early returns are promising. After a long, long wait, it looks like the Symmetrix is back.
"The matrix architecture is elegantly simple," says Passmore. "The best inventions have simplicity to them and I think this is one of those things."