HDS reinvents high-end arrays

Hitachi Data Systems' groundbreaking new arrays don't just offer eye-popping capacity and performance--they can also virtualize petabytes of external storage.

[Sidebar: TagmaStore Universal Storage Platform configurations]

When a major hardware storage vendor says it has big news, you expect bigger iron with eye-popping specs. In that respect, Hitachi Data Systems (HDS) doesn't disappoint with its new TagmaStore Universal Storage Platform (USP) product line. The three new array models, which soar to a maximum capacity of 330TB, dwarf the top-end capacities of EMC Corp.'s DMX 3000 (84TB) and IBM Corp.'s ESS 800 (56TB), and they span and surpass HDS' current Lightning 9970V and 9980V products.

Performance, too, has been enhanced, to the point where the entry-level model of the new line matches the performance of the current high-end Lightning 9980V. HDS' Hi-Star architecture has been further refined--and redubbed the Universal Star Network, or U-Star--with reworked caching designs that yield substantial improvements in cached bandwidth, number of data paths and cache memory operations, among other parameters.

But bigger and faster isn't the whole story. Besides delivering high-performance storage, the three USP models can act as a front man to other external storage devices and virtualize those boxes to create a single pool of storage. HDS says that up to 32 petabytes (PB) of external storage can be managed in this manner, all under the single umbrella of the company's HiCommand storage management software and other application suites.

"The sheer scalability, the performance, the number of IOPS, the amount of storage it can manage behind it," says Tony Asaro, Enterprise Strategy Group's senior analyst, "all those things make it compelling."

The capabilities of the new HDS arrays have piqued the interest of Gary Pilafas, manager of enterprise architecture and infrastructure for United Airlines in Schaumburg, IL. With approximately 200TB of installed storage--150TB of which is HDS--Pilafas' shop was one of a handful of companies that took early delivery of USP arrays. Pilafas plans to rigorously test the modestly configured 5TB USP to compare its performance to that of the HDS Lightning 9960 and 9980 arrays United has installed. Pilafas also hopes to use the USP box to consolidate other storage: "We have some older EMC technology that I would like to virtualize," he says.

The significance of the USP line is apparent, and it blazes a trail for where enterprise-class storage systems are headed. Virtualizing already-installed storage isn't new, but doing it from a high-performance array and managing it all from a single pane of glass clearly raises the bar, and redefines the role of a high-end storage platform. "This is the first time we've seen a major supplier take their controller architecture and support a heterogeneous environment," says John McArthur, group vice president of worldwide storage research at IDC.

But before anyone gets too carried away with all of the superlatives, not all of USP's announced functionality is available out of the gate. Initially, the top-of-the-line USP1100 has a maximum raw capacity of 165TB, which HDS says will increase to 330TB when the Hitachi Ultrastar 10K300 300GB disks, announced last February, become an option in December.

At this time, virtualization capabilities are somewhat limited, too. Initially, the external boxes that a USP system can manage are restricted to other HDS arrays; the company plans to add support incrementally for other vendors' storage systems. HDS says the first enhancement will come in December. Claus Mikkelsen, senior director for storage applications at HDS, says the specific storage systems that will be supported haven't been determined yet, but support for EMC DMX and IBM Shark arrays is "high on the list." Mikkelsen adds that there are "a lot of service and maintenance issues that need to be worked out for each of the different vendors' products." HDS says customer demand will drive which devices are supported first.

The TagmaStore external virtualization capabilities are available as a separately licensed option. And regarding virtualization, HDS isn't likely to be alone for long. "They're not going to be the only one," says John Webster, senior analyst and partner at the Data Mobility Group. Webster says other vendors will soon introduce similar products that are "robust ... not the little appliances built on Windows NT virtualization machines."

[Sidebar: HDS USP1100 vs. Lightning 9980V]

The product line
There are three models in the USP line--the USP100, 600 and 1100. Their architecture is similar and they range from one to four frames. The entry- and mid-level versions are upgradeable and can ultimately be configured to the capacity and performance level of the top-of-the-line USP1100 model. Disk options for all three USPs include 73GB and 146GB drives, with the 300GB disk expected to be available before the end of the year.

All are built around the U-Star architecture--previously known as Hi-Star--the key to HDS' claims of vastly improved performance. Like earlier incarnations of HDS' crossbar switch architecture, the U-Star architecture connects front-end ports to back-end controllers and disks to cache. This iteration includes upgraded processors and features twice the number of data paths compared to the current Hi-Star implementation in HDS' Lightning systems; U-Star can also handle four times the number of concurrent cache operations. HDS says these improvements represent an expansion of the existing architecture, which has the capacity to be enhanced even further. (For key metrics comparing the USP1100 to the Lightning 9980V, see the sidebar.)

Total bandwidth is the sum of cached (data) bandwidth and control bandwidth. For example, the USP1100's 81GB/sec total bandwidth includes 68GB/sec dedicated to data, with the remainder--13GB/sec--dedicated to control functions. Total bandwidth ratings for the USP100 and USP600 models are 23.5GB/sec and 40.5GB/sec, respectively, which also represent substantial improvements over the Lightning 9980V. HDS says all of this adds up to approximately four times the performance of comparable Lightning models. "It's about 4X in everything," says Mikkelsen. "It's going to scale pretty high."

To provide some additional context for these figures, EMC's DMX 3000, with its Direct Matrix Architecture, boasts 128 direct data paths, 32 of which can operate concurrently at 500MB/sec each, yielding a total throughput of 16GB/sec.
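As a quick check on the arithmetic behind these figures, the short Python sketch below reproduces the quoted totals. The variable names are illustrative, the inputs are the vendor-stated numbers cited in this article rather than measurements, and it assumes 1GB = 1,000MB.

```python
# Vendor-quoted figures from the article; these are not measured values.
usp1100_data_gb_s = 68      # bandwidth dedicated to data
usp1100_control_gb_s = 13   # bandwidth dedicated to control functions
usp1100_total_gb_s = usp1100_data_gb_s + usp1100_control_gb_s

# EMC DMX 3000: 32 concurrent data paths at 500MB/sec each.
# Assumption: 1GB = 1,000MB.
dmx_concurrent_paths = 32
dmx_mb_s_per_path = 500
dmx_total_gb_s = dmx_concurrent_paths * dmx_mb_s_per_path / 1000

# Paper comparison of data bandwidth only; real workloads may differ widely.
data_ratio = usp1100_data_gb_s / dmx_total_gb_s

print(usp1100_total_gb_s)  # 81
print(dmx_total_gb_s)      # 16.0
print(data_ratio)          # 4.25
```

The ratio is strictly a spec-sheet comparison; it says nothing about how either array behaves under a production workload.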

All of the USP1100 specs add up to performance that HDS says tops out at almost 2 million IOPS. If that figure bears the scrutiny of real-world testing of production units, it would give the USP1100 a substantial advantage--at least on paper--over the competition. But array performance statistics can be misleading, and will vary significantly depending on testing environments. Testing with standard benchmarks, such as those provided by the Storage Performance Council (SPC), might yield comparable performance numbers, but some vendors, such as EMC, keep performance numbers under wraps. Still, even under controlled conditions, comparing storage arrays' performance is a dicey affair.

Two million IOPS may, indeed, push the bounds of credibility, and HDS concedes that it's a theoretical limit. But the company says it isn't alone in the industry when it comes to publishing theoretical performance numbers. "I think all the storage vendors post those," says Mikkelsen, adding that practical limitations depend on many variables in specific storage environments and a realistic limit is "all over the map."

[Sidebar: Operating systems supported by USP]

Storage pools
The virtualization part of the TagmaStore story is just as compelling, and in the future, it may be even more important than the spec sheets for the USP boxes. Any of the three models can virtualize up to 32PB of external storage. While 32PB may seem like a wildly outlandish number, HDS insists that it's within the realm of reality. In practice, addressing is likely to be the tighter constraint: The USP supports about 16,000 addresses, and "You'll run out of addressing capability before you run up against the 32PB limitation," says Mikkelsen. IBM's SAN Volume Controller (SVC) offers similar virtualization features, and currently supports storage from other vendors, something HDS says the TagmaStore boxes will do in subsequent releases. SVC running in a Cisco MDS 9000 series director has an upward limit of 2PB of storage that it can pool, but in a recent webcast, a Cisco spokesman said that the 2PB is more a theoretical, than practical, limit.
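To see why addressing is likely to bite first, consider how large the average external volume would have to be to reach 32PB with roughly 16,000 addresses. The Python sketch below works through that arithmetic. It assumes decimal units and treats "about 16,000" as exactly 16,000; the function names and the 100GB example volume size are hypothetical, chosen only for illustration.

```python
# Figures from the article; "about 16,000" is approximate, so results are rough.
MAX_EXTERNAL_BYTES = 32 * 10**15   # 32PB, assuming decimal units
ADDRESSES = 16_000                 # assumed exact for this illustration

def avg_volume_size_tb(total_bytes: int, addresses: int) -> float:
    """Average volume size (TB) needed to fill total_bytes with `addresses` volumes."""
    return total_bytes / addresses / 10**12

def reachable_pb(volume_size_gb: int, addresses: int) -> float:
    """Total capacity (PB) reachable if every address maps a volume of this size."""
    return volume_size_gb * 10**9 * addresses / 10**15

print(avg_volume_size_tb(MAX_EXTERNAL_BYTES, ADDRESSES))  # 2.0
print(reachable_pb(100, ADDRESSES))                       # 1.6
```

Under these assumptions, every volume would need to average about 2TB to hit the 32PB ceiling; with hypothetical 100GB volumes, the addresses would be exhausted at roughly 1.6PB.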

In HDS' virtualization scheme, all back-end--or virtualized--storage would be represented as USP LUNs, and all connected storage would be managed from a single HiCommand console.

While the USP systems can pick up configuration and volume information from external storage, HDS says the process will be more of a volume migration between the external storage and USP platforms.

For more efficient administration of large installations, up to 32 Virtual Private Storage Machines can be created to carve out more manageable chunks of the pooled storage to allow multiple system administrators to oversee segments of the entire pool. Overall administration can still be centralized, but each administrator of a virtual machine would be able to manage physical disk space, cache and front-end ports. This capability may be especially useful when consolidating storage among divisions or of acquired companies.

"Having external virtualization provided by a storage system allows for another tier of storage, and a system that can act as storage and a virtualization platform allows for even greater consolidation," says ESG's Asaro.

Because all of the back-end storage connects to the USP system, the latest HDS software tools would be available to manage that external storage, effectively upgrading the management capabilities of the attached boxes. All writes--whether directed at the USP storage or external storage--are cached in the USP array. Because the USP system will likely be the best performer among the pooled systems in this scenario, HDS says that write-access performance to the external storage should actually improve, especially for SATA/ATA disks, which could approach the performance of Fibre Channel (FC) drives. "If you're using [SATA drives] for lifecycle management, probably 98% of the I/O that goes to the SATA drives are writes, and writes go to the [USP] cache," says Mikkelsen. If the back-end box already uses write caching, it will work in conjunction with the USP system.

As more external storage is added to the USP's virtual pool, the USP's performance will be affected. The type of storage will have a bearing on performance as well. "If you start putting a lot of active storage out there, let's say you backend it with 9980Vs or DMXs," says Mikkelsen, "that's going to be eating into the bandwidth of the [USP]."

An upcoming enhancement to HDS' Data Retention Utility, formerly called LDEV, will make it possible to lock down retention data without having to format the disk it resides on as write once, read many (WORM). Data Retention Utility makes WORM an attribute of the data, so the retention policies set for the data will follow it wherever it's moved within the USP-managed pool of storage.

Reads to all connected storage are cached, using read-ahead technology to enhance performance throughout the pool. While initial read requests have to wait for disk access, read-ahead anticipates additional requests and stages the appropriate blocks of data in cache, where they can be accessed at memory speed. With this method, there may be a performance improvement with connected SATA/ATA disks; for FC disks, HDS expects any performance hit to be minimal. Read-ahead technology is commonly used in many vendors' storage arrays.
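The general idea of read-ahead can be shown with a toy model. The Python sketch below is not HDS' implementation: the class name, the fixed prefetch depth and the unbounded cache are all simplifying assumptions made for illustration.

```python
class ReadAheadCache:
    """Toy read-ahead cache: every read also stages the next few blocks."""

    def __init__(self, prefetch_depth: int = 4):
        self.prefetch_depth = prefetch_depth  # hypothetical tuning knob
        self.cached = set()                   # unbounded, for simplicity
        self.hits = 0                         # reads served at memory speed
        self.misses = 0                       # reads that wait for disk

    def read(self, block: int) -> None:
        if block in self.cached:
            self.hits += 1
        else:
            self.misses += 1
            self.cached.add(block)
        # Anticipate sequential access by staging the following blocks.
        for nxt in range(block + 1, block + 1 + self.prefetch_depth):
            self.cached.add(nxt)

cache = ReadAheadCache(prefetch_depth=4)
for block in range(10):       # a purely sequential read of blocks 0..9
    cache.read(block)

print(cache.misses, cache.hits)  # 1 9
```

In this sequential case only the first read waits for disk. A random access pattern would see far fewer hits, which is why the benefit depends on the workload.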

The TagmaStore arrays also add cache partitioning, which was not a feature of the Lightning line. Partitioning cache helps ensure that unruly applications don't overrun cache allocated to other users and adversely affect their performance. HDS says that cache partitioning will make it possible to guarantee quality of service throughout that cache.
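A minimal sketch of the concept, assuming a simple least-recently-used policy inside each partition, might look like the Python below. The class, the partition names and the quota sizes are hypothetical; the article doesn't describe how HDS implements partitioning internally.

```python
from collections import OrderedDict

class PartitionedCache:
    """Toy cache in which each partition has its own fixed quota of blocks."""

    def __init__(self, quotas):
        # quotas: partition name -> maximum number of cached blocks
        self.quotas = dict(quotas)
        self.parts = {name: OrderedDict() for name in self.quotas}

    def put(self, partition, block, data):
        part = self.parts[partition]
        part[block] = data
        part.move_to_end(block)            # mark as most recently used
        while len(part) > self.quotas[partition]:
            part.popitem(last=False)       # evict only within this partition

    def get(self, partition, block):
        part = self.parts[partition]
        if block in part:
            part.move_to_end(block)
            return part[block]
        return None

cache = PartitionedCache({"oltp": 2, "batch": 2})
cache.put("oltp", 1, "a")
cache.put("oltp", 2, "b")
for block in range(100):                   # an "unruly" batch job floods its partition
    cache.put("batch", block, "x")

print(cache.get("oltp", 1), cache.get("oltp", 2))  # a b
print(len(cache.parts["batch"]))                   # 2
```

However hard the batch workload pushes, it can only evict its own blocks, so the other partition's cached data survives.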

Replication enables tiering
By pooling different storage resources, USP's virtualization capabilities will help make effective use of older and lower performing storage in a tiered storage structure. Tiering the pooled storage will make it possible to implement data lifecycle management, and enhance regulatory compliance efforts. With support for a variety of open systems and mainframe operating systems, it would be possible, for instance, to give mainframe computers access to lower cost SATA arrays.

United's Pilafas is paying particular attention to TagmaStore's virtualization and replication functionality to enhance his company's storage tiering. "Information lifecycle management is something we need and would like to do," says Pilafas. "It's something we'd like to do based on the capabilities of the new box."

The key to data movement in the USP virtualization architecture is HDS' new Universal Replication Engine, which will be in place with the rollout of additional software components, planned for December of this year. This will provide a single replication application that works with the USP system and across all connected storage. Data can be moved internally within a USP box, from a USP system's internal storage to an external storage device or among external devices. And again, because replication is controlled at the USP level, there's a single interface to manage data movement among disparate devices.

According to HDS, the replication engine offloads the burden of typically resource-intensive asynchronous replication to lessen the impact on hosts and network resources. HDS cites some specific techniques it employs to reduce the performance hit.

Replication services on the USP systems are journal-based. The journal resides on disk, rather than in cache, using a two-level striped architecture that HDS says will provide performance that's nearly as fast as cache. The journal helps avoid overtaxing primary storage and other network components and provides assurance that updates will be properly handled. For example, if bandwidth is exceeded during the replication process, the updates will continue to be written to the journal until enough bandwidth is once again available. HDS claims that with the journaling method, the replication engine can withstand long link outages, and that it's more effective than the more common approach of using a bitmap to keep a record of the data tracks that need to be updated.
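The difference between the two bookkeeping approaches can be illustrated with a toy Python model. This is a conceptual sketch only, not a description of HDS' on-disk journal format; the track numbers and payload labels are made up.

```python
# Toy model only: tracks are integers and payloads are labels.
writes = [(7, "A1"), (3, "B1"), (7, "A2")]  # (track, payload) in arrival order

journal = []          # ordered log of every update
dirty_tracks = set()  # bitmap stand-in: one flag per changed track

for track, payload in writes:
    journal.append((track, payload))
    dirty_tracks.add(track)

# After the link returns, a journal can be replayed in the original write order...
replayed = list(journal)
# ...while a bitmap only says which tracks to copy again, not the order in
# which they changed or how many times.
tracks_to_resync = sorted(dirty_tracks)

print(replayed)          # [(7, 'A1'), (3, 'B1'), (7, 'A2')]
print(tracks_to_resync)  # [3, 7]
```

The journal retains both updates to track 7 and their sequence relative to track 3, whereas the bitmap records only that tracks 3 and 7 need to be copied again.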

Most replication products use a "push" technology, which puts the burden of the replication process on the primary array. The Universal Replication Engine, like IBM's XRC, uses pull technology to offload that burden. With the journal residing on the primary array and reader tasks on the secondary storage handling the changed data, performance degradation on the primary is substantially reduced. Bandwidth, too, is conserved. Typically, bandwidth for replication is allocated based on peak I/O activity. But because the journal will absorb the peaks, HDS suggests that required bandwidth can be reduced by as much as half.
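A rough way to picture the bandwidth claim is to simulate a bursty write workload draining through a link sized below the peak. In the Python sketch below, the workload numbers, the interval model and the function name are invented for illustration; actual savings would depend on how bursty the real workload is.

```python
def simulate_journal(writes_mb, link_mb_per_interval):
    """Return (peak_backlog_mb, final_backlog_mb) for a journal draining over a link."""
    backlog = 0
    peak_backlog = 0
    for written in writes_mb:
        # New writes land in the journal; the link drains what it can each interval.
        backlog = max(0, backlog + written - link_mb_per_interval)
        peak_backlog = max(peak_backlog, backlog)
    return peak_backlog, backlog

# Hypothetical workload: mostly 20MB per interval, with two 80MB bursts.
workload = [20, 20, 80, 20, 20, 80, 20, 20]
peak = max(workload)

print(simulate_journal(workload, peak))       # (0, 0)
print(simulate_journal(workload, peak // 2))  # (40, 0)
```

With the link sized for the 80MB peak, nothing ever queues. At half that size, the journal briefly holds 40MB after each burst but still drains completely, because the link exceeds the workload's 35MB average.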

The same journaling process can help diminish geographic issues for long-distance replication from one USP box to another, which would make it effective for disaster recovery. Using USP systems at separate sites makes it possible to provide disaster recovery for multiple data centers.

New model for enterprise storage
The USP line offers an easier upgrade path than previous HDS storage products, HDS claims. Users may opt to start at the low end of the line with a minimally configured, single-cabinet USP100 and then upgrade nondisruptively all the way up to the flagship four-cabinet USP1100 model. The model numbers have little bearing on configurations, and disks, cache, ports and so forth can be added without taking the system offline. But whatever USP configuration is installed, the virtualized and single-pane-of-glass management capabilities are essentially identical.

Pricing varies considerably, given the broad range of possible configurations. HDS quoted sample prices ranging from about $700,000 for a 6TB USP to $10 million for a fully loaded 330TB USP1100. HDS will also continue to sell and support its Lightning line of arrays.

Putting virtualization in primary storage is a bold move that may change the way enterprise-class storage is managed. Adding new, high-capacity storage to satisfy the high availability, high-performance requirements of enterprise applications and simultaneously being able to manage existing storage more effectively is no small matter.

Of course, some might say that casting your lot with USP's virtualization will lock your company into HDS. To a degree, that's true, but "buying any storage system today locks you in," says ESG's Asaro. You will be locked into HDS' virtualization solution--but not to a greater extent than with any other virtualization product. If HDS can deliver support for a wide enough assortment of heterogeneous storage products, USP's virtualization capabilities may, in fact, help avoid getting locked into any particular vendor's product line. With USP in place, you could buy and install whatever vendor's box suits current needs best, and let the USP system manage it.

HDS has wrapped an impressive set of features into the USP line and, in doing so, created a level of expectation that it will have to live up to. But Asaro underscores the importance of the TagmaStore line, calling it a "next-generation solution" with "overall capabilities [that] took me by surprise."
