All are built around the U-Star architecture--previously known as Hi-Star--the key to HDS' claims of vastly improved performance. Like earlier incarnations of HDS' crossbar switch architecture, the U-Star architecture connects front-end ports to back-end controllers and disks to cache. This iteration includes upgraded processors, and features twice the number of data paths compared to the current Hi-Star implementation in HDS' Lightning systems; U-Star can also handle four times the number of concurrent cache operations. HDS says these improvements represent an expansion of the existing architecture, which has the capacity to be enhanced even further. (For some of the key metrics comparing the performance of the USP1100 to the Lightning 9980, see sidebar.)
The total bandwidth refers to both the cached bandwidth and control bandwidth. For example, the USP1100's 81GB/sec total bandwidth includes 68GB/sec dedicated to data, with the remainder--13GB/sec--dedicated to control functions. Total bandwidth ratings for the USP100 and USP600 models are 23.5GB/sec and 40.5GB/sec, respectively, which also represent substantial improvements over the Lightning 9980V. HDS says all of this adds up to approximately four times the performance of comparable Lightning models. "It's about 4X in everything," says Mikkelsen. "It's going to scale pretty high."
To provide some additional context for these figures, EMC's DMX 3000, with its Direct Matrix Architecture, boasts 128 direct data paths, 32 of which can operate concurrently at 500MB/sec each, yielding a total throughput of 16GB/sec.
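The arithmetic behind those two headline figures is easy to check. The short Python sketch below simply reproduces the sums from the numbers quoted above; the function names are illustrative, and it assumes 1GB = 1,000MB. Keep in mind that the two vendors' figures aren't necessarily measured the same way, so this is a sanity check on the math, not a like-for-like benchmark.

```python
def total_bandwidth_gbps(data_gbps: float, control_gbps: float) -> float:
    """Total bandwidth as the sum of data (cache) and control bandwidth."""
    return data_gbps + control_gbps


def aggregate_throughput_gbps(concurrent_paths: int, mb_per_sec_per_path: float) -> float:
    """Aggregate throughput of concurrent paths, assuming 1GB = 1,000MB."""
    return concurrent_paths * mb_per_sec_per_path / 1000


# Figures as quoted in the article.
usp1100_total = total_bandwidth_gbps(data_gbps=68, control_gbps=13)
dmx3000_total = aggregate_throughput_gbps(concurrent_paths=32, mb_per_sec_per_path=500)

print(f"USP1100 total bandwidth: {usp1100_total}GB/sec")      # 81
print(f"DMX 3000 total throughput: {dmx3000_total}GB/sec")    # 16.0
```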
All of the USP1100 specs add up to performance that HDS says tops out at almost 2 million IOPS. If that figure bears the scrutiny of real-world testing of production units, it would give the USP1100 a substantial advantage--at least on paper--over the competition. But array performance statistics can be misleading, and will vary significantly, depending on testing environments. Testing with standard benchmarks, such as those provided by the Storage Performance Council (SPC), might yield comparable performance numbers, but some vendors, such as EMC, keep performance numbers under wraps. Still, even under controlled conditions, comparing storage arrays' performance is a dicey affair.
Two million IOPS may, indeed, push the bounds of credibility, and HDS concedes that it's a theoretical limit. But the company says it isn't alone in the industry when it comes to publishing theoretical performance numbers. "I think all the storage vendors post those," says Mikkelsen, adding that practical limitations depend on many variables in specific storage environments and a realistic limit is "all over the map."
Storage pools
The virtualization part of the TagmaStore story is just as compelling, and in the future, it may be even more important than the spec sheets for the USP boxes. Any of the three models can virtualize up to 32PB of external storage. While 32PB may seem like a wildly outlandish number, HDS insists that it's within the realm of reality. But with support for about 16,000 addresses, Mikkelsen says, "You'll run out of addressing capability before you run up against the 32PB limitation," as 32PB is a lot of storage to virtualize. IBM's SAN Volume Controller (SVC) offers similar virtualization features, and currently supports storage from other vendors, something HDS says the TagmaStore boxes will do in subsequent releases. SVC running in a Cisco MDS 9000 series director has an upward limit of 2PB of storage that it can pool, but in a recent webcast, a Cisco spokesman said that the 2PB is more a theoretical, than practical, limit.
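Mikkelsen's point about addressing can be put in rough numbers. The Python sketch below assumes, for illustration only, that each of the roughly 16,000 addresses maps to a single external volume and that capacities are in decimal units; neither assumption comes from HDS. The 0.5TB average volume size in the second calculation is likewise hypothetical.

```python
ADDRESSES = 16_000      # "about 16,000 addresses", per the article
POOL_LIMIT_PB = 32      # maximum external storage, per the article
TB_PER_PB = 1_000       # assumption: decimal units


def average_volume_tb_to_fill(pool_pb: float, addresses: int) -> float:
    """Average volume size (TB) needed for every address to fill the pool."""
    return pool_pb * TB_PER_PB / addresses


def reachable_capacity_pb(addresses: int, avg_volume_tb: float) -> float:
    """Capacity (PB) reachable when every address holds one average volume."""
    return addresses * avg_volume_tb / TB_PER_PB


needed_tb = average_volume_tb_to_fill(POOL_LIMIT_PB, ADDRESSES)
reachable_pb = reachable_capacity_pb(ADDRESSES, avg_volume_tb=0.5)  # hypothetical size

print(f"Average volume needed to reach {POOL_LIMIT_PB}PB: {needed_tb}TB")
print(f"Capacity reachable with 0.5TB volumes: {reachable_pb}PB")
```

Under these assumptions, every volume would have to average 2TB to hit the 32PB ceiling; with smaller volumes, the address space runs out first.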
In HDS' virtualization scheme, all back-end--or virtualized--storage would be represented as USP LUNs, and all connected storage would be managed from a single HiCommand console.
While the USP systems can pick up configuration and volume information from external storage, HDS says the process will be more of a volume migration between the external storage and USP platforms.
For more efficient administration of large installations, up to 32 Virtual Private Storage Machines can be created to carve out more manageable chunks of the pooled storage to allow multiple system administrators to oversee segments of the entire pool. Overall administration can still be centralized, but each administrator of a virtual machine would be able to manage physical disk space, cache and front-end ports. This capability may be especially useful when consolidating storage among divisions or of acquired companies.
"Having external virtualization provided by a storage system allows for another tier of storage, and a system that can act as storage and a virtualization platform allows for even greater consolidation," says ESG's Asaro.
Because all of the back-end storage connects to the USP system, the latest HDS software tools would be available to manage that external storage, effectively upgrading the management capabilities of the attached boxes. All writes--whether directed at the USP storage or external storage--are cached in the USP array. Because the USP system will likely be the best performer among the pooled systems in this scenario, HDS says that write-access performance to the external storage should actually improve, especially for SATA/ATA disks, which could approach the performance of Fibre Channel (FC) drives. "If you're using [SATA drives] for lifecycle management, probably 98% of the I/O that goes to the SATA drives are writes, and writes go to the [USP] cache," says Mikkelsen. If the back-end box already uses write caching, it will work in conjunction with the USP system.
As more external storage is added to the USP's virtual pool, the USP's performance will be affected. The type of storage will have a bearing on performance as well. "If you start putting a lot of active storage out there, let's say you backend it with 9980Vs or DMXs," says Mikkelsen, "that's going to be eating into the bandwidth of the [USP]."
An upcoming enhancement to HDS' Data Retention Utility, formerly called LDEV, will make it possible to lock down retention data without having to format the disk it resides on as write once, read many (WORM). Data Retention Utility makes WORM an attribute of the data, so the retention policies set for the data will follow it wherever it's moved within the USP-managed pool of storage.
Reads to all connected storage are cached, using read-ahead technology to enhance performance throughout the pool. While initial read requests have to wait for disk access, read-ahead anticipates additional requests and fetches the appropriate blocks of data, which can then be accessed at memory speed. With this method, there may be a performance improvement with connected SATA/ATA disks; for FC disks, HDS expects any performance hit to be minimal. Read-ahead technology is commonly used in many vendors' storage arrays.
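For readers unfamiliar with the technique, the following Python sketch shows the general idea of read-ahead caching. It isn't HDS' implementation: the `ReadAheadCache` class, its fixed prefetch window and its unbounded cache are simplifying assumptions made for illustration.

```python
class ReadAheadCache:
    """Toy read-ahead cache: a miss fetches the block plus the next few."""

    def __init__(self, backend, read_ahead=4):
        self.backend = backend        # a list standing in for disk blocks
        self.read_ahead = read_ahead  # extra blocks fetched on each miss
        self.cache = {}               # unbounded here; real caches evict
        self.disk_reads = 0
        self.cache_hits = 0

    def read(self, block):
        if block in self.cache:
            self.cache_hits += 1      # served at memory speed
            return self.cache[block]
        # Miss: one disk access brings in the requested block and its successors.
        self.disk_reads += 1
        end = min(block + 1 + self.read_ahead, len(self.backend))
        for i in range(block, end):
            self.cache[i] = self.backend[i]
        return self.cache[block]


disk = [f"data-{i}" for i in range(10)]
cache = ReadAheadCache(disk, read_ahead=4)
results = [cache.read(i) for i in range(10)]   # a sequential scan

print(f"disk reads: {cache.disk_reads}, cache hits: {cache.cache_hits}")
```

In this sequential scan of 10 blocks, only two requests wait for the disk; the other eight are satisfied from cache. Random access patterns would benefit far less.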
The TagmaStore arrays also add cache partitioning, which was not a feature of the Lightning line. Partitioning cache helps ensure that unruly applications don't overrun cache allocated to other users and adversely affect their performance. HDS says that cache partitioning will make it possible to guarantee quality of service throughout that cache.
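The principle behind cache partitioning can be illustrated with a few lines of Python. This is a generic sketch, not a description of how TagmaStore does it: the `PartitionedCache` class, the entry-count quotas and the least-recently-used eviction policy are all assumptions chosen to keep the example small.

```python
from collections import OrderedDict


class PartitionedCache:
    """Toy cache in which each partition has a fixed quota and its own LRU list."""

    def __init__(self, quotas):
        self._quotas = dict(quotas)   # partition name -> maximum entries
        self._partitions = {name: OrderedDict() for name in quotas}

    def put(self, partition, key, value):
        entries = self._partitions[partition]
        if key in entries:
            entries.move_to_end(key)  # refresh recency on overwrite
        entries[key] = value
        # Evict only from the writer's own partition, never from a neighbor's.
        while len(entries) > self._quotas[partition]:
            entries.popitem(last=False)

    def get(self, partition, key):
        entries = self._partitions[partition]
        if key not in entries:
            return None
        entries.move_to_end(key)
        return entries[key]

    def usage(self, partition):
        return len(self._partitions[partition])


cache = PartitionedCache({"oltp": 2, "batch": 3})
cache.put("oltp", "a", 1)
cache.put("oltp", "b", 2)
for i in range(100):                  # an "unruly" workload floods its partition
    cache.put("batch", i, i)

print(cache.get("oltp", "a"), cache.get("oltp", "b"), cache.usage("batch"))
```

However hard the batch workload pushes, it can evict only its own entries, so the cache set aside for the other application stays intact.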
Replication enables tiering
By pooling different storage resources, USP's virtualization capabilities will help make effective use of older and lower performing storage in a tiered storage structure. Tiering the pooled storage will make it possible to implement data lifecycle management, and enhance regulatory compliance efforts. With support for a variety of open systems and mainframe operating systems, it would be possible, for instance, to give mainframe computers access to lower cost SATA arrays.
United's Pilafas is paying particular attention to TagmaStore's virtualization and replication functionality to enhance his company's storage tiering. "Information lifecycle management is something we need and would like to do," says Pilafas. "It's something we'd like to do based on the capabilities of the new box."
The key to data movement in the USP virtualization architecture is HDS' new Universal Replication Engine, which will be in place with the rollout of additional software components, planned for December of this year. This will provide a single replication application that works with the USP system and across all connected storage. Data can be moved internally within a USP box, from a USP system's internal storage to an external storage device or among external devices. And again, because replication is controlled at the USP level, there's a single interface to manage data movement among disparate devices.
According to HDS, the replication engine offloads the burden of typically resource-intensive asynchronous replication to lessen the impact on hosts and network resources. HDS cites some specific techniques it employs for reducing the performance hit.
Replication services on the USP systems are journal-based. The journal resides on disk, rather than in cache, using a two-level striped architecture that HDS says will provide performance that's nearly as fast as cache. The journal helps avoid overtaxing primary storage and other network components and provides assurance that updates will be properly handled. For example, if bandwidth is exceeded during the replication process, the updates will continue to be written to the journal until enough bandwidth is once again available. HDS claims that with the journaling method, the replication engine can withstand long link outages, and it's more effective than the more common approach of using a bitmap to keep a record of the data tracks that need to be updated.
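A simplified model shows how a journal rides out a bandwidth shortfall or link outage. The Python sketch below is an illustration under stated assumptions, not HDS' design: the `ReplicationJournal` class, the in-memory queue (the real journal is on disk) and the per-tick workload and link capacities are all hypothetical.

```python
from collections import deque


class ReplicationJournal:
    """Toy journal: updates are sequenced on arrival and shipped in order."""

    def __init__(self):
        self._entries = deque()
        self._next_seq = 0

    def record(self, data):
        """Append an update and return its sequence number."""
        seq = self._next_seq
        self._entries.append((seq, data))
        self._next_seq += 1
        return seq

    def drain(self, max_entries):
        """Hand over up to max_entries of the oldest updates, oldest first."""
        sent = []
        while self._entries and len(sent) < max_entries:
            sent.append(self._entries.popleft())
        return sent

    def backlog(self):
        return len(self._entries)


def simulate(writes_per_tick, link_capacity_per_tick):
    """Replay a workload against a link whose capacity varies per tick."""
    journal = ReplicationJournal()
    replica, backlog_history = [], []
    for writes, capacity in zip(writes_per_tick, link_capacity_per_tick):
        for _ in range(writes):
            journal.record(b"update")
        replica.extend(journal.drain(capacity))
        backlog_history.append(journal.backlog())
    return replica, backlog_history


# Hypothetical workload: a burst in tick 2 and a link outage (capacity 0) in tick 3.
replica, backlog = simulate(
    writes_per_tick=[3, 8, 2, 0, 1],
    link_capacity_per_tick=[4, 4, 0, 4, 4],
)
print("backlog per tick:", backlog)
```

The backlog grows during the burst and the outage, then drains once capacity returns, and every update reaches the replica in its original order. A bitmap, by contrast, records only which tracks changed, not the sequence of changes.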
Most replication products use a "push" technology, which puts the burden of the replication process on the primary array. The Universal Replication Engine, like IBM's XRC, uses pull technology to offload that burden. With the journal residing on the primary array and reader tasks on the secondary storage handling the changed data, performance degradation on the primary is substantially reduced. Bandwidth, too, is conserved. Typically, bandwidth for replication is allocated based on peak I/O activity. But because the journal will absorb the peaks, HDS suggests that required bandwidth can be reduced by as much as half.
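The trade-off between link bandwidth and journal space can be sketched with a little arithmetic. The Python example below uses a made-up workload (MB written per interval) chosen purely for illustration; it doesn't validate HDS' "as much as half" figure, which will depend on how bursty a given environment actually is.

```python
def peak_provisioned_rate(writes_per_interval):
    """Link rate needed if bandwidth is sized for peak write activity."""
    return max(writes_per_interval)


def max_backlog(writes_per_interval, link_rate):
    """Largest amount of data the journal must hold at a given link rate."""
    backlog = 0
    worst = 0
    for written in writes_per_interval:
        backlog = max(0, backlog + written - link_rate)
        worst = max(worst, backlog)
    return worst


workload = [20, 20, 100, 100, 20, 20, 20, 20]   # hypothetical MB per interval

peak_rate = peak_provisioned_rate(workload)
half_rate = peak_rate // 2
journal_space_needed = max_backlog(workload, half_rate)

print(f"Sized for peak: {peak_rate}MB/interval, journal holds "
      f"{max_backlog(workload, peak_rate)}MB at worst")
print(f"Sized at half:  {half_rate}MB/interval, journal holds "
      f"{journal_space_needed}MB at worst")
```

For this particular workload, halving the link rate means the journal must buffer up to 100MB during the burst, which then drains during the quiet intervals that follow. A workload with longer sustained peaks would need more journal space, or more bandwidth.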
The same journaling process can help diminish geographic issues for long-distance replication from one USP box to another, which would make it effective for disaster recovery. Using USP systems at separate sites makes it possible to provide disaster recovery for multiple data centers.
New model for enterprise storage
The USP line offers an easier upgrade path than previous HDS storage products, HDS claims. Users may opt to start at the low end of the line with a minimally configured, single-cabinet USP100 and then upgrade nondisruptively all the way up to the flagship four-cabinet USP1100 model. The model numbers have little bearing on configurations, and disks, cache, ports and so forth can be added without taking the system offline. But whatever USP configuration is installed, the virtualization and single-pane-of-glass management capabilities are essentially identical.
Pricing varies considerably, given the broad range of possible configurations. HDS quoted sample prices ranging from about $700,000 for a 6TB USP up to $10 million for a fully loaded 330TB USP1100. HDS will also continue to sell and support its Lightning line of arrays.
Putting virtualization in primary storage is a bold move that may change the way enterprise-class storage is managed. Adding new, high-capacity storage to satisfy the high availability, high-performance requirements of enterprise applications and simultaneously being able to manage existing storage more effectively is no small matter.
Of course, some might say that casting your lot with USP's virtualization will lock your company into HDS. To a degree, that's true, but "buying any storage system today locks you in," says ESG's Asaro. You will be locked into HDS' virtualization solution--but not to a greater extent than with any other virtualization product. If HDS can deliver support for a wide enough assortment of heterogeneous storage products, USP's virtualization capabilities may, in fact, help avoid getting locked into any particular vendor's product line. With USP in place, you could buy and install whatever vendor's box suits current needs best, and let the USP system manage it.
HDS has wrapped an impressive set of features into the USP line and, in doing so, created a level of expectation that it will have to live up to. But Asaro underscores the importance of the TagmaStore line, calling it a "next-generation solution" with "overall capabilities [that] took me by surprise."