Modular storage systems are the antidote to high-priced monolithic arrays. In this Special Report you'll see what makes midrange so hot, where it's heading and what systems are leading the charge.
|Use caution when mixing Fibre Channel and SATA Drives|
Midrange arrays are changing the way companies manage their data and business. Gone are the days when pricey monolithic arrays ruled the data center. Cost is key. Companies no longer need to spend millions on a monolithic array for storage applications that a midrange array can handle for substantially less money.
As this Special Report reveals, midrange arrays offer a wide choice of features that can fulfill almost every storage need. Because the choices are so varied in the midrange array category, some homework is required to find the right array at the right price for specific storage requirements. Applications will dictate your requirements, but a midrange array can offer the best fit whether an application requires low-cost, high-capacity disk, the ability to mix and match different kinds of disk and RAID within an array, replication between monolithic and midrange arrays, or the ability to virtualize other storage arrays.
For starters, determine which midrange array hardware and software features best serve the company's storage requirements. These features include:
- Type of disk drives
- Type and number of RAID controllers
- Amount of cache
- Number and type of front-end ports
- Software such as array management, volume management, snapshot and mirroring
Disk drive support
A compelling force behind the deployment of midrange arrays is their ability to support disk drives of almost any type, size or number. The lowest-priced configurations are approximately $5,000 per terabyte and show up in midrange arrays, such as EqualLogic Inc.'s PS200E and Isilon Systems Inc.'s IQ 2250, that support only SATA drives. Products like 3PAR Inc.'s InServ S400, IBM's TotalStorage DS6800 and Sun Microsystems Inc.'s StorEdge 6920 support only high-performance Fibre Channel (FC) drives, and are priced at around $50,000 per terabyte. Still other companies, like Hitachi Data Systems (HDS) Corp., offer arrays that support either SATA or FC drives.
The trend, however, is for midrange arrays to support both high-performance FC drives and high-capacity SATA drives behind the same front-end host interface. Vendors such as EMC Corp. and HDS--which initially supported only FC disk drives in their Clariion and Thunder series arrays, respectively--now include SATA support within these arrays and allow users to mix FC and SATA disk drives on the same system. 3PAR, IBM and Sun, which don't currently support SATA drives within their midrange systems, plan to add that support this year.
The ability to mix high-performance FC and high-capacity SATA drives within a midrange array gives administrators the flexibility to put the right data on the right kind of disk. For example, FC disk drives can be designated for high-performance database or file system apps, while the high-capacity, lower-cost SATA disk drives are used for disk-to-disk backup, snapshots, virtual tape libraries, e-mail archives and fixed content.
Because different capacities and speeds exist for high-performance FC disk drives, users need to weigh several factors to come up with their best choice. IBM says that as a general rule, the smaller and faster the disk drive, the better the performance. HDS finds that 15,000 rpm FC disks will provide up to a 15% increase in performance over 10,000 rpm FC disks in random read environments.
One way to improve performance on slower disks is to distribute the data volumes across multiple disks. Chris Berthaut, the open-systems storage team manager with Hibernia National Bank in New Orleans, says his team uses this feature on the nine Xiotech Corp. Magnitude Classics they manage. "The virtualization [feature] was a big factor in the decision to buy Xiotech arrays since it allowed us to easily stripe data across disks on their arrays," says Berthaut.
A final area that may be overlooked is how the array recovers from failed disk drives and how easy it is to replace the faulty drives. Many midrange arrays, including HDS' Thunder 9520V, have a "call home" feature that reports a disk drive failure or a drive that's on the verge of failing. HDS reports that approximately 95% of the disk drives replaced on HDS' Thunder arrays "soft fail"--the disk exceeds an error threshold and is swapped out before the disk physically fails. Thunder copies the data from the poorly performing disk to a spare in the system. This approach minimizes performance degradation and speeds up recovery time since a copy operation runs faster than a RAID 5 rebuild.
|Monolithic vs. midrange: What's really different?|
The type of RAID controller determines major array functions, including:
- Disk drive management
- RAID levels
Storage Technology Corp. (StorageTek), for example, uses different controllers on various models depending on application requirements. If performance isn't a major concern, users should consider StorageTek's FlexLine FLA200 model, which uses Fibre Channel arbitrated loop (FC-AL) controllers to connect to the disk drives. Conversely, if performance is the primary driver, the FlexLine FLA300 model enables a point-to-point or switched connection to back-end disk using a switched bunch of disks (SBOD) architecture.
The SBOD architecture also provides users with benefits beyond enhanced performance. The SBOD architecture in Hewlett-Packard (HP) Co.'s EVA5000 allows both of its controllers to connect to both ports on all of the disk drives for improved redundancy. It also provides fan-out and isolation between the controllers and disk drives, which makes fault isolation, repair and expansion easier. IBM's TotalStorage DS6800 takes SBOD even further, providing four data paths to every disk drive. The point-to-point connection also enables the array to identify when an individual disk drive starts to fail, something more difficult to do in an FC-AL implementation, and the failure of one RAID controller doesn't affect server and data availability.
The RAID controller also determines the RAID levels the array will support. With nearly every array on the market supporting RAID 1 and RAID 5 configurations, the importance of this feature comes into play for shops that need a specific RAID level to support a particular application. For instance, using RAID 10 in conjunction with a high-performance database should further enhance performance. Similarly, using either RAID 10 or 50 with SATA disk drives will improve performance and provide a higher level of protection in the event of a disk failure even though these configurations impose a large capacity usage penalty.
Vendors like Nexsan Technologies and Xiotech Corp., which support a large number of SATA configurations, are looking forward to the formal introduction of RAID 6 later this year. RAID 6 resembles RAID 5, but it uses two disks for parity. This new RAID configuration suits SATA disk drives particularly well because it allows two disks to fail without any data loss and incurs less of a capacity penalty than a mirrored disk configuration; it also provides a higher level of protection than RAID 5.
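The capacity trade-off among these RAID levels comes down to simple arithmetic. The sketch below assumes a single RAID group of identical disks and ignores spares and formatting overhead; the function name and the eight-disk, 400GB example are illustrative and not taken from any vendor's tool.

```python
def usable_gb(level: str, disks: int, disk_gb: int) -> int:
    """Usable capacity of one RAID group built from identical disks.

    Simplified model (an assumption of this sketch): no hot spares,
    no formatting overhead, one RAID group only.
    """
    if level == "raid10":
        if disks < 2 or disks % 2:
            raise ValueError("RAID 10 needs an even number of disks")
        data_disks = disks // 2      # every disk has a mirror partner
    elif level == "raid5":
        if disks < 3:
            raise ValueError("RAID 5 needs at least three disks")
        data_disks = disks - 1       # one disk's worth of parity
    elif level == "raid6":
        if disks < 4:
            raise ValueError("RAID 6 needs at least four disks")
        data_disks = disks - 2       # two disks' worth of parity
    else:
        raise ValueError(f"unknown RAID level: {level}")
    return data_disks * disk_gb


if __name__ == "__main__":
    for level in ("raid10", "raid6", "raid5"):
        print(level, usable_gb(level, disks=8, disk_gb=400), "GB usable")
```

With eight 400GB drives, this model leaves 1,600GB usable under RAID 10, 2,400GB under RAID 6 and 2,800GB under RAID 5, which is why RAID 6 sits between mirroring and RAID 5 on capacity cost while tolerating two disk failures.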
Cache and ports
There's a significant variance in the amount of cache in midrange arrays vs. their monolithic counterparts. While cache support varies from no cache on Xiotech Magnitude 3D systems to 80GB on a fully configured 3PAR InServ S800 Storage Server, the average cache amount on midrange arrays is 8GB vs. 64GB or greater on monolithic arrays.
Midrange arrays need less cache for two reasons. First, the I/O of apps running on Unix and Windows OSes tends to be more random than sequential, which generates more queries to disk. As a result, installing more cache in the midrange system generates only a marginal performance increase because the queries still need to go directly to disk.
The second reason for the reduction in cache is that the I/O block sizes generated by Unix and Windows applications tend to be either 4KB or 8KB. Unlike some monolithic arrays that carve their cache into 32KB blocks, midrange arrays break their cache into either 4KB or 8KB blocks. This allows the smaller caches on midrange arrays to work as efficiently as the larger caches on monolithic arrays because each cache block on the midrange array is fully used.
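A rough calculation shows why cache block size matters as much as cache size. The sketch below assumes a deliberately simple model in which every I/O occupies whole cache blocks and nothing else shares them; real arrays manage cache in more sophisticated ways, and the function names are hypothetical.

```python
KB = 1024
GB = 1024 ** 3


def blocks_per_io(block_bytes: int, io_bytes: int) -> int:
    """Whole cache blocks consumed by one I/O (ceiling division)."""
    return -(-io_bytes // block_bytes)


def cached_ios(cache_bytes: int, block_bytes: int, io_bytes: int) -> int:
    """How many I/Os fit in cache under the one-I/O-per-block model."""
    return (cache_bytes // block_bytes) // blocks_per_io(block_bytes, io_bytes)


def block_utilization(block_bytes: int, io_bytes: int) -> float:
    """Fraction of the allocated cache blocks that holds real data."""
    return io_bytes / (blocks_per_io(block_bytes, io_bytes) * block_bytes)


if __name__ == "__main__":
    # 8KB application I/O against the average cache sizes quoted above
    print(cached_ios(8 * GB, 8 * KB, 8 * KB))     # midrange, 8KB blocks
    print(cached_ios(64 * GB, 32 * KB, 8 * KB))   # monolithic, 32KB blocks
    print(block_utilization(32 * KB, 8 * KB))     # share of each 32KB block used
```

Under these assumptions, an 8GB cache in 8KB blocks holds 1,048,576 8KB I/Os, while a 64GB cache in 32KB blocks holds 2,097,152: eight times the cache buys only twice the cached I/Os, because each 32KB block is just 25% full.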
The number of front-end FC connections supported by midrange arrays ranges from one to eight, although the majority of array vendors say four ports are sufficient for most applications. Assuming a 2Gb/sec FC connection, throughput only becomes an issue for the most performance-intensive apps or when a large number of servers (more than 10) access the same port on the array.
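The fan-in rule of thumb can be checked with a quick calculation. This sketch assumes a 2Gb/sec FC link delivers roughly 200MB/sec of usable payload (the commonly quoted figure) and that all servers on the port are busy at the same moment, which is a worst case; the names are illustrative.

```python
FC_2G_MB_PER_SEC = 200.0  # assumed usable payload rate of one 2Gb/sec FC link


def per_server_mb_per_sec(active_servers: int,
                          port_mb_per_sec: float = FC_2G_MB_PER_SEC) -> float:
    """Even share of one array port when every attached server is active."""
    if active_servers < 1:
        raise ValueError("need at least one server")
    return port_mb_per_sec / active_servers


if __name__ == "__main__":
    for servers in (1, 4, 10, 20):
        share = per_server_mb_per_sec(servers)
        print(f"{servers:>2} servers -> {share:.0f} MB/sec each")
```

At 10 servers per port, each server's worst-case share drops to about 20MB/sec, which is ample for most applications but can squeeze a streaming backup or a busy database.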
While having the option to mix and match disk types on the same array sounds appealing, storage admins need to be aware of some of the downsides of this approach. For instance, a batch job that archives old e-mails from FC to SATA disks may start at the same time that a highly visible production OLTP database needs to execute reads and writes to the disk. With the data potentially spread across multiple disks on different controllers and the applications sharing the storage processor and cache, both jobs contend for the same resources and the production OLTP application slows down. It's also important to keep an eye on which servers are using which array ports, so backup jobs running at the same time don't overwhelm the same port with too much traffic.
Sun Microsystems Inc.'s StorEdge 6920 and other midrange arrays address these issues through logical partitioning (LPAR). LPAR lets storage admins carve up the storage array's memory and processing power and then assign it to specific servers. This way, even if an e-mail archiving batch job kicks off in the middle of the day, it can use only the memory and processing power allocated to it.
Hibernia National Bank minimizes contention issues by deploying different arrays. The bank's Windows/Novell group uses only Xiotech arrays for its file and print services, while the Unix group uses an IBM FAStT 900 (now the DS4500) that it finds is better suited for its applications' performance requirements. This approach also helped to isolate technical problems and alleviate political problems.
|IBM's "more choice" marketing plan|
Sophisticated volume management software on midrange arrays is approaching the functionality found with monolithic arrays. In addition, every midrange array comes with software that lets administrators monitor, analyze, manage or tune the performance of the array with varying levels of granularity. For example, the base module of StorageTek's SANtricity software suite lets users:
- Update controller firmware non-disruptively
- Migrate RAID levels dynamically
- Add and configure new drive modules
- Manage a system with mixed FC and SATA disks
- Monitor and tune performance
It's important to determine how many arrays a particular vendor's software will manage. Not having to learn how to use different management programs for all the different arrays in the data center saves considerable time. HDS, for example, extends the same level of software support it offers on its arrays to other vendors' midrange arrays by licensing and rebranding AppIQ Inc.'s StorageAuthority Suite as its HiCommand Storage Services Manager.
For any midrange array that will present virtual volumes or logical unit numbers (LUNs) to multiple servers on the same front-end FC port, volume management software is a must. While every array vendor offers this functionality in some capacity, there are differing degrees of management flexibility. The following volume management features should be considered essential:
- Virtualization--the ability to create virtual volumes out of the raw disk.
- Volume creation--control of how the virtual volumes are created, including their initial size and which raw disks are used in their creation.
- LUN security--the ability to control which servers can access a specific virtual volume.
- Volume groups--the ability to take existing virtual volumes, group them logically as one entity and present this new logical entity to the host as one large, logical virtual volume.
- Dynamic volume growth--the ability to dynamically increase the size of an existing virtual volume, whether it's a single virtual volume or a volume group.
The ability to group two or more volume groups and present them as one large volume makes the most sense for offloading volume management from the server to the array. Offloading the volume management to the array lets storage admins working with heterogeneous server OS environments learn only one volume management interface. It can also eliminate the need to buy third-party, server-level volume management software. Nearly every midrange array can accomplish the offloading of this task, but each array uses different methods; users need to be cautious about changing existing volume configurations.
For example, EMC's Navisphere Management Suite allows a user to create volume groups, or what EMC calls metaLUNs, on its Clariion array. These metaLUNs can be created in either a striped or concatenated format from existing LUNs. Each option presents benefits and drawbacks. The striped format provides better performance because data is striped across all of the LUNs in the metaLUN, although all of the LUNs in a striped metaLUN must be of the same size, RAID level and disk type. A concatenated metaLUN must also be built from LUNs of the same disk type (FC or SATA) and RAID level, but concatenation allows individual LUNs of different sizes to be joined.
In addition, the maximum size of a metaLUN is restricted by the size of the individual LUNs and the type of Clariion array. For example, metaLUNs on the CX600 can comprise up to 16 LUNs, while only eight LUNs can be used for metaLUNs on the CX400 and CX200 models.
Shops that opt to offload volume management from the server to the array will also need the ability to extend or grow these volume groups as they fill to capacity. Midrange arrays from 3PAR, HP, EMC and other vendors provide dynamic volume group growth, but each handles it differently. 3PAR allows administrators to grow a volume by increasing it to the precise size desired. HP's EVA can start a volume at any size, ranging from 1GB to 2TB, and then grow the volume in 1GB increments.
EMC permits the dynamic growth of metaLUNs, but the process differs depending on how the metaLUN was created. If a LUN is added to a striped metaLUN, the Clariion will re-stripe the existing data across all of the LUNs now in the metaLUN. If a LUN is added to a concatenated metaLUN, it's appended to the end of the existing string of LUNs and new data is put on the new LUN; existing data isn't automatically redistributed across the new configuration of LUNs.
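The difference between the two growth behaviors follows from how each layout maps a logical block to a component LUN. The sketch below is a generic illustration of striping and concatenation, not EMC's actual on-disk layout; the function names, block counts and stripe element size are all assumptions.

```python
from bisect import bisect_right
from itertools import accumulate


def concat_map(lba: int, lun_sizes: list[int]) -> tuple[int, int]:
    """Concatenation: LUNs laid end to end, so sizes may differ.

    Returns (component LUN index, block offset inside that LUN).
    """
    ends = list(accumulate(lun_sizes))          # running end address of each LUN
    if not 0 <= lba < ends[-1]:
        raise IndexError("block outside the volume")
    idx = bisect_right(ends, lba)               # first LUN whose end is past lba
    start = ends[idx - 1] if idx else 0
    return idx, lba - start


def stripe_map(lba: int, lun_count: int, lun_size: int,
               stripe_blocks: int) -> tuple[int, int]:
    """Striping: fixed-size elements rotate across equal-size LUNs.

    Assumes lun_size is a multiple of stripe_blocks.
    """
    if not 0 <= lba < lun_count * lun_size:
        raise IndexError("block outside the volume")
    element, within = divmod(lba, stripe_blocks)
    row, idx = divmod(element, lun_count)       # round-robin across the LUNs
    return idx, row * stripe_blocks + within


if __name__ == "__main__":
    # Appending a LUN to a concatenated volume leaves old blocks where they were
    print(concat_map(120, [100, 50]), concat_map(120, [100, 50, 200]))
    # Adding a LUN to a striped volume changes where the same block belongs
    print(stripe_map(512, 4, 1024, 128), stripe_map(512, 5, 1024, 128))
```

In the concatenated case, block 120 maps to the same place before and after the new LUN is added, so no data has to move. In the striped case, the same block maps to a different LUN once the stripe width changes, which is why a striped volume has to be re-striped after expansion.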
The final step in growing virtual volumes is to configure the host server OS to discover the new size of the virtual volume. Some OSes such as Windows Server 2003 can do so dynamically, but users should exercise extreme caution by testing this functionality first and assuming any volume expansion will necessitate a reboot or, minimally, a rescan of the expanded volume to discover the additional capacity. Administrators should also check with the vendor to see how their OS handles the dynamic growth of volumes. Some array vendors report that data loss can occur if an OS doesn't recognize dynamic volume expansions.
Snaps and mirrors
Driven by SATA drives, shrinking backup windows and the need to create in-house disaster recovery procedures, array-based snapshot and mirroring are becoming common. Hibernia's Berthaut uses the synchronous mirroring capabilities of Xiotech's Magnitude Geo-Replication Services software between sites in New Orleans and Shreveport, LA, with a high degree of success.
Berthaut began using the synchronous mirroring feature as a stopgap measure because he was unsure how well this approach would work given the 700-mile round-trip distance between the two sites.
"The Windows and Novell servers are extremely tolerant of the latency, but I'm still looking for a more acceptable asynchronous solution," says Berthaut. "I am currently evaluating Xiotech's TimeScale rapid restore appliance, an asynchronous mirroring product, and I plan to use it to replace the current synchronous mirroring process."
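The latency concern with synchronous mirroring over distance can be estimated from physics alone. The sketch below assumes light travels through fiber at roughly 200,000km/sec, that the fiber path equals the 700-mile round trip cited above, and that each write needs exactly one round trip; switch, protocol and array delays are ignored, so the result is a floor rather than a prediction.

```python
KM_PER_MILE = 1.609344
FIBER_KM_PER_SEC = 200_000.0  # approximate speed of light in optical fiber


def round_trip_ms(round_trip_miles: float) -> float:
    """Minimum propagation delay added to every synchronous write."""
    return round_trip_miles * KM_PER_MILE / FIBER_KM_PER_SEC * 1000.0


def max_serial_writes_per_sec(round_trip_miles: float) -> float:
    """Ceiling on back-to-back writes when each waits for the previous one."""
    return 1000.0 / round_trip_ms(round_trip_miles)


if __name__ == "__main__":
    print(f"{round_trip_ms(700):.2f} ms added per write")
    print(f"{max_serial_writes_per_sec(700):.0f} serialized writes/sec at most")
```

Under these assumptions, the round trip adds at least about 5.6 milliseconds to every write and caps a single stream of dependent writes at fewer than 180 per second. File and print workloads can absorb that; a write-intensive database generally can't, which is the usual argument for asynchronous mirroring over long distances.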
As midrange arrays take on more monolithic attributes, users should look to deploy midrange arrays for more mission-critical storage applications. They offer low-cost disk, high levels of performance and availability, easy-to-use software and effective replication technologies.
3PAR Inc., Fremont, CA, uses clustering in its InServ Storage Server to ensure availability, improve performance and provide scalability in its arrays. These arrays offer capacities that would dwarf the specs of most midrange systems. But 3PAR adds a twist to its clustering architecture by adding thin provisioning to control the allocation of capacity.
Two models of 3PAR's arrays are available: the S400 and S800. The S400 (prices start at $100,000) can be configured with a cluster of up to four controller nodes, while the S800 can support up to eight controllers. In both systems, the controllers are linked to each other and to disk modules by a full-mesh backplane. Each controller has a dedicated 1GB/sec connection, so adding more controllers scales up capacity and performance. As would be expected in a cluster arrangement, if one controller fails, another controller node takes over the failed controller's tasks.
Controllers connect to drive chassis modules. An InServ server can support a maximum of 64 drive chassis with a total of 2,560 drives. Each InServ model offers broad capacity ranges, with both starting at a mere 600GB; top end for the S400 is 192TB, while the S800 can scale up to 364TB. The arrays can also be configured with substantial cache to improve performance--up to 40GB for the S400 (8GB for control and 32GB for data) and 80GB with the S800 (16GB control, 64GB data). InServ supports AIX, HP-UX, Linux, Windows, Novell NetWare and Sun Solaris.
The effectiveness of 3PAR's design has been confirmed with impressive performance numbers. On the Storage Performance Council's standard benchmark, SPC-1, an eight-node configuration scored 100,000 IOPS.
But thin provisioning looms as the real star of 3PAR's highlight film, leveraging the InServ controller's ability to virtualize all of the managed storage into a common pool. With thin provisioning, you can allocate more storage than is physically present, so an application can be allocated as much storage as it's likely to need over the long term. But while the application may think it has a large amount of storage, actual storage is only pulled from the pool on an as-needed basis when the application writes to disk. This helps users avoid overallocating physical resources, and additional disk purchases can be delayed until they're needed. When additional disk is required, it can be added to the array non-disruptively, without having to take controllers or other storage offline.
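The allocate-on-write idea behind thin provisioning can be sketched in a few lines. This is a toy model of the general technique, not 3PAR's implementation: the class, the fixed-size "chunk" unit and the method names are all hypothetical.

```python
class ThinPool:
    """Toy thin-provisioned pool: physical chunks are claimed on first write."""

    def __init__(self, physical_chunks: int) -> None:
        self.free = physical_chunks
        self.volumes: dict[str, dict] = {}

    def create_volume(self, name: str, virtual_chunks: int) -> None:
        # Creating a volume consumes no physical space, so the sum of
        # virtual sizes may exceed what the pool physically holds.
        self.volumes[name] = {"size": virtual_chunks, "mapped": set()}

    def write(self, name: str, chunk: int) -> None:
        vol = self.volumes[name]
        if not 0 <= chunk < vol["size"]:
            raise IndexError("write beyond the volume's virtual size")
        if chunk in vol["mapped"]:
            return                      # rewrite: space already backed
        if self.free == 0:
            raise RuntimeError("pool exhausted; add physical capacity")
        self.free -= 1                  # first write claims one physical chunk
        vol["mapped"].add(chunk)

    def add_capacity(self, chunks: int) -> None:
        """Grow the pool without touching existing volumes."""
        self.free += chunks


if __name__ == "__main__":
    pool = ThinPool(physical_chunks=10)
    pool.create_volume("db", 100)       # 200 virtual chunks promised...
    pool.create_volume("mail", 100)     # ...against 10 physical ones
    pool.write("db", 0)
    pool.write("db", 0)                 # rewriting uses no extra space
    pool.write("mail", 5)
    print(pool.free)                    # 8 physical chunks still unclaimed
```

The catch, visible in the `RuntimeError` branch, is that an oversubscribed pool has to be monitored: if applications write faster than disk is added, the pool runs dry even though every volume still appears to have free space.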
When storage vendors talk about modular storage, the "modularity" of the system refers mainly to the ability to plug in more capacity. Given a controller, you simply add more disk drives or disk shelves to get more capacity. If you start to max out the performance of a given controller, however, it's time to buy another modular array.
"To me, true modularity means being able to start in the basement apartment and go all the way up to the penthouse," says Arun Taneja, founder of Taneja Group, Hopkinton, MA. "I need to be able to grow in terms of capacity and performance."
San Jose, CA-based BlueArc Corp.'s storage platform scales not only in capacity, but in throughput and connectivity as well. As a high-performance NAS array, a single Titan SiliconServer comes with a baseline 5Gb/sec of throughput, but can scale to 20Gb/sec. Capacity-wise, the Titan supports a single file system of up to 256TB across four tiers of disk drives. Those tiers include two classes of enterprise-class disk, the 15,000 rpm and 10,000 rpm Fibre Channel (FC) disk drives, and two classes of SATA drives, the 7,200 rpm and 5,400 rpm models.
Furthermore, because of Titan's innovative blade-based design, it's relatively future-proof. For example, Titan currently connects to hosts using Gigabit Ethernet (GbE) blades. But when 10GbE becomes the norm, Titan users can simply swap out GbE blades for the 10 gig models. "We're always a blade away," says Mike Gustafson, BlueArc president.
One BlueArc innovation that pre-dates Titan is the implementation of its file system in silicon on a field-programmable gate array (FPGA). Prior to Titan, the FPGA-based file system was what allowed BlueArc to claim superior performance to other NAS arrays on the market. Now those same FPGAs have been ported to two separate blades.
Simply put, the BlueArc system is for users who need high-performance file serving, such as in the life sciences arena or for Internet service providers.
Need speed, but don't want to pay a Ferrari price? RIO Xtreme is the first product to result from Dot Hill Systems Corp.'s acquisition of Chaparral Networks last year, and is perfect for organizations that have high sequential data streaming needs such as audio/video editing or seismic processing.
The first RIO Xtreme model was the dual-controller C4200, with eight 2Gb/sec FC ports. Starting with a single 12-drive chassis, the C4200 offers 1.75TB in a 3U package that can be expanded to 28TB with the maximum 16 drive shelves. In January, the dual-controller C4400 was introduced with 16 host ports and a choice of FC, SCSI or SATA disk drives.
The C4400 model is equipped with two 2U, 12-drive JBODs, and delivers 780MB/sec from disk and even more from cache. In contrast, a comparable SANnet II, Dot Hill's more mainstream FC array, delivers 380MB/sec, or less than half the streaming throughput. Furthermore, the C4400 can also be purchased with a separate Emulex InSpeed switched bunch of disks (SBOD) loop switch. An 11U configuration consisting of the controllers, two SBOD switches and four disk drive shelves produces 1,300MB/sec performance from disk.
Performance is RIO Xtreme's strong suit, and it comes at a reasonable price. A RIO Xtreme with FC JBODs and 3.5TB of capacity costs $82,512; with 9.6TB of SATA disks, the list price is $88,588. The key to keeping costs down is Dot Hill's "switchless SAN" strategy, where you simply attach hosts directly to the storage array rather than to a switch, thereby eliminating the need for a costly switch infrastructure.
Another way of paring costs--without impacting streaming performance--is using lower-priced disk drives. While previous RIO Xtreme versions used only FC or SCSI drives, the latest model supports SATA. Omar Barraza, Dot Hill's director of marketing, anticipates the SATA drive option will be popular with customers who have largely sequential data access needs, as opposed to more write-intensive IOPS operations.
The SATA drive option also plays well to customers with large capacity needs. Using 400GB disk drives, the RIO Xtreme's 192 drives deliver 78TB of raw capacity.
The Clariion CX700 represents the top of EMC Corp.'s modular storage line. When pundits talk about modular storage encroaching on high-end enterprise storage, they're referring to products like the CX700 (starting price is $100,000), which is bundled with sophisticated replication and management software.
"The CX is the most successful product in this market segment," says Randy Kerns, senior partner, the Evaluator Group, Greenwood Village, CO. EMC packs the CX700 and smaller CX500 with a broad set of software, including the Navisphere Management Suite, SnapView, MirrorView, SAN Copy, PowerPath, VisualSAN and VisualSRM. "It has a comprehensive suite of data management software, extensive application and operating systems support, and good performance. It also supports FC and ATA drives within a single system," notes Tony Asaro, senior analyst, Enterprise Strategy Group (ESG), Milford, MA.
Kerns is impressed with the CX line's asynchronous and synchronous remote replication, particularly its support for consistency groups. This capability ensures write order is maintained when replicating data, a critical feature when replicating databases.
Currently, the CX700 offers both FC and ATA drives. Jay Krone, director of Clariion marketing, says the company is committed to following the ATA technology, which implies it will be moving to SATA. The company's design principle for Clariion dictates using only off-the-shelf components rather than custom-developed ASICs, which leaves the product a step behind the leading edge. The same design principle, however, also allows the company to drive down the price and ensure reliability.
Although EMC doesn't offer iSCSI for the CX700, it recently announced iSCSI for the lower end of the Clariion CX line (CX300 and CX500). Analysts have welcomed the announcement. "EMC supporting iSCSI is important because in many people's eyes, this legitimizes iSCSI," says Asaro.
iSCSI aside, Asaro complains that the company may be a little slow in moving to new technology. "The Clariion is a great product for the last generation of storage systems," he says. "ESG recommends that EMC add more advanced software, such as thin provisioning, snapshot copies and n-way scalability." Thin provisioning provides dynamic capacity allocation during write operations. N-way scalability requires the array to support more than two controllers. At this point, few modular arrays provide n-way scalability or thin provisioning, although that's coming, adds Kerns.
When EqualLogic Inc. introduced its PeerStorage arrays, the so-called PS line was recognized as a fast, simple way to implement a SAN. The arrays are based on iSCSI, and the company boasted that users needed no special skills and that the box could be set up in 20 minutes without training.
The PS100E, with dual controllers and 3.5TB of SATA drive capacity, costs approximately $40,000. Although typically equipped with 7,200 rpm SATA drives, a firmware enhancement now allows the PS line to support 10,000 rpm drives.
In addition to the hardware, the PS100E includes an extensive set of browser-based management capabilities, including point-in-time copies, volume cloning, self-healing disk technology that identifies failing disks and activates spares residing in the cabinet, multipath I/O and security (CHAP support). Built-in load balancing automatically distributes the workload across all disks as new capacity is added, even splitting previously existing volumes across the new disks. As a result, the PS line can scale simply by plugging in more arrays. The PS100E's management console, however, is proprietary and doesn't support CIM, which poses some challenges if you want to manage the storage environment with a different management tool.
"EqualLogic is a young company, but it's starting to get traction," says Dianne McAdam, senior analyst and partner at Data Mobility Group LLC, Nashua, NH. The product's appeal to this point has been to the small- and medium-sized business (SMB) market, based on its low cost and easy installation.
However, it's the scalability of the PS line that has attracted attention of late. "EqualLogic storage systems have a network clustered architecture that allows customers to add nodes to scale. By adding nodes, customers get additional processors, bandwidth, cache memory and capacity in a near linear fashion," says Tony Asaro, senior analyst at Enterprise Strategy Group (ESG), Milford, MA.
In a 2004 study, ESG found that a single PS array can support 50,000 IOPS out of cache. A 25-array system produced 1.2 million IOPS out of cache. Theoretically, a 32-array PS system should achieve 1.6 million IOPS out of cache, ESG calculated.
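The ESG figures can be cross-checked with a short calculation. The sketch below uses only the numbers quoted above; the function names are illustrative, and the second projection (scaling the measured 25-array result instead of the single-array result) is an alternative reading added here for comparison, not something ESG is reported to have calculated.

```python
def linear_projection(single_array_iops: int, arrays: int) -> int:
    """Perfectly linear scaling from one array's result."""
    return single_array_iops * arrays


def scaling_efficiency(measured_iops: int, arrays: int,
                       single_array_iops: int) -> float:
    """Measured throughput as a fraction of perfectly linear scaling."""
    return measured_iops / linear_projection(single_array_iops, arrays)


if __name__ == "__main__":
    single, measured, measured_arrays = 50_000, 1_200_000, 25

    eff = scaling_efficiency(measured, measured_arrays, single)
    print(f"25-array efficiency: {eff:.0%}")

    # Projection from the single-array figure (matches the 1.6 million quoted)
    print(linear_projection(single, 32))

    # More conservative projection from the measured per-array average
    print(measured // measured_arrays * 32)
```

The 25-array result works out to 48,000 IOPS per array, or 96% of linear scaling. Extrapolating from the single-array figure gives the 1.6 million IOPS quoted for 32 arrays; extrapolating from the measured per-array average gives a slightly lower 1,536,000.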
PS array pricing is competitive for its class: "iSCSI inherently reduces cost and complexity since expensive Fibre Channel gear is no longer required," says Asaro.
Hewlett-Packard (HP) Co.'s StorageWorks Enterprise Virtual Array (EVA) is in many ways a pioneering example of modular arrays. Originally introduced in 2001 by Compaq as VersaStor, the EVA is arguably still the only "virtual array" available from a major storage vendor. A virtual array, in HP parlance, goes beyond simply binding physical disks into a RAID group. Instead, the EVA presents all its capacity as a virtual pool of blocks from which users can carve up pools with different availability and performance requirements, while the EVA manages the underlying physical disk resources.
The two current EVA offerings are the EVA3000 and the EVA5000. The EVA3000 scales to 56 disk drives or a maximum of 16.8TB of raw capacity using 300GB FC disk drives. The EVA5000 supports up to 240 disk drives for a maximum of 70TB raw. EVA features include dynamic capacity expansion--which allows an administrator to allocate capacity in 1GB increments--virtual disk data load leveling as a non-disruptive background activity and distributed sparing of disk capacity.
Last year, HP added support for Fibre-Attached Technology Adapted (FATA) drives. The 250GB drives are based on low-cost desktop disks, but connect using FC and can plug into any existing EVA drive shelf without modification or bridging. Initially, about 10% of new EVA customers chose FATA disks; over time, HP expects FATA to account for about 25% of EVA drives.
Last month, HP announced a major enhancement to the EVA's controller-based replication software, HP StorageWorks Continuous Access EVA. Part of the HP StorageWorks Business Continuance Software Solutions family, Continuous Access EVA now works with new Metrocluster and Continentalcluster software, which allows users to fail over storage and servers across geographically dispersed locations.
The EVA line will be refreshed by midyear, according to Kyle Fitze, director of marketing for HP's online storage division, with a focus on scalability, performance, interconnect technology and additional disaster recovery capabilities, such as automatic failover across clusters of servers and storage. HP's long-term challenge will be defining what part EVA plays in its vision for grid-based storage. Certainly, the plan would allow using an existing EVA as part of the StorageWorks grid, Fitze says.
Last fall, Hitachi Data Systems (HDS) Corp. introduced the 9520V, an all-SATA version of its Thunder 9500 family geared toward small- to medium-sized businesses (SMBs). It can be bought with single or dual controllers, up to 2GB of cache and can support up to 59 250GB disks for a total of 13TB.
Despite being a SATA box, the 9520V has much in common with the rest of the Thunder family, the 9585V and 9570V. Virtual ports allow users to connect many more hosts to the array than just through its physical ports (four front-end 2Gb/sec ports, with subscription of around 2.5 hosts per port). The 9520V also supports almost the entire library of Thunder software--an important consideration for current HDS customers.
Missing from the 9520V's software arsenal is TrueCopy Remote Replication, a real-time synchronous data replication function that HDS didn't port because it's "cache-intensive," says Jeff Hill, HDS' director of infrastructure product marketing. The 9520V is limited to only 2GB total cache. TrueCopy is also cash-intensive--pun intended--and not typically needed by SMB customers, Hill adds. A dual-controller starter configuration with 3.4TB of raw capacity and resource management software costs between $22,000 and $24,000.
HDS has gone to great lengths to make sure the 9520V is no less reliable than its other models, despite its use of SATA drives. Unlike EMC's AX100, a competitive product, it can be equipped with dual controllers--a high-availability feature. Furthermore, HDS has done extensive work to enhance the reliability and lifespan of SATA drives by adding features such as verify-on-write. After periods of extended inactivity, disk drive heads are lifted or parked to prolong disk drive life.
Just because the 9520V has been dubbed an SMB product doesn't mean that's where it will always end up. "We expect to sell some units [into enterprise data centers] as part of a tiered storage infrastructure," says Karen Sigman, HDS' vice president of global channels. To that end, it's important to note that the 9520V is one of the arrays supported by the HDS TagmaStore Universal Storage Platform, which can virtualize certain storage assets in its domain.
When IBM Corp. introduced the DS6000 last fall, it had to share the stage with its more muscular sibling, the high-end DS8000. And while it was expected that the DS8000, with its partitioning capabilities and enterprise-class capacity and performance, would garner most of the kudos, it was the DS6000's exceptional modularity that grabbed the spotlight.
The DS6000 effectively stretches the concept of a midrange storage system downward and upward. In its most modest configuration, it houses a mere 292GB of disk space, but can grow up to 74TB, treading into enterprise territory.
Craig Butler, IBM brand manager for midrange storage products, insists the DS6000 is at the bottom of the enterprise class, while IBM's older DS4500--part of the FAStT product line--is at the top of the midrange class (see IBM's "more choice" marketing plan).
In building the DS6000, IBM borrowed heavily from its server technologies, most notably incorporating the same 64-bit PowerPC processors used in its eSeries servers. The result is a base system that will fit in a single 3U slot in an industry standard 19-inch rack. Expansion units--each housing up to 16 additional drives--are similarly sized and a fully tricked-out rack can accommodate a maximum of 248 drives.
IBM also says the DS6000 can be installed in about an hour without onsite service personnel. Upgrades and maintenance can be handled by users with the aid of diagnostic and self-healing features that have been road-tested in IBM servers. Upgraded management software, now called IBM TotalStorage DS Storage Manager, is enhanced with improved interfaces and wizards to help users take care of configuration, operations and maintenance.
The DS6000's operating system code is nearly identical to that of the DS8000, which runs about three-quarters of the ESS series' code. This ensures operational and functional compatibility among DS and ESS machines, allowing users to run nearly all the same software across systems. Beyond its technical importance, letting the DS6000 run the same applications as its enterprise-class sibling dramatically ups the ante for all midrange arrays: buyers will expect more functionality at a lower cost. In addition, the DS6000 supports mainframes and open-systems servers, which is rare among midrange storage systems.
The DS6000 is one of the most expandable, modular storage arrays available. All that's missing is a track record. The DS6000 is still too new to rate its success against other vendors' offerings and IBM's own popular FAStT arrays, which it's positioned to replace.
Seattle-based Isilon Systems Inc.'s IQ modular arrays are specially designed to store large image files. Isilon uses a symmetrical clustering architecture where all nodes act as peers to deliver a modular approach to capacity and performance scaling.
The Isilon IQ comes in two models--the IQ 1440 and the IQ 2250--differentiated only by their storage capacities. The basic building blocks of both models are 2U rack-mountable nodes, each with its own Intel Xeon 2.8GHz processor, 4GB of cache, disks and four GbE NICs. Both models start with a three-node configuration and expand to 21 nodes, which fill a single rack. In this manner, the IQ 1440 scales from a base system of 4.3TB to 30.2TB, while the higher-end 2250 goes from 6.75TB to a maximum capacity of 47.3TB.
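The capacity figures above are consistent with simple per-node arithmetic. As a rough check, the Python sketch below derives per-node capacity from each three-node base system and projects it to a full 21-node rack. The function name and the assumption that every node contributes equal capacity are ours, not Isilon's.

```python
# Rough check of the quoted IQ capacities, assuming (our assumption)
# that every node in a cluster contributes the same capacity.

def projected_capacity_tb(base_tb: float, base_nodes: int, target_nodes: int) -> float:
    """Scale a base configuration linearly to a larger node count."""
    per_node_tb = base_tb / base_nodes
    return per_node_tb * target_nodes

# Base capacities (three nodes) and quoted maximums (21 nodes) from the text.
models = {
    "IQ 1440": {"base_tb": 4.3, "quoted_max_tb": 30.2},
    "IQ 2250": {"base_tb": 6.75, "quoted_max_tb": 47.3},
}

for name, spec in models.items():
    projected = projected_capacity_tb(spec["base_tb"], 3, 21)
    print(f"{name}: projected {projected:.2f}TB vs. quoted {spec['quoted_max_tb']}TB")
```

Both projections land close to the quoted maximums, which suggests the published figures are simply rounded multiples of a fixed per-node capacity.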
Because each node is essentially a fully configured storage system, performance increases almost linearly as nodes are added to the cluster. Adding a node is a simple affair, requiring little more than plugging in power and the Ethernet connection. In its report on the Isilon IQ, the Milford, MA-based Enterprise Strategy Group noted that adding a new node "takes less time than setting up a DVD player."
As a new node is added, the cluster members automatically acknowledge each other and begin working as a unit. When a node enters the cluster, it inherits existing configuration information and policies. Isilon's AutoBalance feature then redistributes stored data among all active nodes. All storage is pooled and managed by a single file system, the OneFS 3.0 distributed file system.
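To picture what redistribution means when a node joins, consider the generic sketch below. It is not Isilon's AutoBalance algorithm, whose internals aren't described here; it simply computes how much data each node would gain or shed to reach an even spread. The function name and the capacity figures are hypothetical.

```python
from typing import List

# Generic even-rebalance sketch; NOT Isilon's actual AutoBalance algorithm.

def rebalance(used_tb: List[float]) -> List[float]:
    """Return the per-node change (in TB) that evens out stored data.

    Negative values mean data leaves the node; positive values mean
    data arrives. The changes sum to zero (within rounding).
    """
    target = sum(used_tb) / len(used_tb)
    return [target - used for used in used_tb]

# Three existing nodes hold 1.2TB each (hypothetical figures);
# a fourth, empty node joins the cluster.
cluster = [1.2, 1.2, 1.2]
cluster.append(0.0)

moves = rebalance(cluster)
for node, delta in enumerate(moves, start=1):
    print(f"node {node}: {delta:+.2f}TB")
```

In this toy case each original node gives up a quarter of its data and the newcomer ends up holding the same amount as its peers.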
The IQ's ease of use and scalability, combined with a starting price of less than $50,000 for a three-node cluster, makes it an attractive option. Some users might balk at committing to a storage system based on ATA disks, but Isilon's data protection schemes seem to mitigate some of that risk. In any event, the company says the next generation of the IQ will support SATA disks. Perhaps then Isilon will boost the maximum capacity of the IQ system to the levels offered by other midrange clustered systems, especially if it wants to be a serious player in the rapidly growing digital imaging content field.
Network Appliance (NetApp) Inc.'s FAS270, which starts at $27,500 for a single enclosure with 14 drives, represents the high end of NetApp's entry-level FAS200 product lineup. As such, NetApp refers to it as a midrange machine, mainly citing its clustering option. With a limit of 4TB per enclosure, it clearly falls into the low end of what most consider midrange storage. To move beyond 6TB, you'll need to build out additional disk enclosures.
Versatility and flexibility rather than scalability are the real strengths of the FAS270. It provides concurrent file access--like a NAS box--and block storage access as would be required by a database. It can also handle simultaneous access via iSCSI and FC, effectively combining NFS, CIFS, FC and iSCSI under the same management interface.
NetApp offers "a wide range of software providing snapshot copies, remote mirroring, write once, read many [WORM] capability and more," says Tony Asaro, senior analyst, Enterprise Strategy Group. Asaro is also impressed with three new capabilities: thin provisioning, which can be used to maximize existing storage capacity and simplify provisioning; FlexClone, which makes replications of data volumes that are readable and writable from different snapshots, effectively reducing the amount of storage needed for copies; and storage virtualization, which allows FAS to create a single pool of managed storage from heterogeneous storage systems.
The FAS270 uses NetApp's Data Ontap, a proprietary storage operating system. Unlike many of the major storage vendors' proprietary operating systems, Data Ontap runs across the entire NetApp product line. Although Data Ontap began as a file-sharing system, it has been enhanced to handle block-oriented FC and iSCSI protocols. NetApp uses the same management and GUI across all of its systems.
Some analysts have complained that any storage operating system that handles file and block protocols invariably suffers from lower performance due to the extra overhead. Asaro discounts those complaints. And given the entry-level and low midrange aspirations of the FAS270, whatever small performance degradation occurs would have minimal impact. This product wasn't designed for high-volume transaction processing. Rather, it's more likely to be found in a remote branch office where performance requirements are lower.
Asaro gives the FAS270 an excellent rating as a midrange solution, but has some concerns. "It doesn't scale to the highest levels. On the NAS side, NetApp has limited file system size, which creates a management challenge with customers that have a large number of FAS systems," he says.
Silicon Graphics Inc. (SGI), Mountain View, CA, introduced the InfiniteStorage TP9500 at the Super Computing Show in November 2002. It's bundled with SGI's software for applications that demand high performance. SGI sells the array in three different configurations: the SGI InfiniteStorage SAN 3000, the InfiniteStorage NAS 3000 and a data lifecycle management offering.
The SAN 3000 sports two metadata controllers, SGI's CXFS SAN file system, a Brocade 3800 FC switch with the option of a duplicate for failover, dual HBAs and multiple RAID options. CXFS is a high-performance 64-bit file system that supports file sizes up to 9 million terabytes and file systems up to 18 million terabytes. For heterogeneous SAN support, SGI resells AppIQ's StorageAuthority management software.
The NAS 3000 offers up to eight GbE ports, and instead of CXFS it uses SGI's standard 64-bit file system, XFS, which supports NFS and CIFS. The SAN and NAS bundles include SGI's cluster configuration, snapshot and remote mirroring software.
Craig Schultz, SGI storage product manager, says XFS supports NAS and SAN. "[A user] would simply add a Fibre Channel [FC] HBA for SAN and NAS connectivity," he says. Other NAS products tend to have their own file system and are difficult to integrate into a SAN, usually resulting in separate management. Likewise, users can go in the opposite direction and add CIFS and NFS support to CXFS. "The two file systems are interchangeable and work together," says Schultz.
A third bundle includes the SAN or NAS option with SGI's DLM Server, which includes its Data Migration Facility software layered on top of the file system. This product migrates files from online storage to nearline storage based on user-defined criteria such as time of last access or file size. The downside is that it can only be hosted on SGI IRIX or Linux servers.
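A migration policy of this kind boils down to a test applied to each file. The sketch below is a generic illustration in Python, not SGI's Data Migration Facility interface; the thresholds, class and function names, and file paths are all hypothetical.

```python
from dataclasses import dataclass

# Hypothetical thresholds; in practice the criteria are user-defined.
MAX_IDLE_DAYS = 90
MAX_SIZE_BYTES = 1 * 1024**3  # 1GiB

@dataclass
class FileInfo:
    path: str
    size_bytes: int
    days_since_access: int

def should_migrate(f: FileInfo) -> bool:
    """Flag a file for nearline storage if it is stale or large."""
    return f.days_since_access > MAX_IDLE_DAYS or f.size_bytes > MAX_SIZE_BYTES

files = [
    FileInfo("/data/render/frame_0001.tif", 40 * 1024**2, 200),  # stale
    FileInfo("/data/sim/run42.out", 5 * 1024**3, 2),             # large
    FileInfo("/data/docs/readme.txt", 4 * 1024, 1),              # stays online
]

to_migrate = [f.path for f in files if should_migrate(f)]
print(to_migrate)
```

Combining the two criteria with "or" is only one choice; a site could just as easily require a file to be both old and large before moving it.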
The TP9500 disk enclosure can house 2TB of storage in three standard EIA units (5.25 inches), 20TB in a standard rack and 32TB in a single system. The 5884 controller, manufactured by Engenio Information Technologies Inc., is a dual-controller module that has eight 2Gb/sec FC ports and 800MB/sec of host and drive-side bandwidth, 2GB of dedicated cache, parity RAID and a specialized I/O control processor that focuses on data movement. SGI sells the TP9500 with 2TB of FC disk for a list price of $100,000, or $150,000 for 8TB of FC disk.
To appreciate Storage Technology (StorageTek) Corp.'s StorageTek FlexLine family of products, you need to sort through the genealogy of several companies as well as the technology. The FLA200 series, shipping now, and the FLA300, which will be released in April, are rebranded versions of StorageTek's D series. The underlying technology is from LSI Logic, now known as Engenio Information Technologies Inc.
The technology is being refreshed with new chipsets, enhanced software and more redundancy. In addition, StorageTek has refined the product to integrate with its tape devices and mainframe storage as part of a complete information lifecycle management (ILM) offering.
Reaction from the analyst community has been generally good. "We see the new FlexLine 200 and 300 series as having the potential to make StorageTek a player," says Dave Reine, director of enterprise systems at The Clipper Group Inc., Wellesley, MA. "They're going with a common architecture using SATA and Fibre Channel in the same box. You can start inexpensively and grow as you need. In short, it is an ideal approach for an ILM strategy."
The Enterprise Strategy Group (ESG), Milford, MA, also welcomes what amounts to a revival of the StorageTek disk storage line. "The StorageTek FlexLine, like the [EMC] Clariion, is customer-proven in a large number of companies, applications and environments. It has an extensive set of software features, is easy to use and provides very competitive performance," says Tony Asaro, an ESG senior analyst.
The FLA200 and the upcoming FLA300 offer 2Gb/sec FC switched, access-centric disk arrays. The FLA200 line starts with 14 SATA disk drives and scales to a total of 112 SATA drives. It offers a maximum sustained rate of 53,200 IOPS (FC) and 6,000 IOPS (SATA). FLA200 series prices start at approximately $10,000.
The arrays include StorageTek's SANtricity software, which lets users mix drive types in the same system and perform asynchronous, remote disk-to-disk mirroring.
The product line has been revamped, but still lacks thin provisioning and iSCSI connectivity. It could also benefit from larger cache memory, notes Asaro. StorageTek is partnering with BlueArc Corp. and ONStor Inc. to integrate a NAS head for file storage.
The StorEdge 6920 is Sun Microsystems Inc.'s first array with virtualization capabilities based on the technology it acquired with its 2002 purchase of Pirus Networks. The 6920 enables the pooling of storage from internal hard drives and external arrays, with an abstraction layer that separates data services like replication and snapshot from the hardware.
"Abstracting the hardware from the data services allows users to put whatever hardware they want on the back end and gives enough firepower to run high-performance applications," says Chris Wood, director of technical sales and marketing for Sun's Global Network Storage Division.
But today, the 6920 works only with Sun's StorEdge 6120s and 6130s. It will eventually support all Sun arrays and third-party products, Wood says. The 6920 starts at 4TB, scales to 65TB and provides up to 28 2Gb/sec FC ports. It can create pools of storage for up to 14 different application profiles, with I/O and caching tuned to each application. The ability to define storage partitions is useful for quality of service and SLA management. Pricing ranges from $190,000 for 2TB to $421,000 for 8TB.
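The two price points quoted above imply very different per-terabyte costs. The short Python sketch below works through the arithmetic using only those figures; the function names are ours.

```python
# Price points quoted in the text: (capacity in TB, list price in USD).
small = (2, 190_000)
large = (8, 421_000)

def price_per_tb(capacity_tb: float, price_usd: float) -> float:
    """Average list price per terabyte for one configuration."""
    return price_usd / capacity_tb

def marginal_price_per_tb(a, b) -> float:
    """Cost of each additional TB when moving from config a to config b."""
    return (b[1] - a[1]) / (b[0] - a[0])

print(f"2TB config: ${price_per_tb(*small):,.0f} per TB")
print(f"8TB config: ${price_per_tb(*large):,.0f} per TB")
print(f"Each TB beyond 2TB: ${marginal_price_per_tb(small, large):,.0f}")
```

On these numbers the average price drops from $95,000 to $52,625 per terabyte, and each terabyte added beyond the first two costs $38,500, which suggests that a large share of the entry price covers the controllers and data services rather than the disk itself.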
Sun is expected to announce version two of the 6920 in May, which will feature synchronous and asynchronous long-distance data replication, and more refined snapshot management capabilities. Support for consistency groups is also expected, which is important for database applications.
The 6920 supports a scale-out architecture, so controllers and capacity can be added while maintaining a single image of the storage system. "This simplifies the administration of the product, unlike other products that force users to add additional complete systems that must be managed separately," says Randy Kerns, senior partner at the Evaluator Group, Greenwood Village, CO.
The 6920 currently supports only 1,024 LUNs, but is expected to jump to 16,000 in May. It also lacks a clone copy option for point-in-time copies, a feature that lets database users run analyses against a cloned copy while production data stays online.
Clustering is a storage technique designed to offer more flexibility, scalability and security. The distributed clustering architecture of Xiotech Corp.'s Magnitude 3D system can grow non-disruptively to meet increasing requirements.
In Xiotech's design, the N-way Dimensional Storage Cluster architecture, storage controllers are loosely coupled to each other to provide failover and data migration services. Each controller node attaches to all installed drive bays over dual Fibre Channel (FC) connections and also manages its own storage functions.
The Magnitude 3D's drive bays can accommodate from two to 14 drives each (12 for SATA), and the drives can vary in capacity and performance. FC disks of up to 300GB are supported, as well as 400GB SATA disks. Xiotech offers two classes of FC drives, running at 10,000 rpm and 15,000 rpm, respectively, in addition to the SATA offerings.
However, says Rob Peglar, VP of technical solutions and chief technologist at Xiotech, there's no tiering of controller resources per se in the Magnitude 3D distributed cluster architecture. This is because any controller in the cluster can assume the duties of another at any time without application interruption or downtime. A virtual port may be assigned to a specific controller, which lets an admin control the tier of one server's I/O stream relative to another server's I/O stream.
There's also plenty of built-in redundancy protection. If one Magnitude 3D controller should fail or need to be taken offline for maintenance, the remaining active controllers can assume the operations of the inactive unit.
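The general idea of survivors absorbing a failed controller's work can be sketched in a few lines of Python. This is a generic round-robin illustration, not Xiotech's failover logic; the controller and virtual port names are hypothetical.

```python
from typing import Dict, List

def fail_over(assignments: Dict[str, List[str]], failed: str) -> Dict[str, List[str]]:
    """Spread the failed controller's workloads across the survivors, round-robin."""
    survivors = [c for c in assignments if c != failed]
    if not survivors:
        raise RuntimeError("no active controllers remain")
    # Copy each survivor's list so the original mapping is left untouched.
    result = {c: list(assignments[c]) for c in survivors}
    for i, workload in enumerate(assignments[failed]):
        result[survivors[i % len(survivors)]].append(workload)
    return result

# Hypothetical three-controller cluster with virtual ports assigned to each.
cluster = {
    "ctrl-A": ["vport-1", "vport-2"],
    "ctrl-B": ["vport-3"],
    "ctrl-C": ["vport-4", "vport-5"],
}
after = fail_over(cluster, "ctrl-B")
print(after)
```

A production design would also weigh each survivor's current load before handing it more work; round-robin is used here only to keep the sketch short.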
A wide variety of server operating systems are supported, including Windows, Linux, Novell NetWare and a number of Unix variations, including HP-UX, IBM AIX, Sun Solaris and SGI IRIX. Xiotech also offers the Magnitude 3D RAC Pack, a version that's specifically configured to integrate with Oracle's RAC. Xiotech recently announced a NAS version of the Magnitude 3D based on Microsoft Windows Storage Server 2003.
The Magnitude 3D is managed by Xiotech's Intelligent Control (ICON) application, a browser-based program that runs out-of-band on an Ethernet-based management path. A single installation of ICON can manage up to six clusters.
In its current incarnation, Magnitude 3D is limited to four controllers, but the plan is to ultimately support 16. In its evaluation of the product, the Enterprise Strategy Group, Milford, MA, notes that "as Magnitude 3D supports more nodes it will become extremely scalable."
Last month, Xiotech unveiled an entry-level model, the Magnitude 3D 1000e. Intended for small enterprises and first-time SAN implementers, the 1000e is a dual-controller array with a starting price of less than $50,000.
About the author:
Jerome M. Wendt is a storage analyst specializing in the fields of open-systems storage and SANs. He has managed storage for small- and large-sized organizations in this capacity.
Alex Barrett, Rich Castagna, Jo Maitland and Alan Radding also contributed to this Special Report.
Analysts and vendors say there will be a steady stream of technical enhancements to midrange arrays this year, such as a greater use of ATA disks, a shift to lower-priced SATA drives, the adoption of SAS drives as they become available, and the combination of different disk drives such as Fibre Channel, SCSI, ATA/SATA and serial-attached SCSI (SAS) in the same cabinet. In addition, a few vendors have started to add iSCSI, thin provisioning (dynamic capacity allocation during write operations) and n-way scalability, which requires support for more than two controllers.
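Thin provisioning is easiest to grasp in miniature. The Python sketch below models a volume that advertises a large size but draws on the physical pool only when a block is first written. It is a simplified illustration rather than any vendor's implementation; the class name, block counts and single-volume pool are assumptions made for brevity.

```python
class ThinVolume:
    """A volume that advertises a large size but consumes pool
    capacity only when a block is first written."""

    def __init__(self, advertised_blocks: int, pool_free_blocks: int):
        self.advertised_blocks = advertised_blocks
        self.pool_free_blocks = pool_free_blocks
        self.blocks = {}  # logical block number -> data

    def write(self, lbn: int, data: bytes) -> None:
        if not 0 <= lbn < self.advertised_blocks:
            raise IndexError("block outside advertised size")
        if lbn not in self.blocks:  # first write: allocate now
            if self.pool_free_blocks == 0:
                raise RuntimeError("physical pool exhausted")
            self.pool_free_blocks -= 1
        self.blocks[lbn] = data

    def read(self, lbn: int) -> bytes:
        # Unwritten blocks read back as zeros and use no physical space.
        return self.blocks.get(lbn, b"\x00")

    @property
    def allocated_blocks(self) -> int:
        return len(self.blocks)

# Advertise a million blocks while backing the volume with only 1,000.
vol = ThinVolume(advertised_blocks=1_000_000, pool_free_blocks=1_000)
vol.write(0, b"a")
vol.write(999_999, b"b")
vol.write(0, b"c")  # rewrite: no new allocation
print(vol.allocated_blocks, vol.pool_free_blocks)
```

The trade-off is visible in the `RuntimeError` branch: because the volume promises more than the pool holds, administrators have to watch pool consumption and add disk before writes start failing.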
High performance will increasingly become part of the midrange storage environment. "We're just beginning to see 10,000 rpm SATA drives," notes Brian Garrett, technical director at the Enterprise Strategy Group's lab in Milford, MA. With 10,000 rpm SATA drives, midrange storage can provide sufficient performance for large database transaction applications.
Looking out further, 2.5-inch disk drives will also appear in midrange storage. "You'll see them both in blade servers and for regular storage arrays," says Garrett. But don't expect a large number of 2.5-inch drives in midrange arrays any time soon. "It will take several years for the costs to fall in line," advises Jay Krone, EMC Corp.'s director of Clariion marketing.
Storage vendors will continue to use software to differentiate their midrange and enterprise systems. But software differences are narrowing fast, as most midrange storage systems already have replication, snapshot and sophisticated management capabilities. The only debate is whether these capabilities rival those of enterprise systems. "For now, there are still things an EMC Symmetrix can do that an EMC Clariion can't," insists Mike Karp, senior analyst, Enterprise Management Associates, Boulder, CO.
"We have replication on the Clariion and the Symmetrix, but the software is more mature, more robust on the Symmetrix," notes Krone, who adds that "the Symmetrix can handle more clients--thousands rather than hundreds--and support more simultaneous applications--dozens rather than just a couple." The Symmetrix also has greater depth of redundancy, enabling it to absorb multiple component failures at different points simultaneously. Mainframe connectivity is another feature that distinguishes monolithic from most midrange arrays.
Software differences, however, are being further obscured with the introduction of IBM Corp.'s DS6000 array, a modular storage system that bridges midrange and enterprise storage. "The DS6000 has the same microcode as Shark [the former code name for IBM's high-end Enterprise Storage Server (ESS) line]," says Cindy Grossman, director of disk marketing for IBM Storage Systems. By running the same microcode, the DS6000 can use the same advanced software IBM provides for its ESS products. "Now modular can equal enterprise," she says.
This may impact the entire storage industry. "That IBM ported high-end enterprise code to a lower cost modular platform is significant. Users don't want to manage different systems," says Rob Schafer, senior program director at the Meta Group, Stamford, CT. Other vendors may be forced to follow, further erasing operational distinctions between modular and enterprise systems while retaining lower midrange pricing.
Pricing continues to be a key differentiator for midrange storage. Garrett sees the adoption of SATA drives leading to price reductions of as much as 25% for modular storage.
Despite increased disk capacity, faster disks, higher performance, increased flexibility and sophisticated software capabilities, midrange storage may still not be ready to replace enterprise storage for all applications. "Modular storage will take over at the enterprise level when the services are as capable as those at the high end," says Karp. With high-end microcode now available on modular storage, that day may come sooner rather than later.
About the author:
Alan Radding is a frequent contributor to Storage magazine.