Modular arrays earn new trust

Modular arrays have come a long way recently, but are you ready to risk all of your company's mission-critical data on them?

The bid sheet for the modular arrays slides across the conference room table, prompting eyebrows to rise. The storage manager looks the vendor squarely in the eye and says, "The price looks good, but here's the bottom line: We're betting our business on your modular arrays, so we've got to know, can we trust them?"

Increasingly, the answer is "yes." Modular arrays have come a long way in recent years. Analyst firms such as Gartner Inc., Stamford, CT, and The Evaluator Group, Greenwood Village, CO, now pin the enterprise-class label on modular arrays. Vendors promise the same or greater levels of availability, capacity, management and performance for their modular arrays as found on their monolithic arrays.

Toss in faster deployment, smaller footprints and lower price points and the decision would almost seem like a no-brainer, except for the critical environments into which users may look to deploy these arrays.

If users are to feel confident about moving their most critical data onto modular arrays in whatever environment they deploy them, they need to know the pros and cons of such a tactic. Management, security, performance and support along with price all factor into the equation. So let's look at the distinctions between modular and monolithic arrays, what features modular arrays now offer, how users should look to use them and what trade-offs, if any, users will encounter if they decide to go modular.

Distinctions of modular arrays
Modular arrays differ from monolithic arrays in four distinct ways:

  1. The flexibility to add disks
  2. Network connectivity
  3. How cache is managed on the controllers
  4. Support for different drive technologies

According to Ken Steinhardt, EMC Corp.'s director of technology analysis, "On modular arrays, you can add individual disk or processor enclosures on the fly, while monolithic arrays come with preconfigured slots for disks and processors."

All monolithic arrays offer either ESCON or FICON in addition to Fibre Channel (FC) or SCSI. All modular arrays--with the exception of EMC's DMX 800--only offer FC, SCSI and iSCSI as connectivity options.

Randy Kerns, partner at The Evaluator Group, says monolithic and modular arrays also handle caching differently. Monolithic arrays hearken back to their mainframe roots and keep all read and write cache as a single image; all reads and writes must be mirrored between the caches on the separate controllers before a response is sent back to the server. Modular arrays maintain writes as a single image before sending a response back to the server; their read cache is specific to each controller. Each controller handles reads independently. Responses are potentially faster because there's no requirement to mirror the read cache.
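A toy model can make the caching distinction concrete. The sketch below is illustrative only: the class names, the two-controller layout and the simple count of mirror operations are assumptions made for the example, not a description of any vendor's firmware.

```python
class Controller:
    """One array controller with its own read and write cache."""

    def __init__(self, name):
        self.name = name
        self.read_cache = {}
        self.write_cache = {}


class Array:
    """Two-controller array; mirror_reads=True models the monolithic style."""

    def __init__(self, mirror_reads):
        self.controllers = [Controller("A"), Controller("B")]
        self.mirror_reads = mirror_reads
        self.mirror_ops = 0  # copies made to the partner controller

    def write(self, ctrl_index, block, data):
        # Both styles keep write cache as a single image, so every
        # write is copied to the partner before the server is answered.
        for i, ctrl in enumerate(self.controllers):
            ctrl.write_cache[block] = data
            if i != ctrl_index:
                self.mirror_ops += 1

    def read(self, ctrl_index, block, data_from_disk):
        # The controller that served the read always caches the block.
        self.controllers[ctrl_index].read_cache[block] = data_from_disk
        if self.mirror_reads:
            # Monolithic style: read cache is mirrored as well.
            for i, ctrl in enumerate(self.controllers):
                if i != ctrl_index:
                    ctrl.read_cache[block] = data_from_disk
                    self.mirror_ops += 1


monolithic = Array(mirror_reads=True)
modular = Array(mirror_reads=False)
for array in (monolithic, modular):
    array.write(0, block=7, data="payload")
    array.read(0, block=9, data_from_disk="other")

print(monolithic.mirror_ops, modular.mirror_ops)  # prints: 2 1
```

In this simplified model, the modular design does one fewer cross-controller copy for the same write and read, which is the source of the potentially faster read response Kerns describes.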

The final variation comes in the type of internal disk drives supported by the storage arrays. Monolithic arrays exclusively support either internal FC or SCSI disk drives. Modular arrays support those as well, but some vendors have started to add Serial ATA (SATA) or ATA drives to their stable of drive offerings to handle the growing need to store reference and backup data.

Already, products such as EMC's Clariion and Centera lines, EqualLogic's PeerStorage Array 100E, IBM's FAStT900 and Winchester Systems' AT-1200 support these drives. Hitachi Data Systems (HDS) is scheduled to ship them in the first quarter of 2004 for its 9500 line of modular arrays. While more storage vendors now offer either ATA or SATA drives along with SCSI or FC drives in their line of monolithic arrays, the ability to mix and match them in the same array still only shows up on the drawing boards.

Other than these specific differences, Kerns says, "When looking at functionality, reliability and performance, modular storage arrays are now virtually indistinguishable from monolithic storage arrays. In fact, they are just as reliable, easier to manage and typically outperform the cache centric boxes for the number of controllers they offer."

Users can now confidently deploy modular arrays in production for essentially any type of application previously reserved only for monolithic arrays. Both hardware and software features once only found in monolithic arrays now reside in modular arrays.

On the hardware side, modular arrays come with redundant controllers, increasing amounts of cache, multiple FC ports and high disk capacity. HDS' Thunder 9580V and EMC's DMX 800 both illustrate these sorts of advances. The 9580V provides two controllers, up to 8GB of cache memory, eight front-end FC ports and 64TB of raw capacity. Similarly, EMC's DMX 800 provides up to 32GB of global memory, 16 front-end FC ports and 17.5TB of raw capacity, and it is one of the few storage arrays to offer iSCSI capabilities. (See "Select modular storage arrays")

On the software side, users should look for features such as point-in-time copy, remote copy, LUN masking and easy-to-use management consoles. Both IBM's FAStT900 and FAStT700 storage servers offer Remote Copy, FlashCopy and VolumeCopy as optional features. Network Appliance Inc.'s (NetApp) FAS200 has SnapMirror, SnapRestore and SyncMirror options. You can create point-in-time copies at either local or remote sites for backup and recovery or development testing with copies of production data.

How a monolithic array may price out
$450,000: HDS' 9970V includes hardware features of 3TB of raw storage, FC HDDs, two controllers, eight FC host ports, 8GB cache and RAID 1 and 5. Software features such as LUN masking, array management with remote copy, D-2-D backup and snapshot may cost extra.
$750,000: HDS' 9980V includes hardware features of 6TB of raw storage, FC HDDs, four controllers, 16 FC host ports, 12GB cache and RAID 1 and 5. Software features such as LUN masking, array management with remote copy, D-2-D backup and snapshot may cost extra.
$2,000,000: HDS' 9980V includes hardware features of 18TB of raw storage, FC HDDs, four controllers, 16 FC and 16 ESCON host ports, 32GB cache and RAID 1 and 5. Software features such as LUN masking, array management with remote copy, D-2-D backup and snapshot may cost extra.
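
To compare the three configurations above on a like-for-like basis, it helps to divide each price by its raw capacity. The short sketch below does only that arithmetic, using the figures listed above; the function name and labels are made up for the example, and the result ignores software options that may cost extra.

```python
# (label, hardware price in US dollars, raw capacity in TB)
configs = [
    ("HDS 9970V, 3TB", 450_000, 3),
    ("HDS 9980V, 6TB", 750_000, 6),
    ("HDS 9980V, 18TB", 2_000_000, 18),
]


def price_per_tb(price_usd, raw_tb):
    """Return the hardware price per raw terabyte."""
    return price_usd / raw_tb


for label, price, tb in configs:
    print(f"{label}: ${price_per_tb(price, tb):,.0f} per raw TB")

# HDS 9970V, 3TB: $150,000 per raw TB
# HDS 9980V, 6TB: $125,000 per raw TB
# HDS 9980V, 18TB: $111,111 per raw TB
```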

Modular management
Users familiar with managing monolithic arrays will be pleasantly surprised by the management software for modular arrays. Older monolithic arrays such as HDS' 7700Es or EMC's Symmetrix 8830s required users to gain a fair amount of expertise on the storage array and have an in-depth knowledge of the array's applications. Even then, some configuration changes such as moving LUNs from one port to another or reformatting disk drives on the array required a trained engineer. That engineer was often a vendor employee.

With older monolithic arrays, LUNs have to be carefully chosen and assigned to the application. If you place too many LUNs behind a single controller for a performance-intensive application, the application may experience bottlenecks and degraded performance.
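
The kind of manual balancing this implies can be sketched as a simple greedy placement: sort LUNs by expected load and always hand the next one to the least-busy controller. The LUN names, IOPS figures and controller labels below are hypothetical, and real placement decisions would weigh far more than a single load number.

```python
import heapq


def spread_luns(lun_iops, controllers):
    """Assign each LUN to the controller with the lowest load so far."""
    # Heap entries are (accumulated IOPS, controller name).
    heap = [(0, name) for name in controllers]
    heapq.heapify(heap)
    placement = {name: [] for name in controllers}
    # Place the busiest LUNs first so they end up on different controllers.
    for lun, iops in sorted(lun_iops.items(), key=lambda kv: kv[1], reverse=True):
        load, name = heapq.heappop(heap)
        placement[name].append(lun)
        heapq.heappush(heap, (load + iops, name))
    return placement


# Hypothetical LUNs and their expected I/O rates.
luns = {"db_data": 4000, "db_log": 1500, "mail": 2500, "archive": 500}
placement = spread_luns(luns, ["ctl0", "ctl1"])
print(placement)
# {'ctl0': ['db_data', 'archive'], 'ctl1': ['mail', 'db_log']}
```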

Before the advent of modern modular arrays, to proactively prevent bad performance, storage administrators needed to spend a fair amount of time understanding the application and carefully managing how the data on the LUNs got distributed throughout the storage array. This became more art than science because the administrator had to be an expert in both the application and the layout and management of the array.

Today's modular storage arrays go a long way toward eliminating both problems. The new software removes the need for a vendor engineer any time a LUN needs to be moved from one FC port to another. Rather than writing special scripts or purchasing additional software to enable this task, some current management software makes this a point-and-click operation.

The software allows users to visualize which LUNs should be assigned to which FC ports, and it automates the actual assignment. While that doesn't prevent administrators from taking away LUNs already assigned to applications, it reduces the amount of time and technical expertise required to administer the storage array. For instance, EMC's Navisphere Manager--used to manage Clariion arrays--provides users with a browser-based interface, permits administrators to create RAID groups automatically or manually and then allows LUNs to be created from these RAID groups.

It's also more difficult with monolithic arrays to place and move LUNs on the FC ports. Originally designed to support the mainframe environment, monolithic array LUNs usually get configured during installation and setup, and then are presented on individual storage array ports for use by the mainframe operating system.

This design strategy doesn't work in today's networked storage environments. Different servers with different operating systems may connect to the same storage array. In certain environments, servers with different operating systems access different LUNs on the same storage array FC port. Users of some monolithic storage arrays still need to verify with their storage vendor that a Novell, a Windows and a Unix server may concurrently access different LUNs on the same FC port.

Some modular arrays address this unpleasant reality. The software on their arrays recognizes that even though most operating systems use SCSI to talk to disk drives, each OS has its own nuances in terms of how it talks to the disk. Most vendors now recognize this and have included code in the latest microcode levels on their storage arrays to address most if not all of these concerns. Vendors such as 3PAR, EMC, HDS, Hewlett-Packard Co. (HP), IBM, nStor and others now claim interoperability with all current, major releases of Unix and Windows operating systems for their storage arrays. However, users should still be cautious about presenting LUNs on the same FC port to multiple different operating systems and test that in their own environments.

Application fits
Modular arrays can host essentially any application that sits on a monolithic array, including performance-intensive ones. Take databases, for example. Relational databases produce random queries for data. The ability to retrieve data from the array's disk drives plays a larger role because a random request is unlikely to result in a cache hit. When the request goes to disk, factors such as disk speed, back-end architecture and processor speed come into play more than the front-end cache. So although modular arrays tend to contain less front-end cache, the larger cache in monolithic arrays does little to improve performance for this kind of workload because most requests miss it.
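
A back-of-the-envelope calculation shows why. Average access time is a weighted mix of cache hits and disk reads, so when the hit ratio is low the disk term dominates. The sketch below uses the standard weighted-average formula; the latency figures (0.5 ms for cache, 8 ms for disk) and hit ratios are assumed round numbers for illustration, not measurements from any array.

```python
def average_access_ms(hit_ratio, cache_ms, disk_ms):
    """Weighted average of cache-hit and disk-read latency."""
    return hit_ratio * cache_ms + (1.0 - hit_ratio) * disk_ms


# Assumed latencies: illustrative only.
CACHE_MS = 0.5
DISK_MS = 8.0

# Low hit ratios stand in for a random database workload,
# the high one for a cache-friendly sequential workload.
for hit_ratio in (0.05, 0.10, 0.90):
    avg = average_access_ms(hit_ratio, CACHE_MS, DISK_MS)
    print(f"hit ratio {hit_ratio:.0%}: about {avg:.3f} ms per request")
```

With these assumed figures, doubling the hit ratio from 5% to 10% trims the average by less than half a millisecond, while a workload that hits cache 90% of the time is several times faster. For random workloads, in other words, faster disks and back ends matter more than extra cache.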

Files such as reference data, log files and tar files rarely need high-performance drives, but do need to remain online. Storage administrators should look at arrays such as IBM's FAStT line that support Serial ATA drives.

Users at the small and midsize business level may even want to deploy ATA arrays in their production environments. Says The Evaluator Group's Kerns: "The performance on ATA arrays is a lot less, reliability is arguably less and I do not consider them midrange arrays but secondary. With that said, they absolutely have fits in the enterprise, especially in the small and midrange businesses since performance is typically not an issue."

More risk-averse users should consider putting I/O-intensive and 24x7 production applications on monolithic arrays. Many of the "gotchas" with monolithic arrays are documented, with workarounds and patches readily available. This may or may not be true with the latest modular arrays. To avoid those risks but still take advantage of modular arrays, users should start to deploy them in three ways: first, for important applications that don't require 24x7 availability, such as evening batch jobs; second, for staging backup data to disk prior to moving it to tape; and third, for storing reference data or infrequently accessed files that don't require expensive disk arrays.

Two software features currently offered by two enterprise-class vendors will help to integrate modular arrays into an enterprise storage environment. HDS is the only one of the major storage vendors that uses the same software to manage both its monolithic and modular arrays. This means a user may deploy an HDS 9980 in the production environment and an HDS 9580 in a staging area for backup or offsite as a secondary recovery option. This provides a relatively safe way to introduce modular technology into the environment.

EMC's modular Clariion CX600 array offers a software feature that allows data at the block level to be migrated from any vendor's storage array to the CX600 or pushed from the CX600 to any vendor's storage array. This feature allows real-time migration of data to and from the CX600 without the need to buy and deploy special software such as Veritas' Volume Replicator or Fujitsu Softek's TDMF Open at the server level.

Modular arrays are powerful new tools in the storage arsenal. High-end users should proceed cautiously and test the functionality of these arrays in their environments; small and midsize businesses should feel confident in deploying these arrays. While users need to keep an eye on vendor lock-in, networking complexity and management issues, they should not view these issues as obstacles to rolling out this latest generation of modular arrays.
