We pick the five must-have storage technologies you'll want in your data centers next year.
To be fair, winning products and technologies (those that change the way a storage manager stores, protects and retrieves data) aren't easy to spot once they move from labs and beta versions into production environments. To identify the hot technologies for 2008, the editors of Storage pored through scores of new storage products and technologies before arriving at five that promise to make storing data more efficient or elegantly solve a nagging data center problem. This year's picks include LTO-4, which adds increased capacity, speed and AES-256 encryption to tape systems; N_Port ID Virtualization (NPIV), which allows multiple virtual devices to share a single physical Fibre Channel (FC) port; and deduplication, which can drastically reduce the amount of data to be stored or backed up.
The editors also identified two hot technology areas: ediscovery, which is being driven by the latest Federal Rules of Civil Procedure (FRCP); and green storage, a response to soaring electric bills. We also list five technologies we feel need more time to mature and consequently won't be hot in 2008 (see "Not so hot in 2008"). Finally, we review how the predictions we made last year fared (see "Report card: Last year's predictions"). In the discussion that follows, specific vendors or products are mentioned solely as being representative of the particular category.
LTO-4, the latest iteration of the LTO tape format, was unexpectedly quiet at the time of this writing. Few products have actually started shipping in volume.
"IBM [Corp.] has been shipping products for a few months and HP [Hewlett-Packard Co.] recently started shipping products, but the LTO-4 products have been slow in coming," notes Greg Farris, director of marketing at CipherMax Inc., which provides an appliance to assist with the migration of LTO-3 to LTO-4 and, as of late 2007, was still waiting for the LTO-4 market to ramp up.
But given the frequency with which data tapes seem to fall off delivery trucks, and the proliferation of laws like California's SB 1386, which requires public disclosure when unencrypted private data is potentially exposed, companies have little choice but to adopt tape encryption. LTO-4, which provides built-in AES encryption, is the most likely candidate for the job, although key management is still a work in progress.
"Encryption is the only reason to go with LTO-4," says W. Curtis Preston, VP of data protection at GlassHouse Technologies Inc., a consulting firm in Framingham, MA. LTO-4 also doubles capacity to 800GB and boosts the data transfer rate to 120MB/sec, but those are secondary reasons compared to the encryption imperative.
Still, it was speed and capacity that led Patillo Construction, Stone Mountain, GA, to urgently seek a new LTO-4 device from HP. "We have a very tight overnight backup window and were running out of time," says Buzz Kaas, director of information technology at the design engineering and construction company. With HP's StorageWorks LTO-4 Ultrium 1840 Tape Drive, Patillo Construction can now back up 500GB of data in two hours, which fits into its backup window.
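The backup-window figures above can be checked with a quick back-of-the-envelope calculation. The sketch below assumes decimal units (1GB = 1,000MB) and uses a hypothetical helper name; it only restates the numbers quoted in the article and isn't a measurement of the HP drive.

```python
# Rough check of the backup window described above.
# Assumption: decimal units, 1GB = 1000MB.

def required_throughput_mb_s(data_gb: float, window_hours: float) -> float:
    """Average MB/sec needed to move data_gb within window_hours."""
    return (data_gb * 1000) / (window_hours * 3600)

LTO4_RATE_MB_S = 120  # LTO-4 data transfer rate quoted in the article

needed = required_throughput_mb_s(500, 2)
print(f"Average rate needed: {needed:.1f} MB/sec")
print(f"Share of LTO-4 rate: {needed / LTO4_RATE_MB_S:.0%}")
```

On these numbers, 500GB in two hours works out to an average of roughly 69MB/sec, comfortably under the 120MB/sec the format is rated for.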
Ironically, encryption wasn't a consideration for Patillo Construction. "Now that we've had the drive for a while, we're starting to think about encryption, but it is not a priority," adds Kaas.
LTO-4 encryption may not be a slam dunk, at least not immediately. "It comes down to three things: key management, key management and key management," says Preston. Security best practices require each individual LTO-4 tape to be encrypted with a different key. Over time, there will be thousands (even tens of thousands) of keys. A lost key means data is gone forever. Therefore, each key has to be protected, yet still be available under all circumstances. That means secure key replication, redundancy and backup on a very large scale. Complicating the situation is that each LTO tape vendor will have its own key management scheme and you can be sure, at least initially, they won't work together.
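To make the key management burden concrete, here is a minimal sketch of the bookkeeping involved: one random 256-bit key per tape, copied to more than one place so that a single lost copy doesn't make a tape unreadable. The class name, method names and tape barcodes are hypothetical, and the in-memory dictionaries stand in for what would have to be separate, hardened key servers. This isn't any vendor's scheme.

```python
import secrets


class TapeKeyStore:
    """Toy key store: one random 256-bit key per tape, mirrored to replicas."""

    def __init__(self, replica_count: int = 2):
        # Assumption: each replica is a plain dict; a real deployment would
        # use separate, protected key servers in different locations.
        self._replicas = [dict() for _ in range(replica_count)]

    def key_for_new_tape(self, barcode: str) -> bytes:
        """Generate a unique key for a tape and copy it to every replica."""
        if any(barcode in replica for replica in self._replicas):
            raise ValueError(f"tape {barcode} already has a key")
        key = secrets.token_bytes(32)  # 32 bytes = 256 bits
        for replica in self._replicas:
            replica[barcode] = key
        return key

    def lookup(self, barcode: str) -> bytes:
        """Return the tape's key from the first replica that still has it."""
        for replica in self._replicas:
            if barcode in replica:
                return replica[barcode]
        raise KeyError(f"no key found for tape {barcode}")

    def simulate_replica_loss(self, index: int) -> None:
        """Wipe one replica to mimic losing a key server."""
        self._replicas[index].clear()


store = TapeKeyStore(replica_count=2)
key_a = store.key_for_new_tape("TAPE0001")
key_b = store.key_for_new_tape("TAPE0002")
store.simulate_replica_loss(0)
print(store.lookup("TAPE0001") == key_a)
```

Even in this toy form, the store has to grow by one entry per tape and stay available after a failure, which is the scaling problem Preston describes.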
However, work has started on standards for key management and interoperability. Specifically, the Trusted Computing Group's (TCG) Key Management Services Subgroup (KMSS) has been working for a year on an Enterprise Key Management Infrastructure specification designated T10. Other groups are pursuing related work. The results should begin appearing in 2008 or 2009.
NPIV tackles the problem of how multiple virtual servers get access to the FC SAN. Usually the SAN wants one port ID for each server. It uses the port ID as the basis for masking and zoning. However, "virtual servers share the physical HBA and get a virtual port ID," explains Clodoaldo Barrera, distinguished engineer and chief technical strategist for IBM's System Storage Group. NPIV defines how multiple virtual servers can share a single physical port while each keeps its own port ID.
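The port-sharing idea can be modeled in a few lines. This is a toy illustration only: the class, the function, the WWPN strings and the sequential port IDs are all made up for the example, and in a real fabric the switch assigns port IDs when each virtual port logs in.

```python
from dataclasses import dataclass, field


@dataclass
class PhysicalPort:
    """One physical HBA port hosting several virtual ports (toy model)."""
    wwpn: str
    virtual_ports: dict = field(default_factory=dict)  # virtual WWPN -> port ID
    _next_id: int = 1  # hypothetical counter; real IDs come from the fabric

    def login_virtual(self, virtual_wwpn: str) -> int:
        """Give a virtual server its own port ID on the shared physical port."""
        port_id = self._next_id
        self._next_id += 1
        self.virtual_ports[virtual_wwpn] = port_id
        return port_id


def can_access(zone: set, virtual_wwpn: str) -> bool:
    """Zoning check: a virtual port reaches storage only if it is in the zone."""
    return virtual_wwpn in zone


hba = PhysicalPort(wwpn="physical-hba-port")
id_a = hba.login_virtual("vm-a-wwpn")
id_b = hba.login_virtual("vm-b-wwpn")

finance_zone = {"vm-a-wwpn"}  # only the first virtual server is zoned in
print(id_a != id_b)
print(can_access(finance_zone, "vm-a-wwpn"), can_access(finance_zone, "vm-b-wwpn"))
```

Because each virtual server has its own identity, zoning and masking can treat the two differently even though both sit behind the same physical port.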
For storage managers, NPIV will be almost transparent. "To get the benefit of NPIV, it must be supported in the HBA and the switch," says Scott McIntyre, VP of software and customer marketing at Emulex Corp. HBA and switch vendors have started building NPIV into their products. Companies will get NPIV as they upgrade their HBAs and switches.
NPIV also needs to happen in the virtualization software. Microsoft Corp.'s Virtual Server began supporting NPIV in 2007. VMware is expected to support NPIV in 2008. IBM supports a version of NPIV for its System z and is working it into its blade servers, which presents a somewhat more complicated technical challenge. But again, this is something vendors, not corporate storage managers, are wrestling with.
This isn't to say storage managers can ignore NPIV. To take advantage of NPIV, they'll have to check the latest hardware, firmware and server operating system versions for NPIV support. Once NPIV is deployed, storage administrators will need to be aware of it because it may affect how they set up the fabric, handle zoning and masking, and manage quality of service.
"Deduplication is the single biggest technology to come along in years," declares GlassHouse Technologies' Preston. Deduplication identifies and removes multiple copies of data that eat up storage capacity and lengthen backup windows.
By reducing the amount of data to store and move around, deduplication frees up capacity and bandwidth, and slows the rate at which a company needs to add storage. Less storage means lower costs. It also means reduced energy consumption.
Integral Capital Partners, a private investment firm in Menlo Park, CA, turned to deduplication to speed its overnight backups. The company operates facilities in Menlo Park and Baltimore, stores data locally and replicates data between locations nightly. "The process took eight hours," recalls Jason Paige, information system manager.
Paige began looking for faster options early in 2007. The few he found, however, "wanted disk to emulate tape," he recalls. "I didn't want that." Finally, Paige found Avamar, which has been acquired by EMC Corp. Avamar splits files into segments, applies deduplication and sends only the data that has changed, he explains. Integral Capital Partners replicates 300GB to 500GB nightly over a T1 line. The Avamar devices themselves can store approximately 2TB of data. "With Avamar deduplication, we were able to cut our backup time to about 45 minutes," says Paige.
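The approach Paige describes (split files into segments, deduplicate, send only what has changed) can be sketched as follows. This is a simplified illustration, not Avamar's algorithm: it assumes fixed-size 4KB segments and SHA-256 fingerprints, and the function names and the dictionary standing in for the remote store are hypothetical.

```python
import hashlib

SEGMENT_SIZE = 4096  # bytes; fixed-size segments are an assumption of this sketch


def split_segments(data: bytes, size: int = SEGMENT_SIZE) -> list:
    """Cut the data into consecutive fixed-size segments."""
    return [data[i:i + size] for i in range(0, len(data), size)]


def backup(data: bytes, remote_store: dict):
    """Send only segments the remote side lacks; return (recipe, bytes_sent)."""
    recipe = []       # ordered fingerprints needed to rebuild the data
    bytes_sent = 0
    for segment in split_segments(data):
        digest = hashlib.sha256(segment).hexdigest()
        if digest not in remote_store:
            remote_store[digest] = segment  # new segment crosses the wire
            bytes_sent += len(segment)
        recipe.append(digest)
    return recipe, bytes_sent


def restore(recipe: list, remote_store: dict) -> bytes:
    """Rebuild the original data from its list of fingerprints."""
    return b"".join(remote_store[digest] for digest in recipe)


remote = {}
night1 = b"".join(bytes([i]) * SEGMENT_SIZE for i in range(10))
night2 = night1[:SEGMENT_SIZE] + b"\xff" * SEGMENT_SIZE + night1[2 * SEGMENT_SIZE:]

_, sent1 = backup(night1, remote)
recipe2, sent2 = backup(night2, remote)
print(sent1, sent2)
print(restore(recipe2, remote) == night2)
```

In this example the first night sends all ten segments, while the second night sends only the one segment that changed, which is why a slow link such as a T1 can still carry the nightly replication.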
"Deduplication already is game-changing technology," says Arun Taneja, founder and consulting analyst at Taneja Group, Hopkinton, MA. "It has gotten to the point where every VTL [virtual tape library] includes deduplication."
Despite enthusiasm for deduplication, there are some potential pitfalls. "It can slow recovery," says Greg Schulz, founder and senior analyst at StorageIO Group, Stillwater, MN. "Also, the data reductions are very dependent on your particular data."
If you haven't encountered the acronym ESI (electronically stored information) yet, you will--especially if your company is slapped with litigation. ESI has been elevated to the same level of importance as traditional paper documents in litigation, according to the latest FRCP. Most state rules will soon conform to the new FRCP.
Ediscovery is the process of sifting through ESI in search of smoking guns. As soon as litigation has been filed or even anticipated, organizations must preserve, protect and make accessible all data associated with the litigation. "The changes in FRCP are driving interest in ediscovery," says George Socha Jr., founder of Socha Consulting LLC, St. Paul, MN. Beyond search, ediscovery tools and services also address related processes like legal hold management, which is central to preserving ESI for litigation; retention management; case management; workflow; and document management, which are all part of the litigation process.
Socha publishes an annual survey on ediscovery and tracks the vendors in the market. "At last count, there were at least 600 organizations offering ediscovery services or products," says Socha. He tallied more than 800 recently, but a number of them disappear each year.
Not surprisingly, Socha's list of top ediscovery vendors has a legal industry orientation and includes companies like Guidance Software Inc., Kroll Ontrack Inc. and Zantaz Inc. (recently acquired by Autonomy Corp.). Other ediscovery tool vendors, such as Index Engines Inc. and Kazeon Systems Inc., may be more familiar to IT folks, but don't even appear among Socha's Top 20.
Onsite3, an Arlington, VA-based ediscovery service provider ranked 18th on Socha's list, uses the Index Engines tool to find data buried on backup tapes. "Index Engines lets us index a full tape and run a search without ever having to restore the tape. We only restore the actual files we want," says Jeff Fehrman, president of Onsite3's Electronic Evidence Labs (EE Labs). Compared to restoring entire tapes, Index Engines, which EE Labs deploys as an appliance, "cuts out a huge amount of data and time. We're talking about one-tenth the amount of data," says Fehrman. The savings in time and space are huge. "We have pharmaceutical and financial industry clients with hundreds of thousands of tapes to search," he adds.
TiVo Inc. in Alviso, CA, turned to Kazeon to bolster its legal defenses in the event of an ediscovery request. "We saw that ediscovery could be costly if not well managed," says Karen Kramer, TiVo's director of legal affairs. "Prior to deploying Kazeon, we did things on a piecemeal basis, depending on the request."
The company licensed Kazeon, which recently announced a partnership with PSS Systems' Atlas, a leading suite of legal hold and retention management tools. TiVo runs Kazeon on its own server and uses it to handle ediscovery across the company's Unix, Windows, Mac and Linux systems. "It takes multiple sources of information and consolidates them into one search," explains Kramer. And the cost is reasonable. "Others will do it as a service and charge a lot more money," she adds.
Green storage
"Green is not a fad ... energy costs will exceed the purchase cost of a server in three years," says Bruce Taylor, chief strategist for energy efficiency and productivity initiatives at the Uptime Institute Inc., Santa Fe, NM. "Storage historically has been less of a problem, but as storage demand escalates, energy will become more of a problem there, too."
The storage industry has barely begun to think through the energy implications of storage or even ask some obvious questions. "Why keep platters spinning for data that is rarely if ever used? Why not shove it onto tape?" asks Taylor. To be green, IT must rethink its approach to data storage (see "Ways to reduce storage energy consumption," below).
When managers of the Epilepsy Phenome/Genome Project (EPGP) at the University of California in San Francisco began planning a data center consolidation, they didn't have storage energy savings specifically in mind. Still, they saw opportunities.
"We had multiple sites. Our average utilization of DAS was 40%, domain controllers were wasting storage, analysis servers were wasting storage," says Michael Williams, CIO at the project.
The organization embarked on consolidation, virtualization and thin provisioning built around Network Appliance Inc. storage. In the process, it was able to consolidate 10 racks of servers into two racks. Using two deduplicated storage pools, it reduced each of its 150TB pools of provisioned storage to 25TB of physical spinning disks. "Being environmentally friendly is the result of doing efficient systems," says Williams. "We were able to reduce our environmental footprint by a factor of five."
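The consolidation figures above can be turned into reduction ratios with simple division. The helper name below is hypothetical, and the sketch uses only the numbers quoted in the article.

```python
def reduction_factor(before: float, after: float) -> float:
    """How many times smaller the 'after' figure is than the 'before' figure."""
    return before / after


storage_factor = reduction_factor(150, 25)  # TB provisioned vs. TB on physical disk
rack_factor = reduction_factor(10, 2)       # server racks before vs. after

print(f"Storage per pool: {storage_factor:.0f}:1")
print(f"Server racks:     {rack_factor:.0f}:1")
```

On these numbers, each storage pool shrank about six to one and the racks five to one, so Williams' "factor of five" lines up with the rack count.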
Even if you don't deploy any of these hot technologies this year, you can't avoid them. Green computing, litigation and data privacy are issues that won't go away. Deduplication, NPIV, LTO-4 and ediscovery address these issues. Maybe you won't get hit with a lawsuit, experience a lost backup tape or see energy bills go off the chart in 2008, but eventually you will. And it's comforting to know that there are technologies to help you deal with these issues.