Storage vendors are building block-based tiered storage into their products, but some users are asking why the combination of block-level granularity and file-level automation isn't a part of large, high-end, heterogeneous tiered storage systems.
"Tiering data for performance management is a struggle when it crosses the bounds of a controller," says Jeff Boles, IT manager for the City of Mesa, AZ. "Block-level data in and of itself is never going to be completely indicative of access patterns because that's not the way we work."
Vendors say products simply don't exist today that could keep a single block within a larger file on primary storage while sending the rest of the file, block by block, to a separate file system on a separate physical tier, bringing the blocks back together when different parts of the file are accessed again.
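As a rough illustration of the behavior vendors say is missing, the following Python sketch models a file whose blocks are placed on tiers individually. It is a toy, in-memory model and not any vendor's implementation: the names `TieredFile`, `Block`, `run_tiering_pass`, the 4 KB block size, the two tier labels and the read-count threshold are all assumptions made for the example.

```python
from dataclasses import dataclass, field

BLOCK_SIZE = 4096  # assumed block size in bytes (hypothetical)


@dataclass
class Block:
    index: int              # position of the block within its file
    tier: str = "primary"   # every block starts on primary storage
    reads: int = 0          # reads seen since the last tiering pass


@dataclass
class TieredFile:
    name: str
    size: int
    blocks: list = field(default_factory=list)

    def __post_init__(self):
        count = -(-self.size // BLOCK_SIZE)  # ceiling division
        self.blocks = [Block(i) for i in range(count)]

    def read(self, offset, length):
        """Record a read and bring every touched block back to primary."""
        if length <= 0:
            return
        first = offset // BLOCK_SIZE
        last = (offset + length - 1) // BLOCK_SIZE
        for block in self.blocks[first:last + 1]:
            block.reads += 1
            block.tier = "primary"

    def run_tiering_pass(self, min_reads=1):
        """Demote blocks read fewer than min_reads times, then reset counts."""
        for block in self.blocks:
            if block.reads < min_reads:
                block.tier = "secondary"
            block.reads = 0

    def placement(self):
        """Return the tier of each block, in file order."""
        return [block.tier for block in self.blocks]
```

In this sketch, a five-block file that receives a single read inside its third block keeps only that block on primary after `run_tiering_pass()`; a later read spanning the first two blocks promotes them again. The point is the bookkeeping: placement is decided per block, but the mapping is held per file, which is the file-level coherency the block-only products described here lack.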
Midrange arrays such as Compellent Technologies' Storage Center SAN and EqualLogic's PS Series have begun baking in block-level tiered storage functions, and nearly every high-end storage vendor offers a product that will automatically migrate data at the file level. Virtualization products such as Hitachi Data Systems' TagmaStore Network Storage Controller (NSC55), IBM's SAN Volume Controller and Incipient's Network Storage Platform can migrate data at the block level with some automation, although they can't yet manipulate file-level context for data and still require a good deal of hands-on administration when it comes to data management.
But Boles believes in the possibility of adding file-level coherency to block-level, tiered performance management, possibly through network-based virtualization.
"Looking at the application of this technology outside of [what we do with it today] gets pretty scary," admits Boles. "But when we get caught up looking at immediate easy applications, we're missing incredible long-term strategic value, and that's the ability to shape our storage in a meaningful way."
Some vendors remain uncertain about melding file-level metadata with block-level granularity. The inability to provide context for data at the block level, says Jon Affeld, director of product marketing at BlueArc, "is part of the fundamental difference between block and file storage. A block is only ones and zeroes in a container, and a file system can't see anything inside that container. That will never change."
But others are more hopeful that such products might be available this decade. The building blocks, they say, could emerge in the next 12 to 18 months.
"I am aware that there are investigations and research [at Hitachi] into this issue, and I have been asked by customers about it," says Steve Smith, product marketing manager, enterprise storage at Hitachi. Smith hinted that a collaboration with Sun Microsystems, which markets file-level migration in its StorageTek SAM-QFS software, could be integrated with Hitachi's NSC55.
Brad O'Neill, senior analyst and consultant at the Taneja Group in Hopkinton, MA, says he believes it's possible, "but only in a completely virtualized IT stack" controlled by intelligence above the array, a prospect that calls attention to the behind-the-scenes tug-of-war between storage and network vendors.
"Viewed purely from the storage systems level, many vendors would be hesitant to enable this kind of broad-based heterogeneous virtualization," says O'Neill. "It amounts to saying, 'Please help me lose account control.'"
But, he adds, "in a [system] where both the block-level storage and the file-system layers have been federated for shared data access ... any file system can provide data access into any block of storage on a networked storage infrastructure. This is possible, feasible and done regularly today in many clustered database environments and clustered NAS environments."
McData, along with its anticipated new owner Brocade, is working to develop what McData executive VP and COO Todd Oseth calls "network-commoditized storage," in which all data management services are performed by intelligent network nodes.
"We're talking about another 100 bytes attached to the block at the point of creation, something that could be done within an intelligent network as part of, say, a SCSI command expression," says Oseth. "I think we'll have the core platform technology in 2007--intelligent I/O processing." Shortly after that, he says, data classification and management products that operate on media servers attached to storage today will make their way to the network, forming the foundation for the fully virtualized IT environment described by O'Neill.
"I believe we will get there," says Boles with the City of Mesa, AZ. "Maybe not for another year or two, but I believe it's on the horizon."