One of the many decisions enterprise IT shops face if they want to develop a private cloud architecture for data storage is whether they should stick with their file- or block-based systems or invest in an object-based model that can scale to handle petabytes or perhaps even exabytes of data.
Using object storage generally translates into a focus on archival storage. Following the network-attached storage (NAS) or storage-area network (SAN) route can pave the way for primary storage of active data, typically through incremental steps that lead to a more virtualized, automated and policy-driven infrastructure.
That decision-making process could be frustrating for IT shops mainly because there's no consensus on exactly what a private storage cloud is and what it can do for organizations.
One camp comprises those who urge IT organizations to think like Amazon.com or Google when it comes to the delivery of storage. They advocate a self-service model, in which IT constructs a service catalog and guarantees levels of service.
The storage also needs to serve multiple tenants through a shared infrastructure and provide a chargeback option, or at least the ability to show each department how much storage it’s using. Additional attributes include massive scalability, location independence and an elastic quality to expand, and possibly contract, to accommodate geographically dispersed users if necessary.
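As a rough illustration of the chargeback-versus-usage-reporting distinction, the sketch below totals stored bytes per department and optionally applies a price. The record layout, the function names and the per-gigabyte rate are all hypothetical and do not come from any product mentioned in this article.

```python
from collections import defaultdict


def usage_by_tenant(records):
    """Sum stored bytes per tenant from (tenant, size_bytes) records."""
    totals = defaultdict(int)
    for tenant, size_bytes in records:
        totals[tenant] += size_bytes
    return dict(totals)


def showback_report(records, rate_per_gb=0.0):
    """Return {tenant: (gigabytes, charge)}.

    With the default rate of 0 the report only shows usage;
    a non-zero (hypothetical) rate turns it into a simple chargeback.
    """
    report = {}
    for tenant, total_bytes in usage_by_tenant(records).items():
        gigabytes = total_bytes / 10**9  # decimal gigabytes
        report[tenant] = (gigabytes, gigabytes * rate_per_gb)
    return report
```

Calling showback_report with the default rate would give a department such as Hall's the usage tracking he describes without any billing attached.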
“You don’t just go out to some vendor and buy it,” said Gene Ruth, a research director at Stamford, Conn.-based Gartner Inc., commenting on private storage clouds. “You’ve got to change your whole organization. You’ve got to change your whole line of thinking.”
Virtualization, automation play roles in storage clouds
But others have a looser interpretation of private cloud storage. They might ascribe some, but not all, of the commonly cited attributes to it. The term “cloud washing” has come into vogue to describe the vendor practice of slapping a cloud tag onto products they’ve sold for years without making any changes to the products themselves.
Amid the confusion, many have come to accept a limited view of private cloud storage and its ultimate potential to lower costs and ease the delivery of storage.
“When I think of cloud storage, or even cloud of any kind, I think from the virtual aspect,” said Robbie Hall, chief information officer (CIO) at Northern Hospital of Surry County in Mount Airy, N.C. “I don’t care, when I provision storage, which tray of drives it’s on. For me, not knowing or even caring where the storage resides defines the cloud aspect of it.”
Northern Hospital’s “private cloud” infrastructure consists of virtualized servers from VMware Inc. coupled with an EMC Corp. Clariion SAN and VPlex Metro appliance, which enables data access and movement between different geographic locations. Built-in SAN automation directs the data to high-speed Fibre Channel (FC) drives or high-capacity SATA drives, Hall noted.
But are Northern Hospital’s automation and virtualized infrastructure enough to constitute a private storage cloud? Or do they simply represent incremental steps? Hall has no interest in the self-service provisioning of storage or in charging departments for the storage they use, although he said he might like the ability to track how much each department uses.
“Cloud is in the eyes of the beholder for the most part,” noted Marc Staimer, president of Dragon Slayer Consulting in Beaverton, Ore.
Object storage’s role in the cloud
What is clear is that the Northern Hospital example is a far cry from the grand vision shared by those who equate internal cloud storage with a privately controlled variant of Amazon’s Simple Storage Service (S3) or prominent public cloud offerings from the likes of Google and Rackspace Hosting Inc.
At the heart of their cloud services are homegrown object stores that use clusters of commodity servers to scale to store enormous quantities of data tagged with unique identifiers or files with metadata. But the question of whether mere mortals can build such systems, or whether the object approach is the only way to achieve the goals of private cloud storage, remains open.
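To make the object model concrete, here is a minimal in-memory sketch of a store that tags each piece of data with a unique identifier and keeps metadata alongside it. The class and method names are illustrative assumptions, not the interface of Amazon S3, OpenStack or any other product; a real object store would spread this data across a cluster of servers.

```python
import hashlib
import uuid


class ObjectStore:
    """Toy flat-namespace object store: whole objects in, whole objects out."""

    def __init__(self):
        self._objects = {}  # object_id -> (data, metadata)

    def put(self, data: bytes, metadata=None) -> str:
        """Store an object and return its generated unique identifier."""
        object_id = uuid.uuid4().hex
        meta = dict(metadata or {})
        # System-generated metadata travels with the object.
        meta["size"] = len(data)
        meta["sha256"] = hashlib.sha256(data).hexdigest()
        self._objects[object_id] = (data, meta)
        return object_id

    def get(self, object_id: str):
        """Return (data, metadata) for an identifier; raises KeyError if absent."""
        return self._objects[object_id]
```

There are no directories, no partial in-place updates and no file locks in this sketch: a caller either stores a whole object or retrieves one by its identifier.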
Last year, Rackspace gave potential users an intriguing option when it kick-started an open-source project called OpenStack, with the release of code that powers its public cloud storage. An OpenStack conference in October drew more than 600 attendees to Boston, and the list of speakers using or piloting the technology included CERN, Disney, Fidelity, Sony Computer Entertainment America and Latin American online trading firm MercadoLibre.
“With an object-based system, the storage system has more control over writes,” said Jonathan Bryce, co-founder of Rackspace Cloud and chairman of the OpenStack project policy board. “Whether it’s a network file system like NFS or CIFS, there has to be a locking concept involved, meaning only one person can write at a time to any particular piece of data.
“In order to enforce that lock,” he continued, “the system has to know about all of the traffic going on everywhere, on all of the files that are stored in it. That becomes a scalability problem.”
Bryce said NFS-based storage, for instance, can handle a large volume of traffic but there’s a limit to the number of concurrent sessions. The systems can also become expensive to build out from a hardware standpoint in a high-scale, multi-tenant way, he added.
“When you go to an object-based storage system, you restrict some of the file-system-type functionality, but it enables you to scale much, much larger,” Bryce said. “Also, because the system is now in control of every write, it allows you to let the system make redundant copies of it from a software perspective as the requests are coming in, instead of trying to do it on the back end with RAID arrays or SAN LUNs.”
He added: “That’s why Amazon built theirs the way they did -- and why we built ours the way we have. It’s really to avoid the cost and the requirements that come when you have to introduce locking and a central controller.”
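Bryce's point about software-level redundancy can be sketched as follows. This is a simplified illustration under stated assumptions, not a description of how Rackspace, OpenStack or Amazon actually place data: the node names, the default replica count of three and the hash-ranking placement rule are all hypothetical. Each write is copied to several nodes as the request arrives, and any client can compute where an object lives without consulting a central lock manager.

```python
import hashlib


class ReplicatingStore:
    """Toy object store that writes every object to several nodes."""

    def __init__(self, node_names, replicas=3):
        if replicas > len(node_names):
            raise ValueError("need at least as many nodes as replicas")
        # Each "node" is just a dict here; in practice it would be a server.
        self.nodes = {name: {} for name in node_names}
        self.replicas = replicas

    def _placement(self, object_id):
        """Rank nodes by a hash of (object_id, node) and take the first few."""
        ranked = sorted(
            self.nodes,
            key=lambda name: hashlib.sha256(
                f"{object_id}:{name}".encode()
            ).hexdigest(),
        )
        return ranked[: self.replicas]

    def put(self, object_id, data):
        """Write redundant copies as the request comes in; return target nodes."""
        targets = self._placement(object_id)
        for name in targets:
            self.nodes[name][object_id] = data
        return targets

    def get(self, object_id):
        """Read from the first replica that still holds the object."""
        for name in self._placement(object_id):
            if object_id in self.nodes[name]:
                return self.nodes[name][object_id]
        raise KeyError(object_id)
```

Because placement is derived from the object identifier alone, losing one node in this sketch still leaves the remaining copies readable, which is the role RAID arrays or SAN LUNs would otherwise play on the back end.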
Is object storage required for clouds?
Some analysts and consultants view object stores as the optimal, if not only, choice to meet the definition of private cloud storage -- especially for static data such as virtual machine (VM) images, video, photos, and data archives or backups.
“Every other form of storage has hard limits on how they scale, and most of them don’t even get into more than a handful of petabytes in the largest configuration,” Dragon Slayer Consulting's Staimer said.
Staimer takes an all-or-nothing view of private cloud storage. He believes a product either meets all the many attributes of his definition or it doesn’t qualify. So, in his estimation, object storage is the only technology that currently works.
“For the most part, the cloud is aimed at archival [storage],” Staimer said. “It’s meant to be a data repository. It’s not meant to be everyday storage.”
Howard Marks, chief scientist at DeepStorage.net, said a private storage cloud can’t handle the sort of number-crunching online transaction processing (OLTP) and database applications that people commonly associate with primary storage.
“You can’t have data in multiple locations and sufficiently low latency to satisfy primary storage applications [such as OLTP and databases],” he said.
However, Boston-based consultancy Cloud Technology Partners Inc. (cloudTP) has used NetApp Inc. SAN, Nexenta Systems Inc.’s NexentaStor and OpenStack’s Object Storage (code-named Swift) products for what it considers primary storage in private cloud projects, according to Beth Cohen, a senior cloud architect at cloudTP.
Cohen said each project also involved a compute cloud, which she views as an essential element to achieve the level of efficiency the cloud represents. Her current focus is figuring out the appropriate infrastructure for private cloud storage.
In Cohen’s estimation, the infrastructure for private cloud storage needs to include some type of object or virtualized SAN storage, network technology, a hypervisor on commodity servers and an orchestration layer. That orchestration layer allows users to manage the self-service components, the VM images and server instances on the cloud.
However, storage for cloud computing isn’t the same as a storage cloud, which serves purely as an alternative for on-premises storage.
“I view private cloud as a way of thinking about the company’s infrastructure,” Cohen said. “It’s more a set of principles than the nuts and bolts of it, because the nuts and bolts of the cloud have been around. They’re just put together in a different way.”
Storage clouds still waiting for nirvana
Greg Schulz, author of Cloud and Virtual Data Storage Networking and founder of the StorageIO Group in Stillwater, Minn., pointed out that users have the option of bundled packages of servers, storage, network and software, such as EMC’s Vblock and NetApp’s FlexPod for storage clouds. Or users can put together the components themselves or leverage existing resources, he said.
“It comes down to how much time you have to integrate, what level of interoperability you need with your current environment, if you're starting with a clean sheet of paper for a new site, as well as application and business needs,” Schulz said via email.
Other analysts say not all the building blocks of a storage cloud are available yet. Gartner’s Ruth said many of the technology pieces are currently missing, especially at the upper end of the software stack, to facilitate the sort of automation, chargeback and hands-off operation that private clouds require.
Arun Taneja, founder and consulting analyst at Taneja Group in Hopkinton, Mass., draws a distinction between private clouds built for archives with object-based products (such as EMC Corp.’s Atmos or NetApp’s StorageGrid) and private clouds used for primary storage. He believes, at the moment, the archival variety has more of the technology pieces in place for private cloud storage.
“We’re heading toward it, but in my view, nobody has a private cloud for primary storage,” Taneja said. “I’m not making a value judgment that one has to complete every aspect of a private cloud in order to get many of the benefits. But as you get closer and closer to the true definition of a private cloud, then you get closer and closer to nirvana.”