LAS VEGAS -- When planning your next data center, EMC Corp. wants you to think along the lines of what Google, Amazon and Facebook have done with theirs. That was a key message from the vendor at EMC World 2013, where EMC offered its ViPR software-defined storage platform as the first major step in the direction of a Google storage model.
"You can build a Web-scale data center without hiring 1,000 Ph.D.s or rocket scientists," said Jeremy Burton, EMC's executive vice president of product operations and marketing. "The Googles, Facebooks and Amazons have built a data center for one -- just for their environment. We're building a data center for everyone."
"We built it for people who want to run their data center like Google, but [who] don't want to write and maintain their own custom environment," EMC president Dave Goulden said of ViPR.
Building Google-style storage requires Hadoop or similar frameworks like those Google, Facebook and Amazon use. EMC promises ViPR will manage data as a pool across any type of storage and handle Hadoop workloads. It has also given its Isilon clustered network-attached storage platform tight support for the Hadoop Distributed File System.
But EMC's message about a new data center goes beyond storage. EMC-owned VMware and Pivotal featured prominently at EMC World this year, trumpeting their roles in the software-defined-everything world.
EMC CEO Joe Tucci spoke of a transition from a PC and server-centric data center to one of mobile devices, the cloud, big data and social networks. He said the new data center will require software-defined compute, networking and storage. "The new buzzwords are abstract, pool and automate," he said. "This is what the software-defined data center is about at the highest level."
Pivotal CEO Paul Maritz also used the Google storage model in his example of what the new data center should look like. "The consumer Internet giants have much larger information stores," he said. "When Google set out to index all the information on the Internet, it had to innovate and build a new architecture." And the key to that architecture -- the Google File System -- is a form of object storage, he said.
Hadoop avoids vendor lock-in, but at what price?
Juergen Urbanski, vice president of big data and cloud architectures for T-Systems, the $13 billion IT services division of Deutsche Telekom, is building that type of data center. Urbanski said his data center has more than 20 PB of data and "right now we see a very dramatic disruption happening in storage, primarily driven by Hadoop."
He said the difference between what he is doing and what the Internet giants do is that his workloads are far more diverse because his clients come from a wide range of industries.
Urbanski said T-Systems is a large VMware customer, but doesn't use much EMC storage. He said the telecom has traditionally been a NetApp shop, although it has "storage from all kinds of vendors." However, he does use Isilon to run some of his customers' workloads because of its Hadoop capabilities.
"Our firm belief is that by 2015 … 80% of new data that comes in will land first in Hadoop," he said. "What we don't know is what the physical infrastructure underneath will be -- is it DAS, is it Isilon? We don't know. But at the data management layer and the distributed file system, it's Hadoop."
One thing he is trying to determine now is whether enterprise storage adds enough value to Hadoop clusters to make it worthwhile to run on more expensive storage rather than commodity white boxes.
For example, Isilon brings better data protection and performance while adding multi-tenancy that is missing with Hadoop on commodity storage.
"Multi-tenancy is important for us," he said. "Does Daimler want to have its data on the same cluster with BMW? Probably not. But they're both customers. So multi-tenancy is a core design point for us to offer. Today, Hadoop by itself does not support that. So what's the right tradeoff between open source -- commodity if you will -- and something that arguably has higher Capex and vendor lock-in, but brings other benefits?"
He said those tradeoffs will also be an issue with ViPR, which EMC claims will fully support commodity and third-party storage as well as its own arrays.
"With ViPR we can have one control plane and that helps us manage storage across pools," Urbanski said. "That part I buy. The $64,000 question is, 'How much of that pool will be white box stuff?' For EMC, that boils down to a fairly big business model issue. If you're a sales guy with a $20 million quota, how do you make up your quota based on what's probably a few hundred thousand dollar deals just for software? That's a big disruption in the industry, and that's the thing staring people in the face."