Cloud storage newcomer Datera Inc., which claims it shipped 2 PB of raw capacity in its first revenue quarter, recruited a former Pure Storage Inc. executive to help grow operations.
Gurpreet Singh, a 20-year industry veteran, in May became Datera's president. Singh previously spent almost four years at all-flash array vendor Pure as vice president of product management.
"Where I see the market headed is easy-to-consume, grow-as-you-go, automated infrastructure that plugs seamlessly into the new-age orchestration models -- OpenStack and containers and the DevOps model," Singh said. "And Datera's product is aimed squarely at that."
Datera emerged from stealth in April. The Sunnyvale, Calif., startup raised $40 million from Khosla Ventures; Samsung Ventures; Andy Bechtolsheim, co-founder of Sun Microsystems; and Pradeep Sindhu, founder and CTO of Juniper Networks.
We spoke with Singh and Marc Fleischmann, founder and CEO of Datera, about the company's Elastic Data Fabric, technology direction, the competitive landscape and how application developers are driving key decisions in the data center.
Can you provide an update on what's happened since you launched your first product?
Marc Fleischmann: We shipped 2 PB of raw capacity in the first quarter, so our first revenue quarter is behind us. Obviously, we are continuing to grow the company. We decided to significantly extend our go-to-market [GTM] capability. Gurpreet is joining us from Pure Storage, where he was the vice president of product management, which really understates his role quite dramatically. He built the whole GTM strategy and a business pipeline for Pure. We are super excited to have him here at Datera.
Gurpreet, what lessons did you learn at Pure Storage that will work to your advantage at Datera?
Gurpreet Singh: From a market standpoint, the data center of tomorrow is headlined by application developers, and not really by IT. Every enterprise wants to consume infrastructure in that easy, automated, true grow-as-you-go model. Those big-iron, hardware-optimized storage arrays don't lend themselves very well to that model. Customers want to be able to benefit from Moore's law by relying on commodity hardware.
How would you compare customer trends you saw at Pure to what you are seeing at Datera?
Singh: The Pure products and customers were very different. The use cases were different. The workloads and the applications were different. For example, the Pure solution was very horizontal. Basically, anybody who acquired the block storage and had an IT organization that could install, rack, stack, cable, configure and manage it, and had a finite need of a specific capacity and performance, would go with that. The applications would be large, transactional OLTP [online transaction processing], big-iron databases, Oracle, SAP, ERP solutions. [Pure] had great success in that segment.
But when the customer wanted to deploy the same product in a large-scale, hyperscale, automated private cloud kind of implementation, that's where a monolithic, scale-up storage array model fails, because customers do not want to prepurchase. They do not very often know what the size of the environment is going to be. They want to be able to consume it not through clicks of buttons, but through code and through APIs. And they are supposed to be able to plug into the new paradigm cloud orchestration models, such as OpenStack, CloudStack, Kubernetes, Mesosphere and Docker Swarm. That almost requires a different kind of product.
Do you think there's still a lot of demand for what the all-flash array vendors like Pure do?
Singh: I'll answer that question with a disclaimer. I am still a Pure stockholder. I still think there is room for it. There are tons and tons of mission-critical, business-critical apps that are single apps that customers will want to deploy on prem. And they will spare no means and no expense to be able to solve their performance problems. The EMC VMAX franchise is pretty big, as well.
So, that market is still there. And while it's flat and will continue as flat, it's a significant, sizable market and still has significant run rate. However, this new market that Datera is going after is also fairly large -- perhaps not as large as the current one, but it's growing at significantly high CAGR [compound annual growth rate]. I think by one estimate, it was 40% year over year CAGR on the private cloud-based deployments, whereas the other market, while it's perhaps 10 times larger, it's largely flat for the next five years.
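As a rough illustration of the arithmetic behind that comparison, the short Python sketch below works out how long a market growing 40% a year would need to reach the size of a flat market 10 times larger. The figures are the approximate ones Singh cites; holding the growth rate constant and treating the larger market as exactly flat are simplifying assumptions, and the function names are hypothetical.

```python
import math


def years_to_catch_up(size_ratio: float, growth_rate: float) -> float:
    """Years for a market compounding at `growth_rate` per year to reach
    a flat market that starts `size_ratio` times larger.

    Solves (1 + growth_rate) ** n == size_ratio for n.
    """
    if size_ratio <= 0 or growth_rate <= 0:
        raise ValueError("size_ratio and growth_rate must be positive")
    return math.log(size_ratio) / math.log(1.0 + growth_rate)


def relative_size_after(years: float, size_ratio: float, growth_rate: float) -> float:
    """Size of the growing market as a fraction of the flat one after `years`."""
    return (1.0 + growth_rate) ** years / size_ratio


# Assumed inputs taken from the interview: 40% CAGR, flat market 10x larger.
print(f"Catch-up time: {years_to_catch_up(10.0, 0.40):.1f} years")
print(f"Relative size after 5 years: {relative_size_after(5, 10.0, 0.40):.0%}")
```

Under those assumptions, the growing segment would be a little over half the size of the flat one after five years and would draw level in just under seven.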
What's the biggest problem with the traditional block storage systems that people have used for years?
Fleischmann: They don't scale. You can buy a traditional block storage array, two heads. You can add some shelves, and pretty soon, you're done. And by the way, if anything fails, you need to plan for that. You need to plan the capacity of that array. The planning is the wrong model. The provisioning is the wrong model, because you have an array with two heads that needs manual configuration. It needs handcrafting of LUNs, which is a cumbersome, time-consuming process that does not keep up with the speed of cloud, especially if you look at containers going forward. If any one of those heads fails, you have to replace the head, and you have to react immediately. It's not a fault-tolerant environment that just accommodates failures by reconfiguring itself around it. If the hardware becomes obsolete in three or four years, you have to do data migration. You don't have the smooth process that a true scale-out solution offers by basically just adding nodes on one side and gradually decommissioning nodes on the other side, and rebalancing data across even generations of hardware of those nodes continuously. So, it's a completely different operational model.
Why did you decide to tackle Amazon Elastic Block Store-like storage?
Fleischmann: Bringing the kind of cloud quality, automation and economics to the broad enterprises and service providers, SaaS [software-as-a-service] providers, financial service providers is a problem that has not been solved yet -- at least well. So, if you blend block storage with 21st century cloud automation, you have a highly differentiated product.
We call it profile-driven or policy-driven. It automates block storage at scale, seamlessly tying into modern orchestration frameworks, such as OpenStack, CloudStack, Docker Swarm, Kubernetes and soon also Mesosphere.
At a certain level, it is just standard block storage, as well. So, it works with existing environments. We are VMware-certified. It's iSCSI or iSER [iSCSI Extensions for RDMA]. It's compatible with any block storage requirement out there. You don't have to use all the automation capabilities, but the full power of the product comes into play once you start doing that.
The public cloud model -- the hyperscale, the automation model, the economic model -- that Amazon, Google, Azure, [what] these guys have developed is challenging the rest of the IT industry to match. Or, workloads like that are moving to the cloud.
What does your software do?
Fleischmann: It is a scale-out storage system. It has the data plane to store massive amounts of data at high performance and great economics. But it also has the management and control plane to provision, manage, rebalance, optimize, tune [and] self-tune the storage over time. It's ultimately self-optimizing based on application needs and continuously adjusts to this changing environment.
It provides very high performance, infinite snapshots and clones. It is thinly provisioned storage. It provides replication, data reliability [and] quality of service. It's a full storage solution.
Is your product intended for use only on premises?
Fleischmann: For the time being, it's only on premises. We are obviously working on a multicloud or a cloud converged strategy. The world is going to be more than on prem. It's going to converge. It's going to be a multicloud world. And we recognize that.
For what workloads is your product designed?
Fleischmann: It's a broad gamut of workloads. We encapsulate the workloads in profiles that we can then automatically instantiate at scale. And the gamut of workloads right now [includes] OLTP from traditional databases like MySQL -- not Oracle, yet -- all the way to big data analytics and real-time analytics workloads, like Hadoop or more modern Kafka and Spark. And then, obviously, you have all flavors in between -- NoSQL databases, like MongoDB, CouchDB, Cloudera [and] Hortonworks.
Who are your competitors?
Fleischmann: In the cloud space, it's really only two peer groups that we are seeing. One is [NetApp's] SolidFire. That's an all-flash array. For the time being, we are a multi-tiered array. SolidFire is very expensive. Our price point is less than 10% of SolidFire on a dollar-per-GB raw comparison basis.
On the other end of the spectrum, we typically see Ceph. In theory, they are free, if your time is free. In practice, however, it takes a considerable amount of time to keep it operating, to scale it, to tune it [and] to optimize it to all these continuously changing environments.
What's your take on hybrid versus all-flash for primary storage?
Fleischmann: You've got to have both ... We use the NVMe flash tier to provide the performance layer, and we use hard drive tiers to provide capacity. People put petabytes of data into cloud. Only a small fraction of that data is actually active and being used. So, it's actually a very economical and performance-efficient usage model to put a lot of the data that's really cold and not used a lot on hard drives and just let it sleep there while you serve the hot workloads from NVMe flash.
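The hot/cold split Fleischmann describes can be pictured with a minimal Python sketch. It is not Datera's algorithm, which the interview does not describe; the `Extent` type, the `reads_last_day` counter and the threshold value are all hypothetical, chosen only to show the idea of serving active data from flash while parking idle data on hard drives.

```python
from dataclasses import dataclass
from typing import Dict, Iterable


@dataclass
class Extent:
    """A hypothetical chunk of stored data with a simple activity counter."""
    extent_id: str
    reads_last_day: int


def place_extents(extents: Iterable[Extent], hot_threshold: int = 100) -> Dict[str, str]:
    """Map each extent to a tier: 'nvme' if it is active, 'hdd' if it is cold.

    The threshold is an illustrative assumption; a real system would likely
    weigh recency, write activity and tier capacity as well.
    """
    return {
        e.extent_id: "nvme" if e.reads_last_day >= hot_threshold else "hdd"
        for e in extents
    }


placement = place_extents([
    Extent("orders-index", reads_last_day=5400),  # hot: served from flash
    Extent("logs-2015", reads_last_day=0),        # cold: left on hard drives
])
print(placement)
```

A single threshold keeps the sketch readable; the point is only that placement follows observed activity rather than a fixed, all-flash layout.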