SearchStorage.com: How did you end up in storage?
Golub: At Plaxo I became exposed to some of the problems that a lot of folks were facing with unstructured data, in the sense that you have millions of users sharing data like videos, photos, as well as smaller data like text messages. Plaxo was then acquired by Comcast, and as we were helping Comcast start to build out their solutions for IPTV, we began serving high-definition videos to millions of customers, and began to see that [storing these files] is a big and growing problem. You look across not just Web 2.0 and consumer media, but healthcare, oil and gas, scientific data sets, you name it, this has become a huge problem.
SearchStorage.com: Why Gluster?
Golub: Gluster has come up with a fundamental way to solve this problem, and if the promise is correct it could be something really significant. My sense is not only [is Gluster] trying to solve a really big problem, but trying to solve it in an innovative way. We've seen the move within the data center over the past 10 years from islands of computing, big proprietary systems that don't scale particularly well, to being virtualized, to having great utilization and having open source solutions running on commodity hardware. But while the data center has moved forward and unstructured data has exploded over the past 10 years, if you look in the storage space, there are still largely islands of storage, still largely proprietary systems, and still huge scaling issues.
SearchStorage.com: Before Oracle bought Sun, Sun grappled with the question of how to make money from open source storage. Now Oracle is looking at how to monetize everything. How do you plan to build a successful business around open source software?
Golub: I think it is a very difficult question to answer if you are in the same position as Sun, where you have a multibillion-dollar business based on proprietary hardware and proprietary software, and by embracing open source you are cannibalizing a large existing base of your own business. If you're starting out open source, like a Red Hat or a MySQL or Lustre, I think the story is different, because then you can build a business model that is purpose-built for open source. In our case, we're fortunate in that we not only have [that] business model, but we also created the open source code and hold the copyright to it. While we fully embrace open source and have the full copyleft out there, we also have the opportunity to go after things like OEM licensing models.
Golub: In keeping ourselves as a low-cost business we essentially have 1,000 developers who are banging on the product and developing new functionality for it, and we certainly can't replicate in-house the environment of a large media company, a large oil and gas company and large universities. Our members do that for us, and they also help with marketing and evangelizing the product and with supporting each other. Also, it is up to us to offer a paid product that provides significant value above and beyond what people can get with the free version of the product. And that's where you look at the kind of models that a MySQL or Red Hat pioneered, where there's a free and open source version and a paid but still open source version. The source code remains available, but there are value-add services on top of that, both functionality and support.
The view of open source has changed from being something that's risky to being something that's lower risk for an enterprise, for two reasons. One is that there are no secrets: the source code is out there and has been inspected and banged on by lots of people. The second is that as an enterprise you are not locked in to that vendor. Whether people are concerned about our viability as a company or about the functionality, going open source means that if there's functionality they want that we aren't delivering, like an internal billing module, they can develop it themselves. If they want to take it over from Gluster and develop on top of that, the flexibility is there.
The functionality of the file system is modular. It's developed more like applications are developed, using APIs to write modules [to add functionality] in user space. Customers are writing their own modules — by the time they come to us for something, they're often already doing it.
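[Editor's note: The modular, user-space design Golub describes can be pictured as a stack of small modules, each passing file operations down to the layer below and optionally adding behavior on the way. The sketch below is a minimal illustration of that idea in Python, not Gluster's actual API: the class names `Storage`, `Module` and `UsageMeter` are hypothetical, and the "billing" behavior is an assumption loosely based on the billing-module example mentioned above.]

```python
class Storage:
    """Bottom layer: keeps file contents in a dict (a stand-in for real disks)."""

    def __init__(self):
        self.files = {}

    def write(self, path, data):
        self.files[path] = data
        return len(data)

    def read(self, path):
        return self.files[path]


class Module:
    """Hypothetical pass-through module: forwards every call to the layer below."""

    def __init__(self, below):
        self.below = below

    def write(self, path, data):
        return self.below.write(path, data)

    def read(self, path):
        return self.below.read(path)


class UsageMeter(Module):
    """Hypothetical billing-style module: counts bytes written per top-level directory."""

    def __init__(self, below):
        super().__init__(below)
        self.bytes_written = {}

    def write(self, path, data):
        # Treat the first path component as the "owner" (an assumption for this sketch).
        owner = path.strip("/").split("/")[0]
        self.bytes_written[owner] = self.bytes_written.get(owner, 0) + len(data)
        return super().write(path, data)


# Stack the metering module on top of the storage layer and use it like a file system.
stack = UsageMeter(Storage())
stack.write("/alice/a.txt", b"hello")
stack.write("/alice/b.txt", b"world!")
stack.write("/bob/c.txt", b"hi")
```

[Because every layer exposes the same `read` and `write` calls, a customer-written module in this sketch can be added or removed without changing the layers around it.]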
SearchStorage.com: So how many of those people in the thousand-plus community are paying customers?
Golub: We've got over 60 paying customers at this point and about 300 production deployments that we're aware of. About 90% of our paying customers come from the community, and once they put it into production they buy a subscription license. So our customers are doing their own qualification and self-selection and testing.
SearchStorage.com: Who do you see as your main competitors? What makes you win deals and what makes you lose them?
Gluster VP of marketing Jack O'Brien: We see two types of competitors. ParaScale is an example of the type of competitor that's just trying to be a cloud vendor. And then we see [scale-out NAS] companies like Isilon and Ibrix [now part of Hewlett-Packard Co.].
Golub: I think we win deals by having both a better answer in terms of scalability from performance and capacity, and by not being proprietary, which means better cost performance but more importantly more flexibility for the end user. Why do we lose deals? Because we're not as well known, and that's something we're obviously trying to change.
It's becoming increasingly clear that storage is a software problem, and it needs a software solution. I think we're all seeing enterprises saying, 'If we're going to get a software solution, we want one that's open and flexible.'
SearchStorage.com: Do you offer the kind of traditional enterprise features some of those other companies offer, like snapshots and distance replication?
Gluster co-founder and CTO Anand Babu Periasamy: We have synchronous replication already. Today, users use standard tools like rsync to replicate our volumes [over distance], but we are going to be delivering a feature in the fourth quarter called continuous WAN replication, in which our system intelligently replicates only the part of the data that has changed and does it asynchronously in the background.
We are introducing cloning and snapshotting functionality in release 3.1 in July, but this functionality is different from what users already have in terms of volume-level snapshotting and cloning. It's going to be file-level or directory-level snapshotting. Users today store virtual machines [VMs]. If you've got a global namespace and a huge volume, thousands of virtual machines is pretty much a starting point in our space — users create VMs every few seconds. They can't afford to do snapshot and cloning on the huge global namespace — they want to pinpoint a virtual machine and clone it multiple times, a hundred times — that's the kind of functionality we're introducing.
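[Editor's note: Periasamy's point about replicating only the part of the data that has changed can be illustrated with a simple block-comparison sketch. This is an assumption-laden illustration, not Gluster's implementation: it compares fixed-size blocks at the same offsets using SHA-256 digests, which is simpler than rsync's rolling-checksum algorithm, and it leaves out the asynchronous background transfer. The function names and the tiny block size are hypothetical.]

```python
import hashlib

BLOCK_SIZE = 4  # deliberately tiny so the example is easy to follow


def block_digests(data, block_size=BLOCK_SIZE):
    """Return one SHA-256 digest per fixed-size block of data."""
    return [
        hashlib.sha256(data[i:i + block_size]).digest()
        for i in range(0, len(data), block_size)
    ]


def changed_blocks(old, new, block_size=BLOCK_SIZE):
    """List (offset, block) pairs for blocks of new that differ from old."""
    old_digests = block_digests(old, block_size)
    changes = []
    for index, start in enumerate(range(0, len(new), block_size)):
        block = new[start:start + block_size]
        is_new_block = index >= len(old_digests)
        if is_new_block or hashlib.sha256(block).digest() != old_digests[index]:
            changes.append((start, block))
    return changes


def apply_changes(replica, changes, new_length):
    """Patch the remote copy using only the changed blocks."""
    # Truncate or zero-pad the replica to the new length, then overwrite changed blocks.
    buf = bytearray(replica[:new_length].ljust(new_length, b"\0"))
    for offset, block in changes:
        buf[offset:offset + len(block)] = block
    return bytes(buf)
```

[In this sketch only the pairs returned by `changed_blocks` would need to cross the WAN; the unchanged blocks already on the replica are reused by `apply_changes`.]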
For more on the deeper technical details of Gluster's scale-out file system, see our deep dive with Periasamy on the Storage Soup Blog.