Newly released Red Hat Gluster Storage 3.2 software boosts performance, deepens container support, introduces a capacity-saving option for data integrity checks and enhances monitoring capabilities.
Performance-focused improvements, such as client-side metadata caching, could enable certain operations to run as much as eight times faster, according to Ross Turk, director of product marketing at Red Hat. He said users would notice significantly shorter waits when retrieving metadata, traversing directories and performing various other functions.
"If you had only 100 files in a directory, you would not notice one single difference. But as soon as you put in thousands of files -- sometimes up to a million -- that's when you start seeing differences," said Sayan Saha, head of product for Red Hat Gluster Storage, container storage and storage management.
Saha said Red Hat Gluster Storage 3.2 is "the kind of release that customers really crave for, because when they install it they really see the performance go up." Customers simply want "enterprise-grade stability and performance and boring products that just work" from Red Hat. Those in search of exciting new features can look for them in the open source GlusterFS distributed file system on which the Red Hat Gluster Storage product is based, Saha added.
Saha said Red Hat Gluster Storage 3.2's improvements to small-file performance would extend to the Red Hat OpenShift Container Platform, which integrates the Docker container runtime and Kubernetes orchestration engine. OpenShift uses a registry that handles small files.
The latest Red Hat Gluster Storage release also strengthens integration with OpenShift through native support for advanced storage services such as geo-replication and in-flight data encryption for applications deployed in containers. The new features are packaged into a refreshed Docker container image that ships with the latest product release, according to Red Hat.
"We're taking a lot of the features that you see in Gluster stand-alone and making it so they operate properly inside containers as well," Turk said.
Customers have a variety of deployment options with containers. Turk noted they could run Gluster inside containers alongside other applications, or they could have Gluster alongside the containers serving storage for them.
Brinker International Inc. runs Gluster in containers as part of the "roll-your-own" implementation of Kubernetes and Docker it undertook to revamp its Chili's Grill & Bar e-commerce site, according to Nathan Huber, enterprise architect for the Dallas-based restaurant chain. Brinker does not use Red Hat's OpenShift platform, but will consider it in the future because of its automation capabilities, according to Huber.
"They introduced Gluster in OpenShift in a very innovative way in that they're actually using Gluster in the container itself. So the Gluster services run in containers," Huber said. "The advantage that this new platform brings is it allows you to basically carve up persistent volumes to other containers. It automates that process for you."
Alternative to three-way replication
Other capabilities that appeal to Huber include the new arbiter volumes Red Hat introduced to provide a lower-cost and reduced-capacity alternative to three-way replication for ensuring data integrity. Huber said Brinker currently does only one-to-one replication to forgo the extra expense of storing a third copy of data.
But many Gluster users make three copies of each piece of data to ensure data accuracy and consistency. Such three-way replication exacts a price in the form of additional storage hardware, data center space and power.
Red Hat's new arbiter volume would use metadata to resolve any potential inconsistencies between two stored copies of data. Because metadata requires far less storage space than a full copy of the data, users would be able to save on resources and costs.
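The tiebreaker idea can be illustrated with a toy model. The sketch below is not Gluster's actual algorithm or on-disk format; the `Replica` class, the `version` counter and the `pick_good_copy` function are hypothetical names invented for illustration. It assumes each brick records a per-file write counter as metadata, and that the arbiter stores that counter but no file data.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Replica:
    name: str
    version: int            # hypothetical per-file write counter (metadata)
    data: Optional[bytes]   # None for the metadata-only arbiter


def pick_good_copy(replicas: List[Replica]) -> Optional[Replica]:
    """Return a data-bearing replica whose version a majority agrees on.

    Returns None when no version reaches a majority, i.e. the
    inconsistency cannot be resolved by voting alone.
    """
    counts = {}
    for r in replicas:
        counts[r.version] = counts.get(r.version, 0) + 1

    quorum = len(replicas) // 2 + 1
    for version, votes in counts.items():
        if votes >= quorum:
            for r in replicas:
                if r.version == version and r.data is not None:
                    return r
    return None


# Two data bricks disagree; the arbiter's metadata breaks the tie.
bricks = [
    Replica("brick-a", version=5, data=b"new contents"),
    Replica("brick-b", version=4, data=b"old contents"),
    Replica("arbiter", version=5, data=None),
]
winner = pick_good_copy(bricks)
print(winner.name if winner else "unresolved")
```

With only two data copies and no third vote, the same disagreement would leave no majority, which is the situation the arbiter is meant to avoid.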
"We have enough information in the metadata to tell us which copy is correct and which other copy we don't care about. That's the engineering that went into it. It's a tiebreaker kind of a thing," Saha explained.
"Essentially, it's three-way replication – except that one of those replicas doesn't have data; it just has metadata," said Turk. "So it's essentially getting the integrity of three-way replication with the capacity of two-way replication."
But the arbiter-volume approach may not appeal to everyone.
"My initial reaction is a little bit of terror in that having only two copies of data doesn't feel like enough. We have very high failure rates of hardware, and if we only have two copies, we're one failure away from one copy," said Kevin Vigor, a software engineer at Facebook.
Vigor said Facebook currently uses three-way replication with its Gluster implementation. An alternative the company would be more likely to explore to save on storage capacity is erasure coding, he said.
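To put rough numbers on the trade-off, the sketch below compares raw capacity and failure tolerance for the three approaches discussed here. The function names are made up for illustration, the 1% metadata overhead for the arbiter is an assumed figure rather than one Red Hat has published, and the 4+2 erasure-coding layout is just one example configuration.

```python
from fractions import Fraction


def replication(copies):
    """(raw-to-usable ratio, lost data copies survivable) for N-way replication."""
    return Fraction(copies), copies - 1


def arbiter(metadata_fraction=Fraction(1, 100)):
    """Two full copies plus a metadata-only third brick.

    metadata_fraction is an ASSUMED overhead, expressed as a share of
    the data size; the real figure depends on the workload.
    """
    return 2 + metadata_fraction, 1


def erasure_coded(data_fragments, redundancy_fragments):
    """(raw-to-usable ratio, lost fragments survivable) for k+m erasure coding."""
    total = data_fragments + redundancy_fragments
    return Fraction(total, data_fragments), redundancy_fragments


options = {
    "three-way replication": replication(3),
    "two copies + arbiter": arbiter(),
    "erasure coding 4+2": erasure_coded(4, 2),
}
for label, (overhead, failures) in options.items():
    print(f"{label}: {float(overhead):.2f}x raw capacity, "
          f"survives {failures} failure(s) without data loss")
```

Under these assumptions, the arbiter layout stores roughly two-thirds of what three-way replication does but tolerates the loss of only one data copy, while a 4+2 erasure-coded layout tolerates two lost fragments at half the raw capacity of three-way replication.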
"Obviously we're very conservative because data loss is the worst," Vigor said. "Erasure coding is well-tested technology. We know at least theoretically that we can maintain the same level of reliability and redundancy with less storage consumed."
Red Hat Gluster Storage 3.2 enables quicker self-healing of erasure-coded volumes to improve performance during repair operations. Engineers made the initially single-threaded self-heal procedure for erasure-coded volumes multithreaded and more parallelized to facilitate the performance boost, according to Saha.
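The general shift from a single-threaded to a multithreaded repair loop can be sketched in a few lines. This is a generic illustration of the pattern, not Gluster's self-heal code; `heal_file` is a hypothetical placeholder for whatever work is needed to repair one file, and the worker count is arbitrary.

```python
from concurrent.futures import ThreadPoolExecutor


def heal_file(path):
    # Placeholder for the per-file repair work (hypothetical).
    return f"healed {path}"


def heal_serial(paths):
    """Repair files one at a time, as a single-threaded healer would."""
    return [heal_file(p) for p in paths]


def heal_parallel(paths, workers=4):
    """Repair files concurrently; results come back in input order."""
    with ThreadPoolExecutor(max_workers=workers) as pool:
        return list(pool.map(heal_file, paths))


pending = [f"/bricks/vol0/file{i}" for i in range(8)]
print(heal_parallel(pending))
```

How much a thread pool helps in practice depends on whether the per-file work spends its time waiting on disks and the network, which is typically the case for storage repair.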
Saha credited Facebook's work on replicated volumes in the open source Gluster community as a major contributing factor to the self-healing performance improvements Red Hat engineers made to erasure-coded volumes and to the sharded volumes in use with virtual machine images.
Another enhancement in Red Hat Gluster Storage 3.2 is asynchronous Gluster-specific notifications about operational problems, rather than the generic alerts available in the past, Saha said.