Red Hat Inc. today released version 1.2 of its Inktank Ceph Enterprise software, featuring erasure coding, cache tiering and updated tools to manage and monitor distributed object storage clusters.
The release marks the first product update since Red Hat acquired Inktank Storage Inc. in May for about $175 million in cash. Targeted at cloud, backups and archives, Inktank Ceph Enterprise (ICE) combines open source Ceph software for object and block storage, Calamari monitoring and management tools, and product support services.
Red Hat's ICE 1.2 software-defined storage brings the commercially supported product in line with the latest Firefly release of open source Ceph storage software, and two key new features -- erasure coding and cache tiering -- are already generating interest.
Inktank customer CERN, the Geneva-based European Organization for Nuclear Research, views erasure coding as an important capability, according to Dan van der Ster, a storage engineer and Ceph service manager in CERN's IT department.
"It's the only way you can build a durable yet affordable cluster at the multi-petabyte scale," van der Ster said via email. "But, the complicated number-crunching involved adds a rather significant performance penalty. So, we're excited to test Ceph's new pool-tiering feature -- now available in the Firefly release -- to see if the combined solution [of erasure coding plus tiering] is workable for our block and physics data user cases."
Red Hat recommended that customers who want both erasure coding and fast performance consider the new cache-tiering feature, which keeps the hottest data on high-performance media and cold data on lower-performance media, according to Ross Turk, the company's director of Ceph marketing and community.
"This allows you to take one pool inside Ceph and turn it into a read cache or a write-back cache for another pool," Turk said. "If you have a backing pool that's erasure-coded that's very dense -- not particularly fast, but very cost-effective -- and you put a cache pool of [solid-state drives] SSDs in front of it, that will make sure your read and write access is quick for the hottest data."
Ceph's default erasure coding library is Jerasure, and administrators can specify the data-chunk and coding-chunk parameters when they create an erasure-coded back end. ICE's erasure coding default setting is 2+1, meaning the system breaks the data into two chunks, computes one additional coding chunk and stores the three chunks across three object storage devices (OSDs).
However, Turk said the company does not expect most people to use the default setting in production. He said a 12+2 erasure-coded pool might be a more common option that provides good data distribution, tolerates the failure of two nodes, and affords low storage overhead.
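For a k+m erasure-coded pool, the raw-to-usable storage ratio is (k+m)/k, since k data chunks plus m coding chunks are stored for every k chunks of data. A quick sketch of that arithmetic (the function name is illustrative, not from the article):

```python
def erasure_overhead(k: int, m: int) -> float:
    """Raw-to-usable storage ratio for a k+m erasure-coded pool:
    k data chunks plus m coding chunks stored per k chunks of data."""
    return (k + m) / k

# ICE's 2+1 default: 3 chunks stored per 2 chunks of data -> 1.5x raw capacity
print(erasure_overhead(2, 1))              # 1.5

# The 12+2 pool Turk describes: survives two node failures at ~1.17x overhead
print(round(erasure_overhead(12, 2), 2))   # 1.17
```

This is why 12+2 is attractive in production: it tolerates the same two failures as 3-way replication while storing far less redundant data.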
Turk noted the potential economic benefits of erasure coding, since ICE pricing is based on capacity. He said customers could store more data in the same raw capacity, although the tradeoff is that the system would need more time and processing power to do the calculations to recover the data than with replicated copies.
"When you're designing a storage cluster, you always have to overprovision. If I want to put 100 GB into a storage cluster, generally with Ceph, I'd need to buy 300 GB worth of hard drives," Turk said. Erasure coding "reduces that ratio," he added.
Ceph's erasure coding happens at the pool level, not at the cluster level, so customers could potentially have erasure-coded pools alongside replicated pools in the same cluster, Turk said.
Ashish Nadkarni, a research director in IDC's storage systems and software practice, said Red Hat ultimately may need to offer erasure coding at a more granular level, but the new ICE 1.2 features throw Red Hat into the race with other object storage vendors.
"It's not a major, major release, but it's on track for getting them more in line with other object players out there," Nadkarni said. "Now that Red Hat has acquired them, they're probably going to accelerate their launch cycle. They're going to make sure the features that are more enterprise-focused and more OpenStack-focused get prioritized over other ones."
ICE 1.2's other main new feature is enhanced Calamari management capabilities. The Web-based software includes a dashboard to give the user a status check on the cluster; per disk/pool performance statistics measuring IOPS over time; a diagnostics workbench; and tools to monitor disk usage and manage and adjust cluster, pool, device and OSD settings.
Inktank's Calamari management tools initially were proprietary, but Red Hat open sourced Calamari weeks after the acquisition, according to Turk.
One of the missing pieces from ICE is support for file-based storage. Turk said the Ceph file system is not production-ready. Other features due in future releases include performance improvements, LDAP and Kerberos integration, and support for iSCSI, VMware and Hyper-V, he said.
Supported host operating systems for ICE 1.2 are Red Hat Enterprise Linux (RHEL) 6.5 and 7, Ubuntu 12.04 and 14.04, and CentOS 6.5. Supported clients for connecting to ICE are RHEL OpenStack Platform 4 and 5, RHEL 7 Kernel Rados Block Device (RBD), Ubuntu OpenStack and Mirantis OpenStack. One of the most common uses for Ceph storage software is as a back end for OpenStack cloud storage.
ICE's capacity-based, tiered pricing model aims for one cent per GB per month at petabyte scale and beyond, but the cost is a bit higher for smaller installations, Turk said.