Database data reduction specialist RainStor Inc. set its sights on large archives with the latest version of its inline data compression software, adding Archive Application for Hadoop 2.0 for enterprises looking to shift storage from enterprise data warehouses to Apache Hadoop 2 clusters.
Archive Application is part of RainStor 6, which heavily compresses multi-structured data files and stores their deduplicated values in main memory. Customers can run reports directly against the compressed data without rehydrating it first, which improves report performance.
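To make the idea of reporting against deduplicated data concrete, here is a minimal Python sketch of value-level deduplication using plain dictionary encoding. It is a generic illustration only, not RainStor's actual storage format or API; the function names `dictionary_encode` and `count_equal` and the sample region values are hypothetical.

```python
def dictionary_encode(values):
    """Store each distinct value once and replace every row with an integer code."""
    dictionary = []   # distinct values, in first-seen order
    code_of = {}      # value -> its position in the dictionary
    codes = []        # one small integer per original row
    for value in values:
        if value not in code_of:
            code_of[value] = len(dictionary)
            dictionary.append(value)
        codes.append(code_of[value])
    return dictionary, codes


def count_equal(dictionary, codes, target):
    """Count rows equal to target by comparing codes, without rebuilding the rows."""
    if target not in dictionary:
        return 0
    target_code = dictionary.index(target)
    return sum(1 for code in codes if code == target_code)


# Hypothetical sample column with repeated values.
regions = ["EMEA", "APAC", "EMEA", "AMER", "EMEA", "APAC"]
dictionary, codes = dictionary_encode(regions)

print(dictionary)                               # ['EMEA', 'APAC', 'AMER']
print(codes)                                    # [0, 1, 0, 2, 0, 1]
print(count_equal(dictionary, codes, "EMEA"))   # 3
```

The count is answered from the integer codes alone, so the original strings never have to be reconstructed; that is the general sense in which a report can run without rehydration.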
"That translates to a huge fixed savings on per-terabyte [TB] costs," said Mark Cusack, RainStor's chief architect.
RainStor's system can be configured for NAS systems or SANs and includes a mechanism for migrating data off tape. Archive Application targets companies that need petabytes (PB) of storage for compliance or historical analysis, particularly in regulated industries.
The new release builds on RainStor's collaboration with EMC Corp. that started in 2013. RainStor sells its software bundled with EMC's Isilon Scale-Out Storage Solution for Hadoop, a NAS big data platform for systems running the Hadoop Distributed File System.
Dell is both a RainStor partner and a customer. It was an early adopter of Archive Application, implementing it in 2013 with a Cloudera Hadoop framework. Attila Finta, a chief data warehouse architect with Dell Enterprise Business Intelligence, said the company has offloaded 60 TB of financial and manufacturing data from an enterprise data warehouse, using RainStor deployed on a five-node Hadoop cluster built on commodity Dell R720 servers. That enabled Dell to compress the datasets by a factor of up to 40, leading to an 80% reduction in per-TB storage costs, Finta said.
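As a rough back-of-the-envelope check on the Dell figures, the following Python sketch computes the smallest footprint the 60 TB would occupy if the entire dataset reached the best-case 40-to-1 ratio. That uniform ratio is an assumption for illustration, since the reported figure is an upper bound, and the helper name `compressed_size_tb` is hypothetical. The 80% cost reduction is Finta's reported number and is not derived here.

```python
def compressed_size_tb(raw_tb, ratio):
    """Return the stored size in TB for a given raw size and compression ratio."""
    if ratio <= 0:
        raise ValueError("compression ratio must be positive")
    return raw_tb / ratio


raw_tb = 60       # data offloaded from the warehouse, per the article
best_ratio = 40   # best-case ratio; real datasets may compress less

print(compressed_size_tb(raw_tb, best_ratio))  # 1.5
```

At lower ratios the footprint grows proportionally; at 20-to-1, for example, the same data would take 3 TB.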
Archive Application for Hadoop provides storage users with new big data capabilities, said John Myers, a research director of business intelligence at Enterprise Management Associates Inc. in Boulder, Colorado. Myers co-wrote a 2013 report that found 16% of enterprises are using Hadoop as part of big data projects.
"What RainStor is doing is taking advantage of the economies of scale in the Hadoop infrastructure to make the most of what companies are moving toward: low-cost alternatives to storage layers for data warehousing and big data repositories," Myers said.
"You could replicate their main storage solution to a certain extent using other solutions, but RainStor has packaged security, SQL and performance [monitoring] better than other vendors have to date," he explained.