CMA Consulting Services embraced all-flash storage early as one of the first customers for EMC's XtremIO, using it to store large, complex Oracle Real Application Clusters, or RAC. Now, CMA is an early customer for EMC's DSSD D5 rack-scale flash system, which it claims takes the speed to another level.
EMC launched its DSSD D5 in February, nearly two years after acquiring startup DSSD. CMA purchased a fully loaded rack, with 144 TB of raw flash capacity that chief technical architect Brian Dougherty plans to place into production for his most demanding RAC. He said he expects a second D5 to arrive in July.
CMA, based in Latham, N.Y., is an application service provider for Medicare payment processing. Its customers can process millions of patient claims in a day, so fast data analytics is crucial to CMA.
CMA first deployed XtremIO in 2013 for two mirrored Oracle RAC clusters. Dougherty said he is now moving one of those clusters to DSSD D5 systems. When the second D5 arrives, he said he intends to put both of them in production for an eight-node Oracle 12c RAC.
"We're using DSSD for brute performance for our highest-speed analytics," he said. "No copy service or other value-added services, just high performance. We just need very high throughput and very low latency, and that's what we're using DSSD for."
EMC DSSD D5 includes server-attached custom flash modules and is sold in either a fully loaded or half-loaded 5U rack. The D5 uses a dual-path, active-active controller architecture, and EMC claims it can deliver more than 10 million IOPS and 100 GB per second of bandwidth from one rack. The system separates control and data paths, dedicating server CPUs to move data directly from the application to its flash modules via NVMe over PCI Express.
The system isn't for every organization, with a list price starting at $1 million. But it fits a company with CMA's profile.
CMA has a large, complex Oracle RAC setup for its hosted data warehouse business and its MicroTerabyte product that packages RAC for customers. Dougherty said CMA can have more than 1,000 users concurrently hitting its databases.
Before going to all-flash, CMA used eight-engine EMC VMAX 40K hybrid arrays with flash and 15,000 RPM hard disk drives for the Oracle RAC. Dougherty said he also tried moving some data to a Hewlett Packard Enterprise (HPE) Vertica database designed to work without high-end storage.
DSSD D5 reduces latency issues
"We weren't servicing that cluster fast enough," Dougherty said. "Our largest tables are between 10 billion and 15 billion rows of claims history, and thousands of users drive analytics against that. We created a third copy of these two Oracle RAC and moved it to a Vertica database, which is a column-compressed oriented database."
But Dougherty said the high demand still left them struggling to keep up. "Ninety-five percent of the queries come back in less than five minutes. But we get thousands of queries a day. We still have 50 to 60 queries that take over an hour to run, and those are the queries we're addressing with DSSD."
He said DSSD D5 provides 900 microseconds of latency, which is faster than XtremIO's latency of slightly less than 1 millisecond.
Dougherty said it takes 20 minutes to run a full scan on CMA's largest database table, which is 11 TB and "gets hit all day long." The first D5 reduced that to 3 minutes, 49 seconds in tests, and Dougherty said he expects to get it below two minutes with the second D5.
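To put those scan times in bandwidth terms, here is a rough back-of-the-envelope sketch. It assumes the scan reads all 11 TB from storage, that a terabyte means 1,000 GB, and that throughput is steady for the whole scan; none of those assumptions come from CMA or EMC, and the function and variable names are illustrative only.

```python
def scan_throughput_gb_per_s(table_tb: float, seconds: float) -> float:
    """Average throughput (decimal GB/s) needed to read table_tb terabytes in the given time."""
    return table_tb * 1000.0 / seconds


TABLE_TB = 11.0              # size of the largest table, as quoted in the article
baseline_s = 20 * 60         # 20-minute full scan on the existing setup
first_d5_s = 3 * 60 + 49     # 3 minutes, 49 seconds with the first D5 in tests
target_s = 2 * 60            # "below two minutes" goal with the second D5

for label, secs in [("baseline", baseline_s),
                    ("first D5", first_d5_s),
                    ("two-minute target", target_s)]:
    rate = scan_throughput_gb_per_s(TABLE_TB, secs)
    print(f"{label}: {secs} s, about {rate:.1f} GB/s")

# Ratio of the baseline scan time to the tested D5 scan time.
speedup = baseline_s / first_d5_s
print(f"speedup with first D5: about {speedup:.1f}x")
```

Under these assumptions, the 20-minute scan works out to roughly 9 GB per second, the tested D5 scan to about 48 GB per second, and a two-minute scan would need more than 90 GB per second. Real scans may read less than the full table size because of compression or partition pruning, so the figures are only an order-of-magnitude guide.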
"That will significantly change the game for us," he said. "We have a lot of summary tables, derivative tables and parallelized views that we create and partition to make up for the bandwidth we don't have. We're looking to radically simplify our physical database design."
Dougherty said he tested his D5 connected to two 2U HPE DL380 Gen9 servers and Mellanox Quadruple Data Rate InfiniBand adapters.
Dougherty said his D5 is simple to manage, mainly because it's a brute-performance box without storage services. There is little configuration work. The DSSD D5 has three interface options for accessing data: a block driver for legacy applications, a Flood Direct Memory API for customized apps and a DSSD plug-in for Hadoop Distributed File System (HDFS).
"We create five block devices for our Oracle database via the Flood command, and we can create them very large," he said. "We create 10 TB block devices, and then we're ready to go. No separate multipathing, either. We use udev [Linux device manager], but multipathing is built into the Flood software. There's not a lot you do on the server, either, other than create block devices and set up udev. Then, you have one block device configuration file where you tell each of the nodes what volume you're dealing with."
Further use cases and possible improvements
Dougherty said he is considering testing the DSSD systems with a Hadoop application, instead of trying EMC Isilon NAS boxes with an HDFS plug-in.
Dougherty said he would like EMC to add nondisruptive upgrades for the DSSD platform, as well as replication between boxes, but neither of those features is critical for CMA now.
"I have two other copies of the database we're running it on, but at some point, a nondisruption upgrade would be good as we move into production," he said. "I'd also like to see more granular monitoring on the D5 for deeper insights into what's going on in the box. But this is a high-performance hard-wired box. There are not a lot of value-added services."