The newly released Datrium DVX "server-powered storage system" is drawing interest from beta customers looking for high performance for VMware virtual machines and lower cost per gigabyte for flash.
The Datrium DVX system, which became generally available last week, can boost performance through customer-supplied flash cache in servers and scale capacity separately in Datrium-supplied NetShelf back-end storage appliances. Users manage the storage from end to end the same way they manage VMs -- through VMware vSphere.
"A vSphere operator [who] doesn't know anything about SANs should be able to manage the whole thing," said Datrium CEO and founder Brian Biles, who was also a Data Domain founder. "It's a giant data store in VM terminology. You don't see LUNs or volumes broken out."
The Datrium Distributed Execution Shared Logs (DiESL) file system runs on the server hypervisors and the associated DVX NetShelf disk-based appliance, but the host server handles the RAID processing and data services. DiESL provides storage capabilities, such as always-on deduplication and compression, and instant zero-copy clones at the VM level.
Datrium DVX caches data in the server-based solid-state drives (SSDs) and synchronously writes the compressed data to mirrored nonvolatile RAM in the network-attached NetShelf box. The system persistently stores the data on the appliance's nearline SAS hard disk drives.
Compression takes place on the server before any writes are made, and the data remains compressed across all components of the DVX system. Deduplication works differently, with separate dedupe domains on the server and on the NetShelf storage.
On the server, inline deduplication occurs before the data is written to the local cache. Each server's cache constitutes the local dedupe domain. On the NetShelf storage, the system's post-write space reclamation process cleans out duplicate data segments among the writes from all of the host servers, typically within a few hours, according to Biles.
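To illustrate the idea of a per-host dedupe domain, here is a minimal Python sketch of inline deduplication. It is not Datrium's implementation: the class name `HostDedupeCache`, the fixed segment size and the use of SHA-256 fingerprints are all assumptions made for the example, since the article does not describe how DiESL identifies duplicate segments.

```python
import hashlib


class HostDedupeCache:
    """Hypothetical per-host dedupe domain: each unique segment is stored once."""

    def __init__(self, segment_size=4096):
        self.segment_size = segment_size  # assumed fixed-size segments
        self.segments = {}                # fingerprint -> segment bytes
        self.logical_bytes = 0            # bytes written before deduplication

    def write(self, data: bytes):
        """Deduplicate inline, before anything lands in the cache.

        Returns the list of fingerprints needed to read the data back.
        """
        fingerprints = []
        for offset in range(0, len(data), self.segment_size):
            segment = data[offset:offset + self.segment_size]
            fingerprint = hashlib.sha256(segment).hexdigest()
            # Only segments not already in this host's cache are stored.
            if fingerprint not in self.segments:
                self.segments[fingerprint] = segment
            fingerprints.append(fingerprint)
        self.logical_bytes += len(data)
        return fingerprints

    def read(self, fingerprints):
        """Reassemble data from its segment fingerprints."""
        return b"".join(self.segments[fp] for fp in fingerprints)

    def stored_bytes(self):
        """Bytes actually held in the cache after deduplication."""
        return sum(len(segment) for segment in self.segments.values())
```

In this sketch, writing `b"AAAABBBBAAAA"` with a 4-byte segment size stores only two unique segments. A second host would have its own instance and its own dictionary, which is why duplicates across hosts would have to be cleaned up later on the shared back end.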
The per-host raw flash cache capacity can range from 800 GB to 8 TB. If the host-based cache fills up, the DVX system removes the least frequently accessed data from the SSDs. Data reads come from the NetShelf in the event of a cache miss.
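The eviction and cache-miss behavior described above can be sketched in a few lines of Python. This is an illustrative model only: `ReadCache` and its counters are hypothetical names, a plain dictionary stands in for the NetShelf, and capacity is counted in entries rather than bytes. The article says only that the least frequently accessed data is removed, so the tie-breaking and bookkeeping here are assumptions.

```python
class ReadCache:
    """Hypothetical host-side read cache with least-frequently-accessed eviction."""

    def __init__(self, capacity, backing_store):
        self.capacity = capacity            # max entries (stand-in for SSD size)
        self.backing_store = backing_store  # dict standing in for the NetShelf
        self.data = {}                      # cached key -> value
        self.access_counts = {}             # cached key -> number of accesses
        self.hits = 0
        self.misses = 0

    def read(self, key):
        if key in self.data:
            self.hits += 1
        else:
            # Cache miss: the read is served from the back-end store.
            self.misses += 1
            if len(self.data) >= self.capacity:
                # Cache is full: evict the least frequently accessed entry.
                victim = min(self.data, key=lambda k: self.access_counts[k])
                del self.data[victim]
                del self.access_counts[victim]
            self.data[key] = self.backing_store[key]
            self.access_counts[key] = 0
        self.access_counts[key] += 1
        return self.data[key]

    def hit_rate(self):
        total = self.hits + self.misses
        return self.hits / total if total else 0.0
```

With a two-entry cache, reading `a`, `a`, `b` and then `c` evicts `b`, because `b` has been accessed less often than `a`. The larger the cache relative to the working set, the less often this eviction path runs.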
But Biles said the cache could be "massive," especially after inline deduplication and compression. He said a 1 TB SSD cache offers "way, way more than most VM servers are going to have in total use on data," and the Datrium DVX system allows eight SSDs per host. Because the SSDs are used only for caching, rather than persistent storage, Biles added that there's no need to use expensive flash drives.
"In an array, if you need to upgrade the speed, you have to upgrade the most expensive parts," Biles said, referring to the compute and flash. "In this model, those are commodity server components, and our software can just take advantage of whatever the customer provides."
Datrium DVX lists at $125,000 for a 2U NetShelf D12X4 appliance with 48 TB of raw capacity -- 30 TB usable -- and an estimated 60 TB to 180 TB of effective capacity, with deduplication and compression factored in. The price also includes unlimited licenses for DiESL's Hyperdriver software on vSphere hosts. DiESL 1.0 software supports up to 32 host servers, each with two to 10 cores and as many as eight SSDs.
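The capacity figures quoted above imply a data reduction ratio of 2x to 6x, and the list price can be turned into a rough cost per gigabyte. The Python sketch below works through that arithmetic. It assumes decimal units (1 TB = 1,000 GB) and covers only the NetShelf list price; customer-supplied servers, SSDs and VMware licenses are not included, so these are not total-cost figures.

```python
LIST_PRICE_USD = 125_000         # NetShelf D12X4 list price, from the article
USABLE_TB = 30                   # usable capacity, from the article
EFFECTIVE_TB_RANGE = (60, 180)   # estimated effective capacity, from the article
GB_PER_TB = 1000                 # assumption: decimal terabytes


def reduction_ratio(effective_tb, usable_tb=USABLE_TB):
    """Data reduction ratio implied by an effective-capacity estimate."""
    return effective_tb / usable_tb


def price_per_gb(capacity_tb, price_usd=LIST_PRICE_USD):
    """List price divided by capacity, in dollars per gigabyte."""
    return price_usd / (capacity_tb * GB_PER_TB)


if __name__ == "__main__":
    print(f"Usable: ${price_per_gb(USABLE_TB):.2f}/GB")
    for effective_tb in EFFECTIVE_TB_RANGE:
        print(
            f"{effective_tb} TB effective "
            f"({reduction_ratio(effective_tb):.0f}x reduction): "
            f"${price_per_gb(effective_tb):.2f}/GB"
        )
```

Under these assumptions, the list price works out to about $4.17 per usable gigabyte, falling to about $2.08 at 60 TB effective and about $0.69 at 180 TB effective.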
Customers are responsible for buying and supporting their own x86 host servers, SSDs and VMware software. Datrium provides a product-compatibility list.
Early use cases include VDI, data warehousing
Joshua Rabe, a systems architect at North Bend Medical Center, based in Coos Bay, Ore., said the organization's 20 TB SAN had about 2.5 TB of cache for the entire infrastructure. With Datrium, the medical center uses 2 TB worth of Samsung 850 PRO SSDs on each of the eight host servers in its virtual desktop infrastructure (VDI) implementation, which has about 450 desktops.
Rabe said he's able to go online to shop for the best price on flash drives and just slot them into the server once they arrive. He simply logs in to vCenter, selects the Datrium plug-in and tells the system to use the newly added SSDs.
"Since we're able now to expand our cache exponentially, all of our reads are coming out of cache," Rabe said. "Before, we were seeing about a 90% cache hit rate during normal operations, and when you had to do a restore or push out a new application, those numbers would go way down."
The Santa Clara County Office of Education in San Jose, Calif., picked up SSDs at a Fry's Electronics store and used them to test Datrium DVX with its data warehouse. The performance improvement was so noticeable over the existing disk-based SAN that the county expects to purchase the newly released DVX, according to Phil Benfield, director of the education office's information systems center.
"I would guess the price is probably a tenth of what it would be if we bought vendor-populated arrays with SSDs," Benfield said. "That's a big, big deal for us, because our budget is fixed pretty early on. We're public sector. When we need to ramp up, it's a long cycle to get it into the budget. So, the ability to add storage performance and capacity incrementally, and within our budget, is important for us."
Performance varies depending on the server CPU and SSDs that the customer chooses. Datrium claims throughput can hit 30,000 IOPS per host -- or roughly 1 million IOPS for a cluster of 32 hosts -- using two-socket servers with current Intel Haswell processors.
Viktor Tadijanovic, CTO at Abacus Group LLC in New York, which provides IT infrastructure and services for financial companies, said his company's DVX tests with a VDI installation produced exceptionally high throughput on the storage and minimal or no latency. Abacus used a 1 TB flash drive in each of two VMware ESX hosts and a single NetShelf with a capacity of 20 to 25 TB.
"As a service provider, we have to watch the bottom line. If we can get all-flash array performance at significantly less cost, obviously, that's very appealing to us. The entry price point for all-flash arrays is much higher than what Datrium is promising," Tadijanovic said.
He said preliminary estimates suggested the Datrium system could be 50% cheaper for his company than an all-flash array. Abacus expects to purchase DVX to offer VDI to clients, according to Tadijanovic.
"The only downside is that this is strictly VMware virtualization storage. If you're on a different virtualization platform or if you have physical hosts, then it's not the best solution," Tadijanovic said. "But in our case, we're mostly on VMware, so it's a good match."
Novel approach, waiting for production results
Mike Matchett, a senior analyst and consultant at Taneja Group in Hopkinton, Mass., said Datrium could serve as an alternative to hyper-converged systems. He said IT organizations opting for hyper-convergence have to rip and replace everything, but Datrium allows users to keep their existing vSphere clusters, and only replace the storage and disk shelves.
"It looks like it will catch on, because they've understood a specific problem and created a definite opportunity for a lot of people to optimize their vSphere clusters," Matchett said.
But Richard Fichera, vice president and principal analyst at Forrester Research, cautioned that it's hard to predict how well Datrium's novel architecture will perform under a wide range of workloads.
"The net impact of consuming CPU cycles on the server to do that storage processing versus on the storage system is an unknown," Fichera said. "I've never seen it in production."