When Intel originally announced Optane memory, based on its 3D XPoint architecture, the vendor talked about "1,000x the performance of flash," an end to write endurance problems with "1,000x flash life," "10x the density of DRAM" and a new tier of memory fast enough to effectively extend DRAM while making it persistent.
Reality has since set in, and some of these claims have been tempered, at least for the short term. Recent announcements from Intel and its partner Micron have scaled the speed expectation back to just 4x that of flash. While this is still a major gain, it's a long way from the original claims.
This is because the reality of systems design has hit the technology head on. Called Optane by Intel and QuantX by Micron, 3D XPoint is a phase-change memory (PCM) technology in which cells are switched electrically between high-resistance and low-resistance states. Both products use the same core storage die, made at the IM Flash Technologies facility in Utah.
PCM is inherently a byte-addressable storage technology, using a NOR-style addressing scheme for its cells. Byte addressability is the Holy Grail of storage for a simple reason: it eliminates the traditional file I/O operation, which consumes thousands of CPU instructions and moves a minimum of one 4 KB block, and replaces it with a single, direct register-to-memory CPU instruction. If Intel were able to deliver on that approach, "1,000x performance" would be very real.
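The contrast is easy to sketch in code. The snippet below uses a Python memory mapping of an ordinary file as a stand-in for a persistent-memory region (the file name and offsets are hypothetical; a real deployment would map a persistent-memory device). Reading one 8-byte value through file I/O drags in a whole 4 KB block, while the mapped view makes the same bytes directly addressable:

```python
import mmap
import os
import struct
import tempfile

# A small zero-filled file stands in for a persistent-memory region.
# (Illustrative only; real PMEM would be a memory-mapped device or DAX file.)
path = os.path.join(tempfile.mkdtemp(), "pmem.bin")
with open(path, "wb") as f:
    f.write(b"\x00" * 4096)

# Block-I/O view: fetching one 8-byte value still transfers a full 4 KB block.
with open(path, "rb") as f:
    block = f.read(4096)                      # minimum transfer unit for file I/O
    value_via_io = struct.unpack_from("<Q", block, 128)[0]

# Byte-addressable view: the mapping exposes the same bytes directly, so an
# access compiles down to ordinary load/store instructions, no syscall per access.
with open(path, "r+b") as f:
    mm = mmap.mmap(f.fileno(), 4096)
    struct.pack_into("<Q", mm, 128, 42)       # direct store into the mapping
    value_via_load = struct.unpack_from("<Q", mm, 128)[0]
    mm.close()

print(value_via_load)  # 42
```

The mapped path skips the per-access system call and the 4 KB minimum transfer entirely, which is the mechanism behind the original "1,000x" claim.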
Core system changes needed
Systems aren't designed for persistent memory to be used the way dynamic RAM (DRAM) currently is. Major changes are required in x86 hardware and operating systems, from handling machine check errors differently to segregating persistent from nonpersistent memory. Just handling a single-bit error with a soft fail (as we do with SSDs) is tricky in a standard x86 server; among Intel's server CPUs, only Itanium comes close to that one feature.
Other major changes are needed, including compilers that let persistent memory be declared and link editors that can build that memory into an application. The applications themselves will need a rewrite to eliminate file I/O and to use single-instruction and vector operations. These are massive reassessments of how a system is put together.
Such changes are in the pipeline, since the opportunity for boosting server performance is too good to pass up. Still, they won't arrive in 2017, especially given Intel's longer term plans for Optane-based architectures.
The first real Optane memory product is an SSD. It will run at 4x flash speed, though my guess is that boast will erode in the face of some new drives with millions of IOPS. These will arrive by year's end, delivered at a price premium that reduces the excitement factor somewhat.
Micron claims higher performance than Intel, likely because of its different controller designs and driver code. Performance will be a leapfrog game in 2018, but remember that Intel designs the hardware, the software and the compilers many developers use, so it has a strategic advantage.
Intel's target is not only to realize byte addressability, but also to add direct sharing of Optane SSDs, and the follow-on Optane nonvolatile dual in-line memory modules (NVDIMMs), across a server cluster by connecting them over an NVM Express fabric.
This seems like a simple concept. The idea is to use Optane memory to expand the apparent size of the DRAM from, say, 1 TB to 10 TB or 20 TB. Just think what this can do for the number of containers on a server. Caching may be able to mask the slower speed of Optane versus DRAM, while there is a good deal of discussion about compressing the data.
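The caching idea can be modeled with a toy two-tier store: a small, fast DRAM tier in front of a large, slower Optane tier. Everything below is a hypothetical sketch (the class name, slot counts and LRU policy are my assumptions, not any Intel design), but it shows how a hot working set can be served at DRAM speed while the big tier supplies the capacity:

```python
from collections import OrderedDict

class TieredMemory:
    """Toy model: a small DRAM cache fronting a large, slower Optane tier.
    (Hypothetical sketch; names, sizes and policy are illustrative only.)"""

    def __init__(self, dram_slots):
        self.dram = OrderedDict()      # fast tier, kept in LRU order
        self.optane = {}               # large persistent tier holds everything
        self.dram_slots = dram_slots
        self.hits = self.misses = 0

    def write(self, addr, value):
        self.optane[addr] = value      # data always lands in the big tier
        self._cache(addr, value)

    def read(self, addr):
        if addr in self.dram:          # DRAM hit: fast path
            self.dram.move_to_end(addr)
            self.hits += 1
            return self.dram[addr]
        self.misses += 1               # miss: fetch from Optane, then cache
        value = self.optane[addr]
        self._cache(addr, value)
        return value

    def _cache(self, addr, value):
        self.dram[addr] = value
        self.dram.move_to_end(addr)
        if len(self.dram) > self.dram_slots:
            self.dram.popitem(last=False)   # evict least recently used entry

tier = TieredMemory(dram_slots=2)
for a in range(4):
    tier.write(a, a * 10)   # 4 addresses, but only 2 DRAM slots
tier.read(0)                # evicted earlier, so this misses to Optane
tier.read(0)                # now cached: served from DRAM
print(tier.hits, tier.misses)  # 1 1
```

If the hot set fits in DRAM, repeat accesses hit the fast tier and the apparent memory size is the Optane capacity, which is exactly the 1 TB-looks-like-10 TB effect described above.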
Compression is a way to save space, but it also reduces transfer times and the network bandwidth needed. While applying compression to data being written to Optane is compute-heavy, and may be best handled as a background job, decompressing data is fast, especially with a graphics processing unit or field-programmable gate array assist. Overall, compression would offer a substantial system performance boost.
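The asymmetry is easy to demonstrate with the standard zlib library: maximum-effort compression (here, level 9, modeling the compute-heavy write path that could run in the background) is paid once, while decompression is a single cheap pass regardless of how hard the compressor worked. The data below is a made-up repetitive record stream chosen to compress well:

```python
import zlib

# Repetitive record data (hypothetical example) compresses very well.
data = b"sensor_reading=42;" * 4096          # ~72 KB of repetitive records

compressed = zlib.compress(data, level=9)    # expensive: done once on the write path
restored = zlib.decompress(compressed)       # cheap: done on every read

ratio = len(data) / len(compressed)
print(restored == data, ratio > 10)          # True True
```

For this kind of data the ratio runs well past 10x, which is where the space and bandwidth savings come from; real-world ratios depend heavily on data structure, as noted later for HPC workloads.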
Life gets complicated when we try to work persistence into the picture. That implies a file I/O view of Optane memory in the short haul until byte addressability kicks in. The $64 billion question is what we can do with that persistence.
First, reboots can be blindingly fast. I led a design team delivering battle system clusters for U.S. submarines. That restart feature alone would be enough reason for them to use NVDIMM, since recovery from a power glitch within a second would literally be a life-or-death issue.
Databases, including scratchpads like Memcached, could take good advantage of fast persistent memory. With the right NVDIMM architecture, writes could be aggregated, as one vendor of NVDIMMs, Diablo Technologies, is doing, without the risk of data loss on a power fail. This would make the use of internal scratchpad databases a strong aspect of application design.
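A sketch of that write-aggregation idea: small writes accumulate in a buffer, and only full batches go to the backing store. The class and policy below are my own hypothetical illustration, not Diablo Technologies' design; the point is that if the buffer itself lives in persistent memory, a power failure loses nothing even though writes are deferred:

```python
class AggregatingWriteBuffer:
    """Hypothetical sketch of write aggregation on persistent memory: small
    writes coalesce in an NVDIMM-backed buffer, so the backing store sees
    fewer, larger writes and a power failure loses no buffered data."""

    def __init__(self, flush_threshold, backing_store):
        self.buffer = {}                  # modeled as living in persistent memory
        self.flush_threshold = flush_threshold
        self.backing_store = backing_store
        self.flushes = 0

    def put(self, key, value):
        self.buffer[key] = value          # rewrites of the same key coalesce here
        if len(self.buffer) >= self.flush_threshold:
            self.flush()

    def flush(self):
        self.backing_store.update(self.buffer)   # one batched write downstream
        self.buffer.clear()
        self.flushes += 1

store = {}
buf = AggregatingWriteBuffer(flush_threshold=4, backing_store=store)
for i in range(8):
    buf.put(i % 4, i)      # 8 logical writes across 4 keys
print(buf.flushes)         # 2 flushes cover all 8 writes
```

Without persistence, deferring writes this way risks losing the buffered updates on a power fail; with NVDIMM backing, the aggregation comes for free, which is what makes internal scratchpad databases attractive.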
The same is true of in-memory databases such as Oracle's. Optane NVDIMMs would allow much bigger, more effective databases. Imagine, though, if the Optane memory devices were all on a fabric. Then it would be possible to replicate copies for data integrity across multiple appliances, making for a robust yet lightning-fast system.
Compressed storage for big data
One challenge in handling big data is the network bottleneck in both the WAN and the LAN. Hardware-assisted decompression can act as a 5x or 10x force multiplier on network bandwidth, but it implies a very large memory space in each server to hold copies of the compression dictionaries. That's where Optane's huge capacity is an enabling technology.
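The role of those in-memory dictionaries can be shown with zlib's preset-dictionary support: when sender and receiver both hold the same dictionary in memory, even short messages compress well, multiplying effective link bandwidth. The dictionary contents and message below are hypothetical; a real system would train the dictionary on actual traffic:

```python
import zlib

# A shared preset dictionary, held in memory at both ends of the link.
# (Hypothetical contents; real systems would derive one from live traffic.)
zdict = b'{"host": "server", "metric": "latency_ms", "value": '

def send(payload: bytes) -> bytes:
    c = zlib.compressobj(zdict=zdict)        # compressor primed with the dictionary
    return c.compress(payload) + c.flush()

def receive(wire: bytes) -> bytes:
    d = zlib.decompressobj(zdict=zdict)      # receiver must hold the same dictionary
    return d.decompress(wire) + d.flush()

msg = b'{"host": "server", "metric": "latency_ms", "value": 12.5}'
wire = send(msg)
assert receive(wire) == msg                  # round trip is lossless
print(len(msg), len(wire))                   # far fewer bytes cross the network
```

The catch is scale: covering many data types and tenants means many large dictionaries resident in every server, which is exactly the kind of bulk memory footprint a cheap, capacious Optane tier could absorb.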
We all know high-performance computing (HPC) apps, such as oil and gas analysis, are compute hogs, but they are also heavy storage I/O workloads.
I vividly remember talking to a National Labs director who complained that a job might run for an hour, but then need another hour for data download and loading the next data set. Compression might apply to these workloads, too, though data structure may vary the compression ratio all over the map.
Even so, HPC relies very heavily on memory, so an inexpensive memory multiplier could be very useful in many applications.
At first glance, a stateless cloud seems the most unlikely place for Optane memory to be a winner. The low-hanging fruit is likely to be as a memory extender, taking advantage of capacity per DIMM. Cloud service providers also offer instance storage, where Optane could be a game changer with an overall boost to instance performance due to the much faster access.
Looking beyond these use cases, hybrid clouds need a lot of storage for images, especially for virtual desktop infrastructure use cases. Optane matches up to the need for speed that agile containers offer, and we can expect a lot of interest from both cloud service providers and enterprises as a result.
Optane memory fits the hyper-converged infrastructure model very well, even without the byte addressability and direct fabric connection capability. The ability to cluster the persistent space would work well with the initial 4 KB-block-I/O models. My personal view is that NVDIMM and Optane will likely become the primary memory tiers in these systems.
Optane has some serious competition coming. Samsung is actively pursuing an alternative, for example, while 3D flash density will provide a parallelism of access that effectively boosts flash performance into the Optane range. Whatever happens, the consumer is the winner in all of this.