A 10 PB file system that supports a new supercomputer at Oak Ridge National Laboratory has been clocked at 240 GBps. Buddy Bland, project director for the Leadership Computing Facility at Oak Ridge, said the DataDirect Networks systems storing the project's data were chosen over an equally fast competitor not for their performance but for their space efficiency.
The supercomputer, called Jaguar, became the second supercomputer to exceed a petaflop (1 quadrillion calculations per second). IBM's Roadrunner, the only other petaflop supercomputer, remains slightly ahead of Jaguar on the list of fastest computers.
The file system that supports it, called Spider, consists of 48 DataDirect Networks S2A9900 arrays, each configured with 280 1 TB hard disk drives in 24U of rack space using 4U, 60-bay disk drive enclosures. The arrays connect to 192 Dell PowerEdge servers running a 10 PB Lustre file system with a single namespace.
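The figures above can be cross-checked with some quick arithmetic. The sketch below uses only the numbers quoted in this article, with one exception: the 8 data + 2 parity drive layout is an assumption made here for illustration, not something Oak Ridge or DataDirect stated.

```python
# Figures quoted in the article
ARRAYS = 48              # S2A9900 arrays
DRIVES_PER_ARRAY = 280   # 1 TB drives per array
DRIVE_TB = 1             # capacity per drive, in TB
SERVERS = 192            # Dell PowerEdge servers
AGGREGATE_GBPS = 240     # measured file system throughput, GB/s

# Assumption (not from the article): two-disk parity laid out as
# 8 data drives + 2 parity drives per group.
DATA_DRIVES = 8
PARITY_DRIVES = 2

total_drives = ARRAYS * DRIVES_PER_ARRAY
raw_pb = total_drives * DRIVE_TB / 1000  # decimal petabytes
usable_pb = raw_pb * DATA_DRIVES / (DATA_DRIVES + PARITY_DRIVES)

per_array_gbps = AGGREGATE_GBPS / ARRAYS
per_server_gbps = AGGREGATE_GBPS / SERVERS
servers_per_array = SERVERS / ARRAYS

print(f"drives: {total_drives}, raw: {raw_pb:.2f} PB, usable: {usable_pb:.2f} PB")
print(f"per array: {per_array_gbps:.2f} GB/s, per server: {per_server_gbps:.2f} GB/s")
print(f"servers per array: {servers_per_array:.0f}")
```

That works out to 13,440 drives and roughly 13.4 PB of raw capacity, with each array sustaining about 5 GB/s and each of its four servers about 1.25 GB/s. Under the assumed 8+2 layout, usable capacity lands near 10.75 PB, which is in line with the 10 PB figure.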
"We didn't set up the whole huge system; we tested a single server node with one array, and both achieved the performance we needed," Bland said. He declined to name the competitor because of the unusual nature of the testing on an early product. "It was really the density that pushed us toward DataDirect."
While the S2A9900s take up 30 cabinets, the competitor's system would have needed twice that space. "They were very, very close, but DataDirect Networks included more disks in the same amount of space on the computer room floor," Bland said.
Another point in DataDirect's favor was the self-healing capability built into its arrays. DataDirect's products put some of the heavy-duty processing of data, such as parity calculations, into silicon by way of field-programmable gate arrays.
This capability, combined with parallelization, means that the arrays can write and read at the same rate. That makes it possible to calculate two-disk parity on every read, as well as every write, without performance degradation. So theoretically, the system never goes into rebuild mode because it operates in that mode all the time.
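To make the idea of checking two-disk parity on every read concrete, here is a minimal software sketch. It is a generic RAID 6-style illustration (XOR parity P plus a Reed-Solomon parity Q over GF(2^8)), not DataDirect's implementation, which the company runs in FPGA hardware. The function names `compute_parity` and `read_stripe` are hypothetical, and the sketch only repairs a single damaged data block.

```python
# GF(2^8) log/antilog tables: polynomial 0x11d, generator 2 (common RAID 6 choice)
EXP = [0] * 510
LOG = [0] * 256
_x = 1
for _i in range(255):
    EXP[_i] = _x
    LOG[_x] = _i
    _x <<= 1
    if _x & 0x100:
        _x ^= 0x11D
for _i in range(255, 510):
    EXP[_i] = EXP[_i - 255]


def gf_mul(a, b):
    """Multiply two bytes in GF(2^8)."""
    if a == 0 or b == 0:
        return 0
    return EXP[LOG[a] + LOG[b]]


def compute_parity(blocks):
    """Return (P, Q) for a stripe of equal-length data blocks (written on every write)."""
    size = len(blocks[0])
    p = bytearray(size)
    q = bytearray(size)
    for idx, block in enumerate(blocks):
        coeff = EXP[idx]  # generator raised to the block's position
        for j, byte in enumerate(block):
            p[j] ^= byte
            q[j] ^= gf_mul(coeff, byte)
    return bytes(p), bytes(q)


def read_stripe(blocks, p, q):
    """Recompute parity on read; repair one silently corrupted data block if found."""
    p_now, q_now = compute_parity(blocks)
    sp = bytes(a ^ b for a, b in zip(p, p_now))  # P syndrome
    sq = bytes(a ^ b for a, b in zip(q, q_now))  # Q syndrome
    if not any(sp) and not any(sq):
        return [bytes(b) for b in blocks]  # stripe is clean

    # First byte position where the P syndrome shows an error
    j = next((k for k in range(len(sp)) if sp[k]), None)
    if j is None or sq[j] == 0:
        raise ValueError("a parity block is damaged; not handled in this sketch")

    # For a single bad data block z, sq = g^z * sp at every byte position
    z = (LOG[sq[j]] - LOG[sp[j]]) % 255
    if z >= len(blocks) or any(gf_mul(EXP[z], a) != b for a, b in zip(sp, sq)):
        raise ValueError("more than one block is damaged")

    repaired = [bytes(b) for b in blocks]
    repaired[z] = bytes(a ^ b for a, b in zip(blocks[z], sp))
    return repaired
```

A short usage example: build a stripe of eight data blocks, flip bits in one of them, and let the read path locate and fix it.

```python
data = [bytes([i * 16 + k for k in range(8)]) for i in range(8)]
p, q = compute_parity(data)

damaged = list(data)
damaged[3] = bytes([data[3][0] ^ 0xFF]) + data[3][1:]  # silent corruption in block 3

assert read_stripe(damaged, p, q) == data
```

Because the same parity math runs on reads and writes, a corrupted block is corrected as part of normal I/O rather than in a separate rebuild pass, which is the behavior the article describes.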
But an environment like this one is looking for all the performance it can get, and while Bland said the feature is appealing, "We'd just like it to be even faster." Bland has been frustrated that disk access times haven't kept up with performance gains in other technologies, such as the InfiniBand network connecting the S2A9900s at Oak Ridge. The lab uses solid-state storage for some applications, but "the cost is just too high" to consider SSDs instead of spinning media, according to Bland.
So now that it's been built, what projects need this kind of computing horsepower?
Bland said one example is a project for the Intergovernmental Panel on Climate Change, which shared a Nobel Peace Prize with Al Gore last year for environmental work. Oak Ridge performs an assessment of the climate in every state in the U.S. to send to the central project, which aggregates results from around the world to track climate trends.
"Today, it's at the state level. With this new, bigger computer we could get down to the county level, along with county-sized blocks around the world, including the oceans and the poles," he said.
Nuclear fusion reactor research in France might also push the limits of this new supercomputer. According to Bland, "You design a fusion reactor looking at it atom by atom."