La Jolla Institute for Immunology updated its storage system to increase speed and keep up with rapid capacity growth in early 2020, and the timing was perfect.
The California independent research institute faced a spike in data growth due to new state-of-the-art Titan Krios microscopes installed in 2019. It also needed a speed boost for servers with GPUs and a custom database. Then, just after rolling in a new system with Excelero NVMesh software and Kioxia NVMe solid-state drives (SSDs), the pandemic hit and La Jolla Institute (LJI) plunged into COVID-19 research that increased both its speed and capacity requirements.
LJI had about 850 million files and 5.5 petabytes of raw capacity in early 2020, according to Michael Scarpelli, senior director of information technology at the Institute. Scarpelli said LJI had "enterprise-grade needs" without an enterprise-grade budget, and its requirements taxed its JBOF (just a bunch of flash) no-frills system.
"Performance was the big thing for us," Scarpelli said. "We just painted storage with one brush for so long. We've always been focused on volumes of files. But in the last few years, it's become much more important that we have performance to do scratch space for sequencing analysis and things like that."
LJI started looking for a new storage system in 2019, and Scarpelli ordered a custom system from Advanced HPC in March 2020, just after the first big COVID-19 wave in the U.S. The system combined Excelero NVMesh with 60 TB of Kioxia SSDs.
"Historically, we just used a bunch of dumb disks, nothing fancy. Having purpose-built performance flash storage is the white whale for us," Scarpelli said. "I always loved the sound of it, but it was out of reach. That was because of either vendor lock-in, or the software was super expensive. It was licensed by capacity, and capacity was too much. Excelero got us around that because it's agnostic."
Scarpelli said Excelero NVMesh software and Kioxia SSDs allowed LJI to get the performance it needs without relying on a high-priced enterprise flash array from a major vendor. NVMesh shares NVMe SSDs across devices, providing fast access and low latency to any connected servers. Excelero customers can plug any server into the network.
"I like that Excelero is hardware-agnostic, because so are we," Scarpelli said. "We tend to be mercenary with what we're going to go with. Because we're always looking for how can we stretch our dollars, because we have what I would consider enterprise-grade needs, but we do not have enterprise-grade budgets. So the fact that Excelero can work with whatever storage we put in front of it, rather than it works with just one vendor, is fantastic for us."
LJI's benchmarks show the fast tier delivers around 20 million IOPS to the compute nodes through the BeeGFS POSIX clustered file system, about a tenfold improvement over LJI's old system. Its asynchronous journaled database writes jumped from around 131,000 per second on the old system to 1.1 million per second with NVMesh and LJI's new Arcitecta MediaFlux data management software.
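The ratios behind those benchmark figures can be checked with a few lines of arithmetic. The following is only an illustrative sketch: the constant names and the `speedup` helper are made up for this example, the inputs are the rounded figures quoted above, and the old system's IOPS is inferred from the "about 10 times" claim rather than reported by LJI.

```python
# Rounded figures quoted in the article; names are hypothetical.
OLD_DB_WRITES_PER_SEC = 131_000
NEW_DB_WRITES_PER_SEC = 1_100_000
NEW_FAST_TIER_IOPS = 20_000_000
FAST_TIER_SPEEDUP = 10  # "about a 10-times improvement"


def speedup(new: float, old: float) -> float:
    """Return how many times larger the new figure is than the old one."""
    if old <= 0:
        raise ValueError("old value must be positive")
    return new / old


# Ratio of new to old journaled database writes per second.
db_speedup = speedup(NEW_DB_WRITES_PER_SEC, OLD_DB_WRITES_PER_SEC)

# Assumption: back out the old fast-tier IOPS from the quoted speedup.
implied_old_iops = NEW_FAST_TIER_IOPS / FAST_TIER_SPEEDUP

print(f"Database write speedup: about {db_speedup:.1f}x")
print(f"Implied old fast-tier IOPS: about {implied_old_iops:,.0f}")
```

On these rounded inputs, the database write rate improved by roughly 8.4 times, and the tenfold IOPS claim implies the old system delivered on the order of 2 million IOPS.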
Scarpelli said that speed boost is important with more LJI employees working at home. Remote workers tend to get impatient with slow file system performance, prompting them to seek their own cloud storage that may not be adequately protected.
"People love to hate on file server connection," Scarpelli said. "And so we're always having that battle -- 'Please don't just use Dropbox or iDrive or something like that, please make sure you use our file server because it has all the data protection, as well as all the features for metadata that we're going to need to give you. It has all the intelligence that you're going to want.'"
LJI is using the new storage to run new file system software, part of a proprietary database. "It's the brains behind this huge collection of metadata and the rules that we'll be applying to our data," Scarpelli said. "Our data is growing faster than we expected. And to serve up that data with our new system, we need a very fast connection to the database. We needed to read and write quickly."
Scarpelli said around 350 of LJI's 500 employees are researchers. Many have worked from home, although the COVID-19 researchers continued to work on site in the lab.
COVID-19 research drives much of LJI's rapid data growth. The lab's Coronavirus Taskforce uses two 10-foot Titan Krios electron microscopes to capture 3D atomic-resolution images of biological samples, including analysis of antibody interactions in COVID-19 research. The microscopes can generate terabytes of data per day, and that data is stored forever.
"Historically, we've doubled our data annually," Scarpelli said. "It wasn't a big deal when it was in gigabytes, but now it's a lot more."
Scarpelli said that because of that volume of data growth, it's not feasible for LJI to tap into another IT trend -- public cloud storage.
"The cloud never really made a lot of sense for us, partially because so much of what's been done is experimentation in the lab," he said. "There's the possibility for a job to accidentally burst and be 10 times as much as they expected it to be, and then the cost is really high. Also, the infrastructure we've built on site is solid and we haven't seen a lot of reasons to move away from it. We've been able to keep up."