Faced with cost constraints, Partners HealthCare Systems put together a clustered network-attached storage (NAS) system with open-source storage software, off-the-shelf hardware and InfiniBand switches that scales to hundreds of terabytes for about one-third the price it would have paid for a commercial NAS cluster.
Boston-based Partners is a non-profit health care system and teaching affiliate of Harvard Medical School. Brent Richter, Partners' corporate manager for enterprise research infrastructure and services, said about two years ago he sought a central storage repository for data generated by about 40 research teams across several hospitals in the Partners system. Each team's data would have to be secure and kept separate. Researchers pay for storage through their grant money, so price was an issue.
"Our central storage needs were not being fulfilled," he said. "We needed relatively good-performing, but more importantly low-cost, storage so these researchers could park data, not necessarily during analysis but after analysis. We needed a better solution than USB drives. We needed something that was secure and had redundancy and was online."
Partners came up with a system people today would call private cloud storage, but "nobody used that term then," he said. Richter said he had a simple goal: "We wanted as cheap a place to put data as possible, with as much performance as we could get."
Richter said Partners looked at systems from enterprise data storage vendors, but also explored the idea of building its own from commodity hardware and either vendor or open-source software "to glue it all together."
That led Partners to GlusterFS, the open-source clustered NAS software sold by Gluster. Richter set up Gluster on Sun Microsystems Fire X4500 servers and X64 Gateways, and two years later he has around 300 TB of storage for his researchers.
"We were confident of the hardware at least, so if everything blew up, we'd still have disk in that storage," he said. At the same time, using Gluster meant the hardware didn't have to be Sun, so Partners could switch down the road to cheaper or better performing hardware.
"It all came together, we built it in an inexpensive way, and we charge back for it," Richter said. "We probably now have 300 TB of storage sitting on disks that we charge back on, and it's growing at about 50 percent a year."
Scalability was a big issue for Partners. "We knew if we set something up, it would have to scale almost immediately," Richter said. "We had a low number of users with large storage requirements."
But about a year after putting together the storage system, Richter noticed performance problems. After making tweaks with Gluster and Sun, he decided the problem was with the switching. Partners originally went with Gigabit Ethernet to keep costs down, "but we started to look at InfiniBand for latency."
Partners installed two 24-port 20 Gbps InfiniBand switches from Mellanox last fall. Richter said with InfiniBand on the back end, Partners experienced roughly two orders of magnitude faster read times. "One user had over 1,000 files, but only took up 100 gigs or so," he said. "Doing that with Ethernet would take about 40 minutes just to list that directory. With InfiniBand, we reduced that to about a minute."
He said Partners chose InfiniBand over 10-Gigabit Ethernet because InfiniBand is a lower latency protocol. "InfiniBand was price competitive and has lower latency than 10-Gig Ethernet," he said.
Richter said the final price tag came to about $1 per gigabyte. One storage vendor he strongly considered was Isilon Systems Inc., whose systems would've cost about $3 per gigabyte. "There are tradeoffs," he admits. "Isilon has better management and was more seamless, but we have the engineering to support this."
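The per-gigabyte figures can be turned into a rough total at the roughly 300 TB Partners now runs. The following is a minimal sketch, not Partners' own accounting: the function name is hypothetical, and it assumes decimal terabytes (1 TB = 1,000 GB) and treats the quoted prices as flat rates.

```python
GB_PER_TB = 1000  # assumption: decimal terabytes; the article does not specify


def storage_cost(capacity_tb, price_per_gb):
    """Return the total cost in dollars for capacity_tb at a flat price_per_gb."""
    return capacity_tb * GB_PER_TB * price_per_gb


capacity_tb = 300                            # approximate capacity cited by Richter
built = storage_cost(capacity_tb, 1.0)       # about $1/GB for the Gluster-based build
commercial = storage_cost(capacity_tb, 3.0)  # about $3/GB quoted for Isilon

print(f"built: ${built:,.0f}")
print(f"commercial: ${commercial:,.0f}")
print(f"ratio: {built / commercial:.2f}")
```

Under those assumptions the build comes to about $300,000 against about $900,000, which matches the "about one-third the price" comparison.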
Partners does no traditional data backup for the research data, but provides protection by using RAID 6 across multiple nodes for redundancy. Richter also offers his users data mirroring with Gluster at twice the price they pay without mirroring. Only about 10% of his users opt for mirroring, he said.
Richter said the 50% annual growth rate will not hold; storage will likely grow faster this year. He expects to add at least 60% more capacity over the next year. "Last year we kept up with demand but held down capacity because of the economy," he said.
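To see what those rates imply, the sketch below compounds a starting capacity at a fixed annual rate. It is illustrative only: the function name is hypothetical, the 300 TB starting point is the approximate figure Richter cites, and carrying either rate beyond one year is an assumption the article does not make.

```python
def project_capacity(start_tb, annual_growth, years):
    """Return capacities in TB for year 0 through `years` at a fixed compound rate."""
    sizes = [start_tb]
    for _ in range(years):
        sizes.append(sizes[-1] * (1 + annual_growth))
    return sizes


# Compare the historical 50% rate with the 60% Richter expects for the next year.
for rate in (0.5, 0.6):
    sizes = project_capacity(300, rate, 3)
    print(f"{rate:.0%}: " + ", ".join(f"{tb:.0f} TB" for tb in sizes))
```

At 60%, 300 TB becomes about 480 TB after one year, compared with 450 TB at 50%.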
He said now that Oracle Corp. has closed its acquisition of Sun, he will consider Sun's open-source storage products when Partners refreshes its high-performance computing (HPC) architecture, which runs separately from the research storage system. He said InfiniBand is also under consideration for HPC.