RealtyData Corp. recently completed an IT migration from its in-house data center to the Amazon cloud, although storage presented problems that threatened the project.
Naperville, Illinois-based RealtyData Corp. sells property information to mortgage bankers throughout the country. It provides documents such as land surveys and titles. Its storage consists of more than 800 million small files, which had been on three aging EMC Clariion storage area network (SAN) arrays.
Storage was the most difficult part of the year-long migration to the cloud, according to Craig Loop, RealtyData's director of technology.
The goal for storage was to save money by moving images to the cloud from on-premises SANs while maintaining performance that was close to what RealtyData's customers were used to. But that wasn't possible with Amazon Web Services (AWS) alone.
Moving nearly a billion files to the cloud was tougher than anticipated because of limitations in Amazon Elastic Block Store (EBS) and Simple Storage Service (S3). EBS has a 1 terabyte (TB) per-volume size limit that would not scale to RealtyData's needs, and Amazon S3 object storage could not meet the application's I/O-intensive, block-level storage performance requirements.
"We were able to handle the databases and servers," Loop said. "Where we ran into a problem was the image data. We had upwards of 800 million image files that needed migrating. We imported all of them to Amazon and put them on S3, but our application had trouble with the latency on S3."
Amazon suggested RealtyData work with RightBrain Networks, a cloud consultancy and service provider. RightBrain recommended startup Zadara Storage's Virtual Private Storage Array (VPSA) based on OpenStack technology. Zadara provides customers with private drives inside AWS storage that are not shared. VPSA connects to the cloud through AWS Direct Connect high-speed fiber connections.
RealtyData implemented Zadara Storage approximately six months ago and dismantled its data center in May.
"[It] saved the whole project," Loop said of VPSA. "Zadara filled the void that Amazon has with its storage limits on EBS and the storage latency on S3. This wouldn't have been possible with Amazon's current technology. We weren't able to make anything else work with our application to give us the low latency and speeds we were used to with our EMC SANs in the data center."
Loop said RealtyData has had 100% uptime since the move. Perhaps more surprisingly, he said he hasn't received any complaints about performance from customers.
"I expected some kinds of problems and complaints that it isn't working right or is too slow," Loop said. "The feedback has been that everything has been faster and more responsive. The image retrieval is faster on Zadara."
Loop estimated that moving to the cloud cost about half as much as buying a new on-premises SAN to replace the six-year-old Clariions. RealtyData now pays for what it uses through monthly subscriptions to Amazon and Zadara. Zadara pricing is based on the amount of storage consumed and performance, which depends on the type of drives used.
"It's a lot cheaper to scale," Loop said of using the cloud for storage. "It's different than with EMC when you're paying for storage and support up front. You're always playing a guessing game when trying to buy storage and servers, trying to figure out what you need for the future as well as now. If you guess too much storage, you're wasting money. If you guess not enough, you have to go back and spend more money and time trying to expand the array."
Loop said RealtyData's file library is rapidly growing, and the company never deletes anything. It uses about 40 TB of cloud storage, and Loop said the firm adds a few terabytes of capacity a year.
To make Zadara work, RealtyData had to change the way it organized data. With on-premises storage, it kept all files in one repository. That didn't work after moving to the cloud, so now the firm breaks its data into storage vaults holding around 2 TB to 4 TB each.
"We're only accessing 2 TB or 4 TB at a time instead of putting 40 TB or 50 TB in one pool," Loop said. "That makes it easier to access [the pools] and it keeps the system from slowing down as we grow. We don't have performance hits because of the way it's laid out. In the beginning, having 800 million files caused a huge problem with our system indexing and the metadata trying to handle that."
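The vault layout Loop describes can be illustrated with a minimal Python sketch. It is not RealtyData's or Zadara's actual implementation: the `Vault` and `VaultAllocator` names, the fill-then-open-a-new-vault policy, and the in-memory index are all assumptions made for illustration. The only detail taken from the article is the idea of capping each vault at a few terabytes instead of keeping everything in one pool.

```python
from dataclasses import dataclass

TB = 10**12  # decimal terabyte in bytes


@dataclass
class Vault:
    """One capped storage vault (hypothetical structure, not a Zadara API)."""
    name: str
    capacity_bytes: int
    used_bytes: int = 0
    file_count: int = 0

    def has_room_for(self, size_bytes: int) -> bool:
        return self.used_bytes + size_bytes <= self.capacity_bytes


class VaultAllocator:
    """Fills the newest vault until it reaches its cap, then opens another."""

    def __init__(self, capacity_bytes: int = 4 * TB):
        self.capacity_bytes = capacity_bytes
        self.vaults: list[Vault] = []
        # Maps file ID to vault name. A real system with hundreds of millions
        # of files would keep this mapping in a database, not in memory.
        self.index: dict[str, str] = {}

    def _open_vault(self) -> Vault:
        vault = Vault(
            name=f"vault-{len(self.vaults) + 1:04d}",
            capacity_bytes=self.capacity_bytes,
        )
        self.vaults.append(vault)
        return vault

    def place(self, file_id: str, size_bytes: int) -> str:
        """Record a file in the current vault and return that vault's name."""
        if size_bytes > self.capacity_bytes:
            raise ValueError("file is larger than a single vault")
        current = self.vaults[-1] if self.vaults else self._open_vault()
        if not current.has_room_for(size_bytes):
            current = self._open_vault()
        current.used_bytes += size_bytes
        current.file_count += 1
        self.index[file_id] = current.name
        return current.name

    def locate(self, file_id: str) -> str:
        """Return the name of the vault that holds the given file."""
        return self.index[file_id]


if __name__ == "__main__":
    allocator = VaultAllocator(capacity_bytes=4 * TB)
    allocator.place("survey-000001.tif", 50_000)  # hypothetical file ID
    print(allocator.locate("survey-000001.tif"))
```

Because each lookup resolves to one vault, a request only touches the metadata of a 2 TB to 4 TB unit, not an index spanning the whole 40 TB library, and adding capacity means opening another vault without expanding an existing pool.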