Amazon Web Services Simple Storage Service (Amazon S3) public cloud data storage customers say the low-cost Reduced Redundancy Storage (RRS) option launched last week is a step in the right direction, although it's not yet clear if RRS will draw a wider audience of reluctant enterprise storage customers to public cloud storage.
RRS offers Amazon customers the ability to choose fewer "hops" of object replication among Amazon's facilities for a lower cost per gigabyte. With RRS, objects would survive one complete data center failure, but wouldn't be replicated enough times to survive two concurrent data center failures. RRS pricing starts at 10 cents per GB per month for the first 50 TB of storage, and can drop to below 4 cents per GB ($0.037, to be exact) for more than 5 PB. Regular Amazon S3 pricing starts at 15 cents per GB per month for up to 50 TB.
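To make the first-tier price gap concrete, here is a rough Python sketch that compares monthly storage cost at the two starting rates quoted above. The 10 TB volume and the 1,024 GB-per-TB conversion are assumptions for illustration only, and the sketch ignores request and data transfer charges as well as the lower-priced tiers above 50 TB.

```python
# First-tier monthly rates quoted in the article, in US dollars per GB.
STANDARD_RATE = 0.15
RRS_RATE = 0.10

# Assumption: 1 TB = 1,024 GB. Actual billing may count capacity differently.
GB_PER_TB = 1024


def monthly_cost(terabytes: float, rate_per_gb: float) -> float:
    """Monthly storage cost for a volume that fits in the first 50 TB tier."""
    if terabytes < 0 or terabytes > 50:
        raise ValueError("this sketch only covers 0 to 50 TB")
    return terabytes * GB_PER_TB * rate_per_gb


def percent_savings(standard_rate: float, reduced_rate: float) -> float:
    """Savings of the reduced rate relative to the standard rate, in percent."""
    return (standard_rate - reduced_rate) / standard_rate * 100


if __name__ == "__main__":
    volume_tb = 10  # hypothetical volume, not a figure from the article
    standard = monthly_cost(volume_tb, STANDARD_RATE)
    reduced = monthly_cost(volume_tb, RRS_RATE)
    print(f"Standard S3: ${standard:,.2f} per month")
    print(f"RRS:         ${reduced:,.2f} per month")
    print(f"Savings:     {percent_savings(STANDARD_RATE, RRS_RATE):.1f}%")
```

At these rates the discount works out to roughly one-third, which is consistent with the savings figures customers cite in this article.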
San Diego-based digital marketing design agency Digitaria stores hundreds of terabytes of internal as well as client data with Amazon S3. It also uses Amazon's CloudFront content delivery network (CDN) to distribute multimedia files. The Amazon CloudFront process generates what Digitaria chief technology officer (CTO) Chuck Phillips calls "synthetic data," such as thumbnails, to represent video or photo links on a web page.
"That data isn't sensitive or important because we can regenerate it — that's a perfect candidate for RRS," Phillips said.
Phillips said Digitaria is also considering offering a recycle bin or trash can for clients so they can recover deleted content if necessary. Digitaria's digital distribution product accounts for approximately 20% of the company's overall storage, and synthetic data makes up 45% of that product's storage. So for a little less than 10% of the company's overall data, Phillips estimates, RRS will represent a 33% savings.
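The arithmetic behind that estimate can be sketched in a few lines of Python. The function names are illustrative, and the blended figure assumes that storage cost scales linearly with capacity and that all of the synthetic data moves to RRS. Neither assumption comes from Phillips.

```python
def share_of_total(product_share: float, synthetic_share: float) -> float:
    """Fraction of all stored data that is synthetic and so eligible for RRS."""
    return product_share * synthetic_share


def overall_savings(eligible_share: float, rrs_discount: float) -> float:
    """Blended saving on the total storage bill, assuming cost tracks capacity."""
    return eligible_share * rrs_discount


if __name__ == "__main__":
    # 20% of storage is the distribution product; 45% of that is synthetic data.
    eligible = share_of_total(0.20, 0.45)
    # Discount of about one-third, from 15 cents to 10 cents per GB.
    blended = overall_savings(eligible, 1 / 3)
    print(f"Eligible for RRS: {eligible:.0%} of overall data")
    print(f"Blended saving:   {blended:.0%} of the storage bill")
```

Under those assumptions, about 9% of Digitaria's data would qualify, and the saving across the whole storage bill would be around 3%.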
"Every little bit helps," Phillips said. "Some might not consider it significant, but we're very cost conscious, which is why we've invested in S3. And it takes virtually no effort to implement."
Online marketing technology development firm Razorfish also stores hundreds of terabytes (though less than 1 PB) with Amazon S3, and also offers S3 capacity to clients as part of its services. Razorfish global CTO Raymond Velez said he's looking to use RRS for test/development data.
"Collective targeting and personalization requires massive data engines on top of Amazon's Elastic Map Reduce [EMR], and we tend to need an interim store, which doesn't need nine nines of availability," he said.
Boston-based VAR NSK Inc. has been using Amazon S3 through TwinStrata's on-premises gateway, which will need to be updated to take advantage of RRS on the back end.
"[Amazon is] certainly heading in the right direction," NSK senior helpdesk associate and virtual infrastructure manager Alex Straffin wrote in an email to SearchStorage.com. "The majority of our clients don't need such high levels of redundancy with their data. Our clients tend to care much more about their bottom line. A 30% reduction in monthly costs is very attractive."
Amazon S3 concerns and wish list items
While interest in RRS is high, Straffin is watching to see how the execution goes. "My concerns about the service lie primarily with Amazon's ability to seamlessly migrate data from one service level to another," he wrote. "I am under the assumption that they will have measures in place for upgrading service if subscriber needs change, but presently the specifics have not been made available."
Digitaria's Phillips added that there are some items on his wish list he'd like to see added or changed about Amazon S3 in future releases: a file size limit higher than the current 5 GB per object, better "bucket" management, mountable virtual disks for Windows and Linux, speedier cache updates in CloudFront, and faster availability of data during distributed writes.
Phillips said he's aware of companies such as Rackspace (with its Jungle Disk) offering some of those features, including mountable virtual disks with higher file size limits and a storage management GUI for the otherwise API-driven Amazon S3 service, but he'd like to see Amazon offer those features natively.
"Commercial solutions do exist, but we're not sure they're production ready — we want an Amazon-production-ready, mountable S3 bucket," he said.
At the least, he said, he'd like expanded management features for buckets, such as the ability to rename buckets or easily duplicate them. Phillips said he'd also be willing to sacrifice some consistency among distributed writes across the Amazon network for faster availability of the data. "Right now a write is not considered durable until it's replicated across all data centers," he said.
Razorfish's Velez agreed that Amazon S3 management could use some polishing. "It's still fairly esoteric and complex to manage both EMR and S3 together, especially in creating estimates for our clients — it still involves fairly complex cost models," he said.
Concerns about moving data to the public cloud remain
While RRS is obviously a hit with Amazon's current customers, market research reports released earlier this year revealed little interest in deploying cloud data storage at most traditional enterprises. Many enterprise data storage administrators and experts cite security, performance, reliability and availability concerns about moving data to the public cloud.
Digitaria's Phillips said he still sees fear of the unknown among his customers when it comes to cloud storage, but it isn't nearly as strong as it used to be.
"We have the security conversation with customers on a regular basis, but not as often as we did in the early days," he said, recalling one customer meeting that included more than 30 security professionals from his company and Amazon to assure the customer that data would be safe in the cloud.
"Having their data on someone else's server can still be a tough one to get over," Phillips said, but added that more customers are approaching Digitaria with questions about the cloud rather than the other way around.