Object storage is growing in popularity in the data storage market because of its ability to scale and to store rich metadata. Environments with large volumes of data or with big data analytics projects are strong candidates for object storage. In this Tech Talk video, Marc Staimer, founder of Dragon Slayer Consulting, outlined some potential object storage use cases and when a company should consider purchasing the technology.
"Just about any organization that has a lot of data should consider object storage," Staimer said. Object storage can scale to very large capacities within a single namespace, theoretically into the zettabytes, according to Staimer, though he admitted that "no one's there yet." The technology also uses erasure coding, a process that improves data durability and requires much less capacity than the methods used in other storage systems, which helps to decrease costs.
One of the main object storage use cases is active archiving, in which rarely accessed data is stored on low-cost storage while remaining accessible by applications and users. This process is used to store data that is not often needed, but is important enough that the ability to access it if necessary is essential.
Business analytics projects, which involve an in-depth exploration of a company's data, were also cited as potential object storage use cases. The ability of object storage to store large amounts of data within a single namespace, and to integrate with programs such as Hadoop, can help improve analysis of big data.
Despite these potential benefits, it's important to note that transferring data to object storage is difficult. Staimer said one of the main problems with implementing object storage is, "How do you get [the data] from point A to point B?" The most efficient way, he said, is to use software that can move data from either NAS or SAN systems into an object storage system, rather than using an object storage gateway. This "type of software is a necessity," Staimer said, as it allows applications to continue accessing data even after it has been moved to the object storage system.
This video is Part 1 of a two-part video series on object storage. Part 2 explores the issues object storage systems have when dealing with interactive applications.
Transcript - Object storage use cases cover analytics, secondary storage
Hi, my name is Erin Sullivan. I'm the assistant site editor for TechTarget's Storage Media Group. Joining me to discuss object storage is Marc Staimer, founder of Dragon Slayer Consulting. Thank you for joining us today, Marc.
Marc Staimer: My pleasure to be here, Erin.
Object storage has seen a bit of growth in popularity recently. What are some of the big advantages to using an object storage system?
Staimer: The biggest advantage of object storage is its scalability, durability and its low cost per gigabyte. Scalability [is] into the zettabytes, although no one's there yet. Most people are into exabytes or petabytes, but it can definitely scale into very high numbers in a single namespace.
The cost is another big factor because of the way it keeps the data resilient and durable. It requires far less capacity to do so than a typical primary storage system that uses RAID or multi-copy mirroring. It uses erasure coding. Generally speaking, you have scalability, you have durability and you have low cost.
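Editor's note: The capacity argument above can be made concrete with a little arithmetic. The sketch below is illustrative only. Staimer does not name a specific scheme, so the 10+4 erasure code and the three-copy mirror are assumptions chosen for the example, and the function names are hypothetical.

```python
def raw_capacity_erasure(usable_tb: float, data_fragments: int, parity_fragments: int) -> float:
    """Raw capacity needed to hold usable_tb under a k+m erasure code.

    Each object is split into k data fragments plus m parity fragments,
    so the raw footprint is usable * (k + m) / k.
    """
    if data_fragments <= 0 or parity_fragments < 0:
        raise ValueError("need at least one data fragment and non-negative parity")
    return usable_tb * (data_fragments + parity_fragments) / data_fragments


def raw_capacity_mirroring(usable_tb: float, copies: int) -> float:
    """Raw capacity needed when every object is stored as full copies."""
    if copies < 1:
        raise ValueError("need at least one copy")
    return usable_tb * copies


# Assumed example: 100 TB of usable data.
usable = 100.0
erasure_raw = raw_capacity_erasure(usable, data_fragments=10, parity_fragments=4)
mirror_raw = raw_capacity_mirroring(usable, copies=3)

print(f"10+4 erasure coding: {erasure_raw} TB raw")
print(f"3-copy mirroring:    {mirror_raw} TB raw")
```

Under these assumptions, the erasure-coded layout needs less than half the raw capacity of the three-copy layout for the same usable data, which is the kind of saving Staimer is pointing to. Actual ratios depend on the scheme a given product uses.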
What are some of the common object storage use cases?
Staimer: One of the most common is active archiving. You archive it, but you can analyze it. You can search on it. You can discover on it. The second would be business analytics, like a NoSQL database; Hadoop running with Spark, perhaps; or Hive. You have the ability to analyze unstructured data because of object storage's low cost.
In fact, many object stores have an HDFS interface -- the Hadoop Distributed File System -- so that the data stored on the object storage -- which may have gotten there from NFS, may have gotten there from REST, may have gotten there from SMB -- can be displayed as HDFS data. So the HDFS nodes can connect to it and analyze the data.
Same thing with NoSQL. It can provide that data to a NoSQL database, like Cassandra or MongoDB, etc., and provide the kind of capacity of large amounts of data to analytics. Even a data warehouse can use object storage. And finally, backup. A lot of backup systems today, a lot of backup software, data protection software, can store the data on an object store. And that's three specifically. Or in the cloud through [Amazon] S3 compatibles.
When should an organization consider implementing an object storage system?
Staimer: Just about any organization that has a lot of data should consider object storage. If you look at the value of data in most storage systems, less than 10% is what's considered hot data or data that has high value because it's actively being used all the time.
When data is created, the first 72 hours it's going to be the hottest it's ever going to be. After that, it starts to tail off. After 30 days, it tends to get cool. After 90 days, it tends to be cold. So, you have to look at it and say, "Why am I keeping that data in very costly primary storage? Why don't I move it to secondary storage?" This would be something along the lines of an object store. That's why it makes sense for just about anybody who has a lot of data.
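Editor's note: The aging pattern Staimer describes could be expressed as a simple classification rule. This is a minimal sketch, not a tiering product's actual policy. The 72-hour, 30-day and 90-day thresholds come from the interview, while the "warm" label for the tail-off period, the function names and the rule that cool or cold data is a candidate for secondary storage are assumptions made for illustration.

```python
from datetime import timedelta

# Thresholds taken from the interview.
HOT_WINDOW = timedelta(hours=72)
COOL_AFTER = timedelta(days=30)
COLD_AFTER = timedelta(days=90)


def classify_temperature(age: timedelta) -> str:
    """Label data by age since creation."""
    if age < timedelta(0):
        raise ValueError("age cannot be negative")
    if age <= HOT_WINDOW:
        return "hot"
    if age <= COOL_AFTER:
        return "warm"  # assumed label for the period when activity tails off
    if age <= COLD_AFTER:
        return "cool"
    return "cold"


def is_secondary_storage_candidate(age: timedelta) -> bool:
    """Assumed policy: cool and cold data may move off primary storage."""
    return classify_temperature(age) in ("cool", "cold")


for days in (1, 10, 45, 120):
    age = timedelta(days=days)
    print(days, classify_temperature(age), is_secondary_storage_candidate(age))
```

A real policy would more likely key off last access time than creation time, and the cutoffs would be tuned to the workload.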
What factors might hold an organization back from implementing object storage?
Staimer: The biggest factor is, how do you move the data from where it is to where you want it? Where you have it today is on a SAN or a NAS system. A lot of unstructured data is on NAS systems. In fact, [unstructured data is] the fastest growing. Object storage can take that data and store it just as adequately for a much lower cost [and] more resilience.
But how do you get it from point A to point B? There are multiple methods. Some people say, "If I put a gateway in front of my object storage, or I use a gateway in my object storage, I can take care of that." But gateways don't vacuum the data out. Gateways are just a target. That means it has to be pushed. It means it has to be manually migrated.
There are other types of software that will actually take the data from a NAS system and move it to object storage, and there's also software that will move it from a block storage system, a SAN system to object storage. That type of software is a necessity.
When looking at that kind of software, you want software that will maintain the chain of ownership. So the application in which it was created can connect to it in the object storage so you don't have to say, "I have to find it, bring it back to that storage and then it can connect."
Staimer: You're welcome.
That wraps up today's tech talk on object storage with Marc Staimer, thank you for joining us.