Working with OpenStack storage: Tips on Cinder, Swift and the cloud
Open source, software-based OpenStack Block Storage, also known by the code name Cinder, takes a different approach from the traditional block storage products that enterprise IT shops are intimately familiar with.
Ashish Nadkarni, a research director in the storage systems practice at Framingham, Mass.-based International Data Corp., likened OpenStack Block Storage to storage virtualization from vendors such as Hitachi Data Systems and IBM in its ability to provide an abstraction layer to integrate third-party arrays and pool storage resources, but in an open source format.
In this interview, Nadkarni also explained the provisioning model, hardware options and the role OpenStack Block Storage plays as part of the open source OpenStack cloud computing and management platform.
From a technology perspective, what are the main distinctions between OpenStack Block Storage and traditional block storage?
Ashish Nadkarni: Traditional storage has always been designed with integrated stack or unitary delivery mechanisms in mind. When you talk about traditional storage systems from NetApp, EMC or any of these players, everything you need to deliver persistent storage to the compute layer is within the stack or the platform that is provided by the vendor. You don't need to go outside of that vendor's platform to deliver any kind of storage services. Whether it is persistent storage in the form of disks, data resiliency in the form of RAID, data management, snapshots, clones and/or any data mobility functions, any and all functions that you need to get the most out of your storage platform are delivered from within the storage system itself.
In the case of OpenStack Block Storage, it has really been designed in a modular fashion to not just deliver storage using internal resources from a Linux server, but also to integrate external arrays from wherever they are and [to] deliver storage, in a federated fashion, to OpenStack Compute instances only. Today, the only instances that can access OpenStack Block Storage are the OpenStack Compute instances, meaning you need to have OpenStack Compute running on the server if you want to access OpenStack Cinder.
You can almost think about OpenStack Block Storage in the same manner as you would think about in-band Fibre Channel virtualization or storage virtualization products from IBM, Hitachi Data Systems, FalconStor and DataCore. All of these in-band Fibre Channel virtualization solutions function in a northbound-southbound manner. On the northbound side, they essentially act as a unified presentation layer to the server. Any compute instance that can access storage through Fibre Channel can access this virtualization layer, which is essentially an abstraction layer, and see storage in a federated fashion. On the southbound side, they pool storage from one or more persistent storage platforms and create virtualized pools of storage that are then used to provide the northbound data services. So all of the data services are built within the virtualization platform itself. However, those data services are also federated across all of the storage platforms from which they draw their persistent storage.
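The northbound-southbound idea can be illustrated with a small sketch. This is only a toy model under stated assumptions, not Cinder code or any vendor's implementation: the `Array` and `AbstractionLayer` classes, the first-fit placement rule and the array names are all hypothetical. It shows a layer that pools capacity from several arrays on the southbound side while the consumer on the northbound side only deals with volume IDs and a single pooled capacity figure.

```python
from dataclasses import dataclass, field


@dataclass
class Array:
    """A southbound storage platform (hypothetical model)."""
    name: str
    capacity_gb: int
    used_gb: int = 0


@dataclass
class AbstractionLayer:
    """Pools several arrays and presents them northbound as one resource."""
    arrays: list
    _placement: dict = field(default_factory=dict)

    def pooled_free_gb(self):
        # Northbound view: one number for the whole pool.
        return sum(a.capacity_gb - a.used_gb for a in self.arrays)

    def create_volume(self, volume_id, size_gb):
        # Southbound: first-fit placement, an assumption made for brevity.
        for array in self.arrays:
            if array.capacity_gb - array.used_gb >= size_gb:
                array.used_gb += size_gb
                self._placement[volume_id] = array.name
                return volume_id
        raise RuntimeError("no array has enough free capacity")

    def backing_array(self, volume_id):
        # Only the layer itself knows where a volume physically lives.
        return self._placement[volume_id]


layer = AbstractionLayer([Array("array-a", 100), Array("array-b", 500)])
layer.create_volume("vol-1", 80)
layer.create_volume("vol-2", 50)
```

The caller never names an array when it asks for `vol-2`; the layer decides where the volume goes and keeps that mapping to itself.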
Can you further compare OpenStack Block Storage with storage virtualization?
Nadkarni: I would almost think of OpenStack Block Storage as the next-generation or the new-generation approach to doing storage virtualization. It's based on an open source, community-based approach and is not tied to any particular protocol, meaning that today it can drop in external storage platforms using a variety of protocols. But, more importantly, it's driven by an API [application programming interface]-standards layer and not so much driven at the protocol layer.
In the older days, storage virtualization solutions from all of the vendors -- irrespective of who they were -- were proprietary, closed platforms, and you absolutely needed to be a part of that vendor's ecosystem to deploy the storage virtualization solution. You could not do it in an open format, whereas OpenStack really is all about doing it in an open source format.
In what ways does the storage provisioning model for OpenStack Block Storage differ from the storage provisioning model for traditional block storage?
Nadkarni: Traditional storage has been designed with a fair bit of pre-provisioning in mind, and for traditional data center-based provisioning, there is a fair bit of planning that goes into the actual provisioning process of the storage system itself. So a lot of the data-provisioning activities are planned ahead of time and are then deployed in a manner that is very structured.
On the other hand, OpenStack Block Storage is designed with cloud scale in mind, so when the request is made for the compute layer, all of the orchestration and plumbing happens behind the scenes in an automated fashion. The storage is delivered almost instantaneously by a sequence of events that happen behind the scenes throughout the OpenStack system or block storage platform itself.
Now the big difference here is that in the traditional storage system, because the platform is a unitary platform and needs to only draw upon resources that belong to itself, the provisioning structures can be pre-defined and pre-populated. In OpenStack, it is all done through a sequence of algorithms that interact with these other storage platforms to make that provisioning work.
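To make the "sequence of algorithms" more concrete, here is a minimal sketch of on-demand provisioning as a filter-then-weigh pipeline. It is an illustration under assumptions, not the actual OpenStack scheduler: the `Backend` class, the function names, the backend names and the "most free capacity wins" rule are hypothetical choices made for the example.

```python
from dataclasses import dataclass


@dataclass
class Backend:
    """A storage platform the provisioning logic can draw on (hypothetical)."""
    name: str
    free_gb: int
    protocol: str


def filter_backends(backends, size_gb, protocol=None):
    """Drop backends that cannot satisfy the request."""
    return [
        b for b in backends
        if b.free_gb >= size_gb and (protocol is None or b.protocol == protocol)
    ]


def weigh_backends(backends):
    """Order the remaining backends, most free capacity first."""
    return sorted(backends, key=lambda b: b.free_gb, reverse=True)


def provision(backends, size_gb, protocol=None):
    """Handle a request end to end: filter, weigh, allocate."""
    ranked = weigh_backends(filter_backends(backends, size_gb, protocol))
    if not ranked:
        raise RuntimeError("no valid backend for this request")
    chosen = ranked[0]
    chosen.free_gb -= size_gb
    return {"backend": chosen.name, "size_gb": size_gb, "status": "available"}


pool = [
    Backend("lvm-local", 200, "iscsi"),
    Backend("vendor-array", 800, "fc"),
    Backend("ceph-rbd", 600, "rbd"),
]
volume = provision(pool, 100, protocol="iscsi")
```

Nothing is carved out ahead of time in this model. Each request triggers the placement decision when it arrives, which is the contrast with pre-planned provisioning that Nadkarni draws.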
What are the hardware options for OpenStack Block Storage?
Nadkarni: OpenStack Block Storage starts off with the most basic type of storage, like simple instances within a Linux server. You could take internal storage within a Linux server and use that for OpenStack Block Storage. But nowadays, a lot of the ecosystem partners of OpenStack -- commercial suppliers like NetApp, Nexenta, EMC, SolidFire and Zadara -- are making their storage platforms fully compatible with OpenStack Cinder. Their platforms can also be used as persistent storage for OpenStack Block Storage.
[Also], the block storage part of Ceph, which is a unified open source-based platform, can be used for providing OpenStack Block Storage. So Ceph could be an easy alternative to the native block storage capabilities of OpenStack itself.
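One way to picture how internal Linux storage, commercial arrays and Ceph can all sit behind the same service is a common driver interface with one implementation per backend. The sketch below is a simplified illustration, not the real Cinder driver API: the `VolumeDriver` base class, the three driver classes, the `vendor-x` name and the location strings they return are all hypothetical.

```python
from abc import ABC, abstractmethod


class VolumeDriver(ABC):
    """Common interface every backend must implement (hypothetical)."""

    @abstractmethod
    def create_volume(self, name, size_gb):
        """Create a volume and return a string describing where it lives."""


class LocalLinuxDriver(VolumeDriver):
    """Stands in for internal storage within a Linux server."""

    def create_volume(self, name, size_gb):
        return f"local:{name}:{size_gb}GB"


class ExternalArrayDriver(VolumeDriver):
    """Stands in for a commercial array from an ecosystem partner."""

    def __init__(self, vendor):
        self.vendor = vendor

    def create_volume(self, name, size_gb):
        return f"{self.vendor}:{name}:{size_gb}GB"


class CephBlockDriver(VolumeDriver):
    """Stands in for the block storage part of Ceph."""

    def create_volume(self, name, size_gb):
        return f"ceph:{name}:{size_gb}GB"


# The service looks up a driver by backend name; callers use one interface.
drivers = {
    "local": LocalLinuxDriver(),
    "array": ExternalArrayDriver("vendor-x"),
    "ceph": CephBlockDriver(),
}


def create_on(backend, name, size_gb):
    return drivers[backend].create_volume(name, size_gb)


locations = [create_on(b, "vol-1", 10) for b in ("local", "array", "ceph")]
```

Swapping the persistent storage underneath then means registering a different driver, while the request made by the compute layer stays the same.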
What's the ultimate vision for OpenStack Block Storage, and how close is that vision to becoming reality?
Nadkarni: The goal for OpenStack is to provide cloud-scale storage in an open source, community fashion and in an economical fashion where there is no vendor lock-in. That's the overarching goal: to create this alternative stack of services to the proprietary closed formats that we are all used to and really usher in an era of DIY -- a do-it-yourself kind of software-defined data center construct for cloud scale, as well as the enterprise.
Can you use OpenStack Block Storage without using other OpenStack services?
Nadkarni: Not today. OpenStack Block Storage only works with OpenStack Compute instances. OpenStack Compute instances are delivered through what is known as OpenStack Compute, or Nova, and unless you have some kind of an ability to access the storage layer directly, you have to use the OpenStack mechanisms to do it.
Could you use OpenStack Compute and OpenStack Block Storage without using any of the other OpenStack services, such as networking and authentication?
Nadkarni: Sure. You don't need to use other services, but the other services are used more often than not because they provide a more complete, complementary service. For example, you would use some of their authentication mechanisms to make sure that the block storage is really used by the compute layer in the right way. You would use some of the networking capabilities to bypass some of the physical constructs and such. The services today are optional. However, more and more cloud service providers are leveraging these services to ensure that the service quality is maintained or enhanced.