The delivery of storage as a service has advanced considerably over the years. Today, public clouds like Amazon Web Services and Microsoft Azure offer object storage on demand for internal and external connectivity, as well as block and file storage for internal assignment to compute instances. This operational flexibility would be highly attractive in the data center, offering convenience and agility well beyond traditional methods of storage deployment.
How do you go about building your own private storage cloud? To start, let's take a step back and review what cloud computing really means. The standard definition of cloud covers the following characteristics: elasticity to grow and shrink consumed resources; delivery as a service, a standard set of service offerings defined in abstract terms rather than on physical hardware; multi-tenancy to support multiple clients; on-demand access for requesting resources with little or no manual intervention; and reporting and billing, with detailed reports for charging based on consumption over time.
A private storage cloud should reflect these same features. Business users, the customer in this case, should be able to request storage capacity without having to worry about how it's delivered. So service catalogs, which have been in use for many years and may be focused on physical technology (like HDD speed or HDD/flash), need refreshing to become more focused on service metrics. This means using terms like I/O density (IOPS per terabyte of storage), latency, throughput, data availability and resiliency.
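To make the idea of a service-metric catalog concrete, here is a minimal sketch in Python. The tier names and numbers are hypothetical and purely illustrative; the only metrics taken from the discussion above are I/O density (IOPS per terabyte) and latency.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class ServiceTier:
    """A catalog entry described by service metrics, not by hardware."""
    name: str
    io_density: int        # IOPS delivered per TB of provisioned capacity
    max_latency_ms: float  # latency target for the tier


# Hypothetical tiers, ordered from cheapest to most expensive.
CATALOG = [
    ServiceTier("bronze", io_density=100, max_latency_ms=20.0),
    ServiceTier("silver", io_density=1000, max_latency_ms=5.0),
    ServiceTier("gold", io_density=5000, max_latency_ms=1.0),
]


def required_io_density(iops: float, capacity_tb: float) -> float:
    """I/O density is simply IOPS divided by capacity in terabytes."""
    if capacity_tb <= 0:
        raise ValueError("capacity_tb must be positive")
    return iops / capacity_tb


def pick_tier(iops: float, capacity_tb: float,
              latency_ms: float) -> Optional[ServiceTier]:
    """Return the first (cheapest) tier that meets both requirements."""
    density = required_io_density(iops, capacity_tb)
    for tier in CATALOG:
        if tier.io_density >= density and tier.max_latency_ms <= latency_ms:
            return tier
    return None  # nothing in the catalog satisfies the request
```

An application needing 4,000 IOPS on 8 TB has an I/O density of 500 IOPS per terabyte, so with a 10 ms latency requirement it would land on the hypothetical "silver" tier without the requester ever naming a drive type.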
Multi-tenancy refers to both security and performance isolation. Security ensures data isn't visible between private storage cloud users, while performance features like quality of service (QoS) make certain each user receives a consistent service level, irrespective of system load. On-demand access means customer requests for resources are fulfilled with minimal intervention from IT -- storage administrators, in particular. Reporting has to cope with a more granular measurement of storage utilization, including the capability to report on individual teams or business areas.
The first item on our private storage cloud to-do list, elasticity, comes in two scenarios: first, from the customer's ability to expand and contract usage on demand and, second, for system administrators to be able to deploy more infrastructure as demand warrants. While some worry deployed hardware won't be used if end users can easily give storage back, this rarely happens.
The challenge here is to continue meeting demand by deploying new hardware into the data center while managing the technology refresh cycle without impacting application availability. Achieving just-in-time deployment of new hardware is both an art and a science for most IT departments, most of which -- unlike Amazon and Microsoft -- have limited amounts of cash and manpower at their disposal. Unable to provide infinite resources, they must compromise by predicting when demand justifies new hardware acquisition.
This is where the art comes in. Demand forecasting requires engagement with lines of business to plan potential future projects and their storage needs. If IT can gain insight into likely future storage resource requirements, these demands can be more easily planned for -- especially if they are in noncore products like object or high-performance storage.
The science comes in gleaning adequate information on storage growth. Many IT environments use thin provisioning, which means physical storage consumption will increase over time as data is written to allocated space. And because reservations for planned consumption of storage are rarely fully utilized immediately (for instance, a 1 TB request may only use, say, 50 GB on day one and be sized for three-year growth), file systems and object stores will naturally increase in utilization as applications write more data. This makes it essential to have accurate and detailed tools to measure storage consumption over time, preferably daily, while using that data to generate meaningful growth projections.
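As a rough illustration of turning daily measurements into a growth projection, the sketch below fits a straight line to daily "used capacity" samples and estimates how many days remain before physical capacity runs out. The linear model and the function names are assumptions made for the example; real growth is often lumpier, and a production tool would want more history and a better model.

```python
from typing import Optional, Sequence, Tuple


def fit_daily_growth(used_gb: Sequence[float]) -> Tuple[float, float]:
    """Least-squares line through daily samples.

    Sample i is treated as day i. Returns (slope in GB/day, intercept in GB).
    """
    n = len(used_gb)
    if n < 2:
        raise ValueError("need at least two daily samples")
    mean_x = (n - 1) / 2
    mean_y = sum(used_gb) / n
    sxx = sum((x - mean_x) ** 2 for x in range(n))
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in enumerate(used_gb))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


def days_until_full(used_gb: Sequence[float],
                    physical_capacity_gb: float) -> Optional[float]:
    """Days from the latest sample until the fitted line hits capacity.

    Returns None when consumption is flat or shrinking.
    """
    slope, intercept = fit_daily_growth(used_gb)
    if slope <= 0:
        return None
    latest_day = len(used_gb) - 1
    full_day = (physical_capacity_gb - intercept) / slope
    return max(0.0, full_day - latest_day)
```

With samples of 50, 52, 54, 56 and 58 GB on five consecutive days, the fitted growth is 2 GB per day, so a 100 GB pool would be projected to fill 21 days after the last measurement.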
In addition, determining when to deploy new hardware requires understanding and managing vendor lead, hardware deployment and configuration times. In the enterprise, IT still owns these issues, which are, of course, under the purview of the cloud service provider (CSP) when you purchase public cloud storage.
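The timing question can be sketched as simple date arithmetic: given a projected number of days until capacity is exhausted, subtract the vendor lead, deployment and configuration times to find the last safe day to place an order. The function and parameter names are hypothetical, and the safety margin is an added assumption rather than something the process requires.

```python
from datetime import date, timedelta


def latest_order_date(today: date,
                      days_until_full: float,
                      vendor_lead_days: int,
                      deploy_days: int,
                      config_days: int,
                      safety_days: int = 0) -> date:
    """Last date an order can be placed and still land before capacity runs out."""
    pipeline_days = vendor_lead_days + deploy_days + config_days + safety_days
    # Round the projection down so the estimate errs on the early side.
    slack_days = int(days_until_full) - pipeline_days
    return today + timedelta(days=slack_days)


def must_order_now(today: date, order_by: date) -> bool:
    """True when the ordering deadline has arrived or already passed."""
    return order_by <= today
```

If capacity is projected to run out in 90 days and the combined lead, deployment, configuration and safety times add up to 50 days, the order has to go in within 40 days. A result earlier than today means the window has already closed.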
Choosing a platform
Having the right storage platform in place is key to deploying new hardware efficiently. Choosing scale-out over scale-up technology can make new deployments relatively simple, because you add hardware to the existing configuration to increase capacity.
Most modern object and block scale-out products perform some level of rebalancing, redistributing data to make use of new capacity and gain the most performance from the hardware. Monolithic scale-up architectures can be harder to manage because of scalability limits, while older legacy storage systems may not naturally load balance to make use of new physical capacity. This means legacy architectures have to be more carefully designed to load balance the distribution of logical resources over physical hardware. Many of these platforms come with tools to move LUNs around within the storage platform, mitigating some of the balancing problems.
Multi-tenancy and QoS become key features to consider when choosing a storage platform for a private storage cloud. Looking at service metrics offered by CSPs, we see that performance is rated on IOPS and throughput, with some mention of I/O latency. These service levels are deemed available to users whether the CSP is running fully loaded or not, not traditionally true of legacy storage. QoS, therefore, becomes really important, either as a tool to ensure end users get the performance they require or to limit performance to the level they have paid for with new systems.
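The "limit performance to the level paid for" side of QoS is commonly implemented with some form of rate limiting. The sketch below uses a token bucket, which is one well-known technique and not necessarily what any particular storage platform does internally. The class name and parameters are hypothetical, and the caller supplies the clock value so the behavior is easy to reason about.

```python
class IopsLimiter:
    """Token bucket: each I/O spends one token; tokens refill at the paid-for rate."""

    def __init__(self, iops_limit: float, burst: float) -> None:
        self.rate = float(iops_limit)   # tokens added per second
        self.capacity = float(burst)    # maximum tokens that can accumulate
        self.tokens = float(burst)      # start with a full bucket
        self.last = 0.0                 # time of the previous call, in seconds

    def allow(self, now: float) -> bool:
        """Return True if an I/O arriving at time `now` (seconds) may proceed."""
        elapsed = max(0.0, now - self.last)
        self.tokens = min(self.capacity, self.tokens + elapsed * self.rate)
        self.last = now
        if self.tokens >= 1.0:
            self.tokens -= 1.0
            return True
        return False  # over the limit: the caller would queue or delay the I/O
```

A tenant with `IopsLimiter(iops_limit=100, burst=10)` can issue a short burst of 10 I/Os at once, after which requests are admitted at roughly 100 per second regardless of how busy or idle the rest of the system is.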
In recent years, the storage appliance world has seen a minor management evolution. You traditionally managed storage manually through GUIs and some command-line interface (CLI) interaction, with "commit" phases to make changes. CLIs enabled the storage administrator to script the provisioning and decommissioning process, allowing a degree of automation. However, creating the scripts was a time-consuming process. Over the years, vendors moved to implement APIs that make storage programmable -- to set configurations via authorized API calls. Configuration data can also be easily extracted now, with some storage platforms producing very detailed metrics.
Application programming interfaces have changed the way enterprise storage is managed. In the future, APIs will drive automation and remove most manual intervention from storage provisioning, making private cloud storage more practical to a wider number of enterprises.
APIs also take the "human" out of the process of provisioning storage. Now, storage can be mapped to hosts through one or two API calls. Some platforms implement APIs natively, whereas some have built API wrappers around existing CLI tools. The crucial requirement here is to ensure APIs, CLIs and GUIs operate harmoniously rather than stepping on one another.
The final piece of the puzzle to deliver private cloud storage is executing some kind of workflow process. User requests have to be serviced in a way that allows a request to be validated and then implemented. Public clouds achieve this validation process through users providing a credit card or other payment method for billing. After that, services are configured through a web portal or API. In the enterprise, the traditional process for requesting storage has been to have internal processes that manage requests manually, provisioning storage to hosts based on details in service tickets. The owner of the ticket takes responsibility to ensure the line of business is allowed to "purchase" the storage being ordered and then takes care of fulfillment.
Pay as you go
You can purchase public cloud resources with a credit card, billed in arrears. This change in workflow means many organizations will need to look at implementing billing and chargeback when deploying internal cloud storage.
In a private cloud, the aim is to make this process as automated as possible. Tools such as EMC's ViPR, for instance, allow you to build workflow processes around storage automation. Hitachi Data Systems offers Hitachi Automation Director to build workflows around storage and other resource provisioning.
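The validate-then-implement pattern can be sketched independently of any product. The example below is not how ViPR or Hitachi Automation Director work; it is a hypothetical, minimal workflow in which a per-business-unit quota stands in for the approval a ticket owner would otherwise give, and a caller-supplied `provision` function stands in for the storage platform's API calls.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List, Tuple


@dataclass(frozen=True)
class StorageRequest:
    business_unit: str
    host: str
    size_gb: int


@dataclass
class ProvisioningWorkflow:
    quotas_gb: Dict[str, int]  # approved capacity per business unit
    allocated_gb: Dict[str, int] = field(default_factory=dict)
    audit_log: List[Tuple[StorageRequest, bool, str]] = field(default_factory=list)

    def validate(self, req: StorageRequest) -> Tuple[bool, str]:
        """Check the request against the business unit's remaining quota."""
        if req.size_gb <= 0:
            return False, "size must be positive"
        quota = self.quotas_gb.get(req.business_unit)
        if quota is None:
            return False, "unknown business unit"
        used = self.allocated_gb.get(req.business_unit, 0)
        if used + req.size_gb > quota:
            return False, "quota exceeded"
        return True, "ok"

    def submit(self, req: StorageRequest,
               provision: Callable[[StorageRequest], None]) -> Tuple[bool, str]:
        """Validate, then fulfil the request and record the outcome."""
        ok, reason = self.validate(req)
        if ok:
            provision(req)  # placeholder for the create-volume and map-to-host calls
            used = self.allocated_gb.get(req.business_unit, 0)
            self.allocated_gb[req.business_unit] = used + req.size_gb
        self.audit_log.append((req, ok, reason))
        return ok, reason
```

Every request, accepted or rejected, ends up in the audit log, which is the kind of record a chargeback or reporting process could later draw on.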
Many organizations will need to consider the changes in billing that a private storage cloud can introduce. If billing and chargeback are not implemented, then there is no issue because the IT department will continue to take the hit on the cost of delivering the service -- and may continue to charge per project. However, if new resources have to be paid for, then some changes to financial practices -- which may currently include IT paying directly for hardware -- may need to be introduced to allow service-based billing of business units to cover costs.
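One simple way to picture service-based billing is to charge on consumption over time rather than on hardware purchased. The sketch below sums daily usage samples into GB-days per business unit and applies a flat monthly rate. The rate, the 30-day month and the function name are assumptions for illustration only.

```python
from collections import defaultdict
from typing import Dict, Iterable, Tuple


def monthly_charges(daily_usage: Iterable[Tuple[str, float]],
                    rate_per_gb_month: float,
                    days_in_month: int = 30) -> Dict[str, float]:
    """Charge each business unit for the GB-days it actually consumed.

    daily_usage holds one (business_unit, used_gb) sample per unit per day.
    """
    gb_days: Dict[str, float] = defaultdict(float)
    for unit, used_gb in daily_usage:
        gb_days[unit] += used_gb
    daily_rate = rate_per_gb_month / days_in_month
    return {unit: round(total * daily_rate, 2) for unit, total in gb_days.items()}
```

A team holding 100 GB for all 30 days at a hypothetical rate of 0.30 per GB per month would be charged 30.00, while a team that used 50 GB for only 15 days would be charged 7.50, reflecting that it gave the capacity back.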
Looking wider afield than the storage team, you can build storage automation into private cloud frameworks such as OpenStack to take the effort out of provisioning. Initial OpenStack deployments had no persistent storage capability, so a number of projects were instantiated to manage the integration of external storage arrays. The resulting Cinder project handles block storage and automates the mapping of LUNs to OpenStack instances, while Manila provides the integration of file system data and Swift offers an API for object storage.
The wider stack
Cloud storage, be it internal or public, forms part of a wider infrastructure stack. This means integrating with platforms such as OpenStack or vCloud Director.
Storage vendors, meanwhile, can write plug-ins that enable the OpenStack framework to provision and map storage LUNs on demand. And many hardware and software companies already support all of the OpenStack storage APIs. A Cinder support matrix lists supported vendor features in each release of the OpenStack platform.
Public cloud integration
Moving forward, the world won't exclusively be public or private, but a hybrid of the two. As a result, there will be requirements to move data and applications between private and public infrastructure, with the public cloud offering additional data protection (backup) and increased availability. You can also use public cloud storage for workload bursting and archiving.
Products for moving applications and data between on-premises and public cloud locations are starting to come to market. Object storage vendors such as Cloudian (HyperStore) and Hitachi Data Systems (Hitachi Content Platform) provide the ability to archive on-premises data into the cloud, while crucially retaining the ability to search across all content as if it were in a single view.
For data protection, Druva and Zerto both offer products that allow you to back up and restore on-premises virtual machines (VMs) in the public cloud. The conversion of the VM image, and injection of additional drivers, is handled in software as part of the backup and migration process.
In platforms running server virtualization, storage is infrequently mapped to physical hosts. Most work to create virtual machine-instance storage is handled by the hypervisor management software. VMware provides automation through vRealize Automation and vCloud Director, while Microsoft offers System Center 2016.
Velostrata goes one step further by allowing the booting of VMs in the public cloud for cloud bursting. This could be used to run an application on a VM with greater resource capacity than is available on site or to move workloads to the public cloud to cope with increased demand. Once the surge in demand has passed, the VMs can be returned on site.
In the meantime, virtualization vendors are starting to partner with cloud vendors to facilitate the migration of applications into the public cloud. For example, VMware recently announced VMware Cloud on Amazon Web Services as well as a partnership with IBM. It also introduced Cross-Cloud Architecture as a way to manage multiple cloud deployments. Microsoft Azure Stack (in technical preview as of this writing) enables the same functionality of Azure in the cloud to be run in a private data center and to link to the public Azure.
It's clear that on-site and public cloud implementations differ today, mainly in the degree of automation in place to fully exploit private cloud storage. Workflow, as one of the most critical pieces, is -- perhaps -- still immature and needs further work on the private cloud front.
Part of the challenge here is changing the behavior of internal business teams. The public cloud has helped to promote on-demand, service-based consumption, and the same approach should be adopted as a delivery model for internal resources, too.
This article is part of a series that breaks down the different technologies that underpin cloud-based infrastructure.