Explore storage orchestration services, benefits, challenges

Orchestration and automation can aid storage management. Here, dive into technical details of orchestration, including containers and Kubernetes, and how it can help.

Rather than musical analogies, which are hard to resist when writing about orchestration, I want to start with a quote from Mary Shelley's magnum opus, Frankenstein, the story of the animation of a collection of lifeless human parts. I suspect she summed up the feelings of most users of orchestration tools when she wrote:

It was on a dreary night of November, that I beheld the accomplishment of my toils. With an anxiety that almost amounted to agony, I collected the instruments of life around me, that I might infuse a spark of being into the lifeless thing that lay at my feet.

Automation -- the route to orchestration

We're all familiar with automation -- the use of technology and machines to undertake repetitive, dangerous or humanly impossible tasks. Automation has reduced human errors -- and deaths -- and increased productivity dramatically. Much of the Industrial Revolution in the 18th century was built on automation, from the steam power that replaced muscle and brawn to the spinning jennies of textile manufacturing that replaced hand spinning.

Today's "software robots" perform similar roles where human involvement is minimal or nonexistent, from process monitoring to automated accounting systems. Their ubiquity is a testament to their efficacy. For example, they can start a web server, automatically order printer ink when levels are low or copy an incoming email to a specific folder.

Orchestration is related to automation, but it takes sets of already automated IT processes and workflows and brings them together into a coordinated whole. Working at a level of composition above the individual task or component, orchestration builds larger, more complex systems out of these smaller parts, often across multiple systems. This not only enables the building of complex workflows and systems, but also permits their composition, deployment and management at scale. Such things are often impossible to accomplish manually.

For example, orchestration can help manage multiple IT tasks. In the case of our printer, orchestration not only orders the ink cartridges, but also knows when they arrive and gets a human to install them. Apart from automating incident workflows, orchestration can provision servers, scale web services up and down based on incoming workloads, undertake database provisioning and management, and more.
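
The printer example can be sketched as a tiny workflow. This is only an illustration of the idea, not the API of any real orchestration tool: the `Printer` class, the task functions and the order id are all hypothetical names invented for the sketch. Each small function stands for one automated task, and `orchestrate_printer` is the layer that coordinates them.

```python
from dataclasses import dataclass, field


@dataclass
class Printer:
    """Hypothetical printer state, used only for this illustration."""
    ink_level: int                              # percent of ink remaining
    log: list = field(default_factory=list)     # record of what was done


# Each function below stands for one small, already automated task.
def ink_is_low(printer: Printer, threshold: int = 10) -> bool:
    return printer.ink_level < threshold


def order_cartridge(printer: Printer) -> str:
    printer.log.append("ordered cartridge")
    return "order-001"                          # made-up order id


def await_delivery(printer: Printer, order_id: str) -> None:
    printer.log.append(f"{order_id} delivered")


def notify_human(printer: Printer) -> None:
    printer.log.append("asked a human to install the cartridge")


def orchestrate_printer(printer: Printer) -> list:
    """Orchestration: coordinate the automated tasks into one workflow."""
    if ink_is_low(printer):
        order_id = order_cartridge(printer)
        await_delivery(printer, order_id)
        notify_human(printer)
    return printer.log
```

Calling `orchestrate_printer(Printer(ink_level=5))` runs all three tasks in order, while a printer with plenty of ink produces an empty log.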

Figure: With orchestration, automated parts can be managed at vast scales.

Orchestration tools

Many orchestration tools are open source, and there are plenty of them. Examples, though this is not an exhaustive list, include Kubernetes, OpenShift and Docker Swarm. Others are fully commercial offerings, such as those available from AWS or Microsoft Azure.

Most employ declarative programming concepts. In contrast to imperative programming, which describes how to perform a task step by step, a declarative program describes what the result should be and leaves the how to the tool. To use a musical analogy, sheet music is declarative: it describes the individual instruments and the notes they are to play, not how each musician should produce them.
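
The difference can be shown with a minimal Python sketch. It is a toy under stated assumptions, not how any particular tool is implemented: the service names and the `reconcile` function are hypothetical. The imperative function hard-codes the steps, while the declarative one is handed a desired state and works out the steps itself.

```python
def imperative_scale_up(running: dict) -> dict:
    """Imperative: the caller spells out each step (here, start two more web servers)."""
    state = dict(running)
    state["web"] = state.get("web", 0) + 1      # start one server
    state["web"] = state.get("web", 0) + 1      # start another
    return state


def reconcile(desired: dict, running: dict) -> list:
    """Declarative: compare desired and actual state, then derive the steps."""
    actions = []
    for name in sorted(set(desired) | set(running)):
        diff = desired.get(name, 0) - running.get(name, 0)
        if diff > 0:
            actions.extend([f"start {name}"] * diff)
        elif diff < 0:
            actions.extend([f"stop {name}"] * -diff)
    return actions
```

The imperative version only gives the intended result if the starting point is what its author expected. The declarative version produces a different list of actions for a different starting point, which is what makes it suitable for describing infrastructure in files.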

This is the key to a high degree of orchestration, and it underpins what is called infrastructure as code (IaC). Here, it's possible to manage the physical infrastructure, and the virtual infrastructure built upon it, with definition files or mainly declarative programs. We don't need to physically configure the hardware or use interactive tools to achieve our goals. A piece of software can do it for us.

Kubernetes and containers

Modern applications are increasingly built using the concept of microservices. These services are described as "fine-grained." They generally provide one small function that many applications can reuse -- for instance, the automated opening of a problem ticket. The protocols for communicating with these microservices are lightweight, such as the ubiquitous HTTP. Containers package these small reusable parts, each with its dependencies and configuration, so that an application becomes a set of containers connected by efficient network protocols.

Kubernetes deploys and manages those containers at scale and has become the dominant technology in this space. It has a thriving community of DevOps practitioners and end users, so I'll use it as the poster child for storage orchestration.

Applications often grow to span multiple containers deployed across many servers, and that increases the complexity of their management. Kubernetes addresses this by assigning compute, storage and other resources to containers and scheduling where and when those containers will run. Containers are grouped into pods, the basic operational unit for Kubernetes, and those pods are then scaled to meet the computational demand.
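
To make the scheduling idea concrete, here is a deliberately simplified sketch in Python. It is an assumption-laden toy, not the Kubernetes scheduler, which filters and scores nodes on many more criteria. The `schedule` function and the pod and node names are hypothetical; it places each pod on the first node that still has enough free CPU.

```python
def schedule(pods: dict, nodes: dict) -> dict:
    """Assign each pod to the first node with enough free CPU (first fit).

    pods:  pod name -> CPU request in millicores
    nodes: node name -> free CPU in millicores
    Returns pod name -> node name, or None if the pod cannot be placed.
    """
    free = dict(nodes)              # copy so the caller's data is untouched
    placement = {}
    for pod, cpu in pods.items():
        placement[pod] = None       # stays None if no node has room
        for node, available in free.items():
            if available >= cpu:
                free[node] = available - cpu
                placement[pod] = node
                break
    return placement
```

With two nodes of 1,000 millicores each, two 500-millicore web pods fill the first node and a 1,000-millicore database pod lands on the second. A pod that fits nowhere is left unplaced rather than squeezed in.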

Kubernetes also manages service discovery (where can I find the service I need?), handles load balancing (how should requests be spread across the containers?), tracks resource allocation (do I have enough resources to meet the requirement?), and scales up and down based on utilization (if the demand increases or decreases, should I run more or fewer?). It also monitors the state and health of resources. When there are problems, it gives applications self-healing capabilities by automatically restarting or replacing containers.
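
Scaling on utilization and self-healing can both be sketched in a few lines of Python. The replica calculation follows the ratio documented for the Kubernetes Horizontal Pod Autoscaler (current replicas multiplied by current metric over target metric, rounded up), but the real autoscaler adds tolerances and stabilization behavior that are left out here. The function names, the replica limits and the status strings are assumptions made for the illustration.

```python
import math


def desired_replicas(current: int, current_metric: float, target_metric: float,
                     min_replicas: int = 1, max_replicas: int = 10) -> int:
    """Scale in proportion to how far the metric is from its target."""
    raw = math.ceil(current * current_metric / target_metric)
    return max(min_replicas, min(max_replicas, raw))   # clamp to the limits


def heal(containers: dict) -> list:
    """Self-healing: restart every container that is not running.

    containers: container name -> status string (hypothetical values)
    Returns the names that were restarted; the dict is updated in place.
    """
    restarted = [name for name, status in containers.items()
                 if status != "running"]
    for name in restarted:
        containers[name] = "running"
    return restarted
```

For instance, four replicas running at 90% utilization against a 60% target scale to six, and the same four replicas at 30% scale down to two.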

Containerized storage

Storage has always been difficult to manage because it has state, or persistence. A storage system guarantees that, given a stream of data, it will return the same stream of data in a performant way at some indeterminate point in the future. That's a promise that is sometimes hard to keep.

Storage in Kubernetes is managed through the Container Storage Interface (CSI), a standard for connecting block and file storage systems to containerized workloads in a device-independent way. Essentially, CSI permits the declaration and use of storage interfaces that are available to the container. It delivers on the promise of IaC.
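
As an illustration of what declaring storage looks like, the sketch below builds a Kubernetes PersistentVolumeClaim manifest as a Python dictionary and serializes it to JSON, a format Kubernetes accepts alongside YAML. The manifest fields are standard ones, but the claim name `booking-data`, the size and the storage class name `example-csi-class` are hypothetical. In a real cluster, the storage class would have to exist and be backed by an installed CSI driver.

```python
import json


def persistent_volume_claim(name: str, size: str, storage_class: str) -> dict:
    """Build a declarative request for storage; no device details appear."""
    return {
        "apiVersion": "v1",
        "kind": "PersistentVolumeClaim",
        "metadata": {"name": name},
        "spec": {
            "accessModes": ["ReadWriteOnce"],       # mounted read-write by one node
            "storageClassName": storage_class,      # hypothetical class name
            "resources": {"requests": {"storage": size}},
        },
    }


claim = persistent_volume_claim("booking-data", "10Gi", "example-csi-class")
manifest = json.dumps(claim, indent=2)              # text that could be saved to a file
```

The claim states what is wanted (10 GiB from a named class) and says nothing about which array, disk or protocol supplies it. That device independence is the point of the interface.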

Originally, Kubernetes only allowed ephemeral storage. It was only possible to save and retrieve a stream of data during a single execution of the container. Any restart got you a blank slate, and everything in the last execution of this container's storage was forgotten.

Many applications don't work like that. In fact, it's hard to write useful applications that never need to remember data from one run to the next. Imagine, for instance, an airline booking system that couldn't remember how many passenger tickets it had sold or who had booked seat 3A.

It's taken a little time, but it's now possible to use a persistent storage subsystem through a suitable CSI driver, and admins can also share that storage between containers. This development improved the ability to support stateful applications. However, it brings with it the issues associated with traditional use and management of storage, such as backup and restore, replication, and checkpointing for disaster recovery and consistency.
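
The contrast between ephemeral and persistent storage can be shown with a toy model in Python. Nothing here is a Kubernetes API: the `Container` class and its methods are hypothetical, and a plain dictionary stands in for a volume. Scratch space is wiped on restart, while a volume that is passed in survives and can be shared.

```python
from typing import Optional


class Container:
    """Toy container: scratch space is wiped on restart, a mounted volume is not."""

    def __init__(self, volume: Optional[dict] = None):
        self.scratch = {}           # ephemeral storage, private to this container
        self.volume = volume        # optional persistent storage, may be shared

    def _store(self) -> dict:
        # Use the persistent volume when one is mounted, else scratch space.
        return self.volume if self.volume is not None else self.scratch

    def book(self, seat: str, passenger: str) -> None:
        self._store()[seat] = passenger

    def who_has(self, seat: str) -> Optional[str]:
        return self._store().get(seat)

    def restart(self) -> None:
        self.scratch = {}           # a restart gives a blank slate


# Ephemeral: the booking for seat 3A is forgotten after a restart.
ephemeral = Container()
ephemeral.book("3A", "Ada")
ephemeral.restart()
lost = ephemeral.who_has("3A")

# Persistent and shared: two containers see the same booking.
shared_volume = {}
first = Container(shared_volume)
second = Container(shared_volume)
first.book("3A", "Ada")
first.restart()
kept = second.who_has("3A")
```

The toy also hints at the new problems. Once the shared dictionary matters, it needs the backup, replication and consistency guarantees that scratch space never did.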

Does orchestration help?

Often, scale is the enemy of efficiency. At scale, management becomes hard, and the first approach is to at least automate those parts that we can. Automation of small tasks or processes has enabled their efficient management. Orchestration takes that one step further -- automated parts can be coordinated and managed at scales beyond human ability.

IaC has revolutionized the way we think about physical resources. Software-defined storage enables us to provision, deploy, monitor and manage storage without reference to the hardware. And so, CSI makes storage as easy to manage as other software-defined resources of the runtime environment.

Where orchestration doesn't help is when we've automated inefficiencies; it won't fix them or make them better. Effectiveness, or doing the right things well, is not the same as efficiency. As one anonymous wit has said: "Automation may be a good thing, but don't forget that it began with Frankenstein." Orchestration still takes human skill and expertise as we infuse the "spark of being" into the lifeless parts.

About the author
Alex McDonald is an independent consultant and chairman of the Storage Networking Industry Association (SNIA) Cloud Storage Technologies Initiative, committed to the adoption, growth and standardization of storage in cloud infrastructures and its data services, orchestration and management, and the promotion of portability of data in multi-cloud environments.

SNIA is a nonprofit organization made up of member companies spanning IT. A globally recognized and trusted authority, SNIA's mission is to lead the storage industry in developing and promoting vendor-neutral architectures, standards and educational services that facilitate the efficient management, movement and security of information.