As originally conceived, containers were intended to be stateless homes for microservices. The agility and flexibility of the container ecosystem, coupled with containers' small resource footprints, fit the microservices concept well. As a result, containers resonated with the DevOps movement in IT, became the hottest technology of the decade and enjoyed a rocket-assisted growth rate.
Inevitably, the "stateless" assumption came into question. It turns out that, no surprise, real applications fit the container model too. But real applications aren't typically stateless. Most apps use two forms of storage. The first is networked storage in its many forms, used for data interchange and information history. The second is transient instance storage, used as a scratchpad while the app instance runs.
Containers compared to virtual machines
Running an app in a stateless container, as opposed to a virtual machine, means that instance storage isn't a real option, which hinders recovery if the instance fails for any reason. It is possible to access local storage on a container host, though this may create security issues unless the container resides inside a virtual machine. Local access alone isn't enough, however. Virtual machine orchestration can rapidly restart an instance on another server if the current host fails, and container software needs to support that same facility if containers are to move into mainstream IT.
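A minimal sketch of how state can be decoupled from a single container instance today, using Docker's named volumes (the volume, container and image names here are illustrative; a volume created this way lives on one host unless a volume driver extends it):

```shell
# Create a named volume that outlives any individual container.
docker volume create app-data

# Run a container with the volume mounted; anything written to
# /var/lib/app is stored in the volume, not the container's own
# writable layer.
docker run -d --name app1 -v app-data:/var/lib/app nginx

# If app1 fails and is removed, a replacement container can
# reattach the same volume and pick up the data.
docker rm -f app1
docker run -d --name app2 -v app-data:/var/lib/app nginx
```

The limitation this article describes is visible here: the volume persists across container restarts on the same host, but moving it to another server requires an external volume driver or networked storage underneath.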
Effective microservice and applet architectures require data to move between containers -- or perhaps have the containerized service instantiate where the data is, which is often much quicker. Real agility and flexibility demand a vehicle for easy data portability between containers.
Current apps build storage on top of a wide variety of platforms, from object to block and from SAN to hyper-converged. For containers to supplant hypervisors completely, the container ecosystem has to cater to this wide breadth of storage options, too.
Here, there are philosophical differences among the various players. Hypervisor supporters want stateless containers, since a full storage portfolio for containers probably dooms the hypervisor. Even some container fans want to keep the purity of stateless containers, though this may just be a holdover from the early days of the ecosystem, when clear differentiation from hypervisors was essential to the concept's survival.
Most container users are very enthusiastic about the agility and ease of use that containers bring. Couple those factors with the ability to pack three to five times as many instances into a server, and DevOps supporters see a home run. Adding persistent data to container options is thus a major roadmap must-have, and the industry is falling in line with that need.
Container storage is still a messy, embryonic field of IT. Rather than converging on a single approach, or even a few, the storage and container vendors are each rolling out their own solutions. The good news that solutions are appearing is somewhat offset by the bad news that there are rather a lot of them, with differing APIs and functions. This situation, though, reflects the enthusiasm around containers and is a healthy sign for the segment.
Products and vendors
Let's look at the spectrum of offerings in the container ecosystem. Portworx PWX allows a container to mount shareable elastic block storage. StorageOS takes this further, mounting a variety of external storage protocols and types and adding features such as compression. Rancher Labs aims at local storage while supporting data migration across servers. Microsoft Windows Server offers storage sharing both at the OS kernel level and within Hyper-V instances.
The list goes on. ClusterHQ offered Flocker, an open source product that carves a data volume out of a pool of shared block storage and lets it move with a container, even across hosts. Flocker is supported by VMware and can interface with EMC and NetApp storage, among many others. But ClusterHQ folded its tent, leaving Flocker without its main cheerleader.
There is activity within the core container software, too. Kubernetes 1.6 and later support on-demand provisioning across multiple storage types, with StorageClass objects for all of the major cloud stacks, including OpenStack and vSphere as well as the Big Three public cloud service providers. Hyper-converged systems require their own secret sauce for container storage, and vendors such as Nutanix are expected to step up to the plate in the near future.
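The Kubernetes mechanism works roughly as sketched below: an administrator defines a StorageClass naming a provisioner, and a pod's PersistentVolumeClaim referencing that class triggers dynamic creation of a matching volume. This is a minimal illustration using the built-in AWS EBS provisioner of that era; the class and claim names are illustrative.

```yaml
# StorageClass: tells Kubernetes how to provision volumes on demand.
apiVersion: storage.k8s.io/v1
kind: StorageClass
metadata:
  name: fast
provisioner: kubernetes.io/aws-ebs
parameters:
  type: gp2
---
# PersistentVolumeClaim: a request for storage from the "fast" class.
# Kubernetes provisions a matching EBS volume when the claim is created,
# and the volume follows the claim even if the pod is rescheduled.
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: app-data
spec:
  accessModes:
    - ReadWriteOnce
  storageClassName: fast
  resources:
    requests:
      storage: 10Gi
```

The point of the abstraction is that the pod never names a specific disk; it names a class of storage, which is what allows the same manifest to run against OpenStack, vSphere or any of the public clouds.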
It's interesting that container insiders talk of "persistent data," not "storage." I've long felt that the industry's movement toward object storage, software-defined storage microservices and fine-grained container virtualization makes traditional views of large files obsolete. Take a database: it really consists of thousands of record-sized objects. Moving to a fine granularity in storage may be the consequence of all this technology evolution.
Add nonvolatile DIMMs (NVDIMMs) into the equation and the question gets more pressing. Within a couple of years, NVDIMMs will persist data at the word level, as opposed to the 4 KB block storage model. Framing the problem as container storage may thus be a poor model for the future, and the term persistent data may turn out to be a profound choice.