Container technologies like Docker are the new virtual machine (VM) and they are getting ready to take the enterprise data center by storm. Containerization is essentially application virtualization instead of server virtualization. Think of containers as virtual machine subsets; they may be an application or even parts of an application that tie back to a master instance of an application or virtual machine. But like their VM counterparts, container and Docker storage will require unique capabilities.
The virtualization problem
Server and desktop virtualization certainly work and are now the standard application deployment model for most data centers. Why then should these same data centers look at Docker and other container technologies? Containers overcome two fundamental problems with virtualization:
- In most cases, the virtual machine is overkill. What the data center needs is the ability to run multiple applications safely on the same physical server at the same time. Siloing these applications so that errant code in one won't cause other applications to crash is an essential requirement. Additionally, there is a need to be able to allocate host server resources like CPU, memory and storage discretely between these entities. Prior to virtualization, IT had this control by running one application per physical server. Virtualization essentially recreates the entire physical server as a VM just to run one application.
- Virtualization is driven by a hypervisor. Its responsibility is to abstract the host server's resources, allowing the hypervisor to allocate them to the various VMs. The hypervisor overhead associated with managing this abstraction is a drain on performance.
The usual solution to both of these problems is to throw more hardware at them, typically in the form of more CPU power rather than additional bare-metal servers. Again, this model works, especially since CPU power is plentiful. Containers just provide a more efficient alternative.
The container advantage
Containers are more efficient than VMs. Instead of virtualizing and abstracting an entire piece of hardware, they only abstract the application or parts of the application. This finer-grained virtualization means resources are not wasted abstracting redundant components. It also leads to lower CPU, memory and storage requirements.
Docker leverages a copy-on-write file system to create its containers. Typically, a master image is made and then containers are created from that master. Most hypervisors have a similar capability, but the image must be an entire VM. With container technology, that image can be much more precise. A container can be an offshoot of an application or even a subset of an application.
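The copy-on-write idea can be sketched in a few lines of Python. This is a toy model for illustration, not Docker's actual storage driver: a shared, read-only master image plus a thin writable layer per container, where reads fall through to the image and writes stay in the container's own layer.

```python
# Toy model of copy-on-write container layers (illustrative only).
# Files are modeled as path -> content entries in plain dicts.

class Container:
    def __init__(self, base_image):
        self.base = base_image   # shared, read-only master image
        self.layer = {}          # this container's thin writable layer

    def read(self, path):
        # Reads fall through to the master image unless the container
        # has already written its own copy of the file.
        if path in self.layer:
            return self.layer[path]
        return self.base[path]

    def write(self, path, content):
        # Writes never touch the shared image; they land in the
        # container's own layer (the "copy" in copy-on-write).
        self.layer[path] = content

# One master image, many cheap containers derived from it.
image = {"/etc/app.conf": "mode=default"}
a, b = Container(image), Container(image)
a.write("/etc/app.conf", "mode=custom")

print(a.read("/etc/app.conf"))  # container A sees its own modified copy
print(b.read("/etc/app.conf"))  # container B still sees the master image
```

Because each container only stores what it changes, hundreds of containers can share one master image while consuming very little additional storage, which is why container creation is so fast and cheap compared to cloning a full VM.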
Docker storage considerations
Many of the Docker storage considerations are similar to the concerns about virtualization. There are some key differences, though. Docker was designed to use direct-attached storage, but as the environment matures, there is a need for various containers to share information across hosts and to be able to move containers. Shared storage enables high availability, shared access and container movement. But that shared storage has to be able to handle a more variable workload demand than the typical virtual server environment; from an I/O perspective, a Docker storage environment is more similar to a virtual desktop infrastructure with its hundreds of desktops depending on linked clones.
A Docker environment can scale from a few dozen containers to hundreds, if not thousands, of containers in seconds, and it can scale back down to a few dozen just as fast. Accommodating this variable scaling will require a system that can support a mixture of flash and hard disk drives. The system will also probably need to be scale-out in design to accommodate container growth.
Today, unlike VMware and Hyper-V, Docker offers almost no native storage features. The lack of features means the software on the storage hardware will need to be robust so that enterprises have access to the capabilities they are used to. Docker environments are frequently automated with RESTful APIs. Containers are often programmatically created, executed and removed. It makes sense then that the storage system itself be fully scriptable via RESTful APIs.
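As a concrete sketch of that kind of automation, the snippet below builds a request for Docker's Engine API to create a container that mounts a named volume. The endpoint path and body fields follow the Engine API; the image name, volume name and mount point are illustrative assumptions, and actually sending the request would require a running Docker daemon (typically over its Unix socket), which is omitted here.

```python
import json

def create_container_request(image, volume, mount_point):
    """Build the HTTP method, path and JSON body for creating a container
    that mounts a named volume (Docker Engine API: POST /containers/create).
    The caller is responsible for sending it to a running daemon."""
    body = {
        "Image": image,
        # Bind the named volume into the container at the given path.
        "HostConfig": {"Binds": [f"{volume}:{mount_point}"]},
    }
    return "POST", "/containers/create", json.dumps(body)

# Illustrative values: a hypothetical web container using a shared volume.
method, path, payload = create_container_request(
    "nginx:latest", "appdata", "/usr/share/nginx/html")
print(method, path)
print(payload)
```

A storage system that exposes a comparable REST interface can be driven from the same script, so that provisioning a volume on the array and attaching it to a container become steps in one automated workflow rather than separate manual tasks.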
Since Docker and container technologies are in their infancy, there will be significant changes over the next few years. It's safe to assume that Docker will become better at managing storage, adding new protocols and providing storage services. The supporting storage system should be flexible enough to match these changes by supporting multiple protocols and by allowing specific features to be turned off if they conflict with capabilities added to Docker.
Today, Docker is used for supporting application development and testing. An increasing number of data centers, and certainly most enterprise data centers, have more development efforts underway than ever. Docker is ideal for these environments. But just as virtualization worked its way into the business by first being used for lab testing, Docker and containers may also find a seat at the enterprise table after starting as a test/development tool.