Server virtualization is the de facto standard way to deploy applications today, and part of the process of implementing an efficient virtualized environment is to ensure the data storage layer delivers the required availability and performance. Accordingly, hypervisor vendors continually evolve their products to improve storage interaction and management with external appliances or arrays and, more recently, internal storage offerings.
For example, key attributes are starting to be applied at the service and policy level based on individual virtual machines. These allow you to create profiles for each VM or app that describe what resources they should get and what priority they have.
Let's dive further into the latest hypervisor storage and data management functionality, examine how internal storage is being groomed as an alternative to traditional shared external storage products, and look at containers, the newer form of application virtualization, to see how vendors address storage in that ecosystem.
When server virtualization first emerged, internal storage meant direct-attached storage dedicated to each hypervisor host. This provided little scalability or resiliency (unless deployed with an internal RAID controller) and was one of the reasons vendors moved to support external storage.
Over the last few years, however, major hypervisor vendors have seized the opportunity to move storage back into the host, driven by the improved reliability of storage media and the need to reduce latency by eliminating the storage network.
VMware Virtual SAN (vSAN) is a scale-out distributed storage layer that provides storage capacity, performance and redundancy by combining multiple hard drives and flash storage in each physical vSphere/ESXi host. VMware targets vSAN, which was first released in 2014, for virtual desktop infrastructure, tier 2/tier 3 applications and disaster recovery environments. And, because vSAN is integrated into the ESXi kernel, updates follow ESXi releases, either as major platform releases or planned updates.
Other hypervisors and storage
You don't see the same level of innovation in how storage is handled in other leading hypervisors -- such as Citrix XenServer, KVM and Oracle VM -- as with VMware and Microsoft. XenServer does have the ability to perform live virtual machine migrations (as does Oracle VM, which is based on Xen), while KVM also offers live migration, but not storage migration capabilities. And since neither Xen- nor KVM-based systems have focused on the VM as the object on which to apply policies, they also trail VMware and Microsoft in hypervisor storage support and management.
The vSAN feature list has evolved to include scalable snapshots, a file system based on VMware's Virsto acquisition, support for hybrid and all-flash configurations, stretched clusters, the vSAN Witness appliance and nonvolatile memory express (NVMe). The most recent 2016 release brought deduplication and compression for all-flash configurations, RAID-5/6 erasure coding, another new on-disk format (version 3.0) that changes the layout of data on disk, software checksums, IOPS limits and health/performance monitoring.
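To put the appeal of RAID-5/6 erasure coding in numbers, here is a rough capacity sketch comparing it with mirroring. It assumes the 3+1 (RAID-5) and 4+2 (RAID-6) stripe layouts commonly described for all-flash vSAN; the function names are illustrative and are not part of any VMware API.

```python
from fractions import Fraction


def mirror_overhead(failures_to_tolerate: int) -> Fraction:
    """Raw-to-usable ratio when each object is kept as FTT + 1 full copies."""
    return Fraction(failures_to_tolerate + 1)


def erasure_overhead(data_fragments: int, parity_fragments: int) -> Fraction:
    """Raw-to-usable ratio for a stripe of data plus parity fragments."""
    return Fraction(data_fragments + parity_fragments, data_fragments)


def raw_needed_gb(usable_gb: int, overhead: Fraction) -> float:
    """Raw capacity consumed to store usable_gb of VM data."""
    return float(usable_gb * overhead)


# Assumed layouts: RAID-5 as 3 data + 1 parity, RAID-6 as 4 data + 2 parity.
RAID5 = erasure_overhead(3, 1)
RAID6 = erasure_overhead(4, 2)
```

Under these assumptions, tolerating one failure costs twice the usable capacity with mirroring but only about 1.33 times with RAID-5, and tolerating two failures costs three times with mirroring but 1.5 times with RAID-6.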
Now, vSAN is a core part of VMware's vSphere platform -- and future revenue, as VMware licenses vSAN separately -- especially for hyper-converged deployments.
Microsoft Hyper-V exploits many of the storage enhancements developed for Windows Server. For internal storage, this includes Storage Spaces and the Resilient File System (ReFS), Microsoft's newer alternative to NTFS.
With Storage Spaces, you can combine local physical disks into resilient storage pools for VMs -- as a single large pool, for example, or grouped as pools of capacity or performance storage.
Microsoft rolled out many enhancements to Storage Spaces in Windows Server 2012 R2, including the following:
- Storage tiers: for using both HDD and SSD drives in an automated tiering model that dynamically moves data between media types, depending on frequency of access.
- Write-back caching: uses SSDs to accelerate write requests, acknowledging them to the application once they have been written to flash. This feature is targeted at small random writes.
- Automated storage rebuilds: rebuilds storage pools from pool free space rather than using hot spares.
- Dual parity support: tolerates double-disk failures by storing two sets of parity information spread across separate physical disk devices (more redundancy means more tolerance to hardware failure).
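The storage tiers idea in the list above can be sketched in a few lines. This is a simplified illustration, not how Storage Spaces is implemented: the extent names, the access counters and the `plan_tiers` function are all hypothetical, and a real tiering engine works on its own unit of data and schedule.

```python
from typing import Dict


def plan_tiers(access_counts: Dict[str, int], ssd_slots: int) -> Dict[str, str]:
    """Place the most frequently accessed extents on SSD, the rest on HDD.

    access_counts maps an extent name to how often it was read or written
    in the last measurement window; ssd_slots is how many extents fit on
    the SSD tier.
    """
    # Hottest first; ties are broken by name so the plan is deterministic.
    ranked = sorted(access_counts, key=lambda extent: (-access_counts[extent], extent))
    hot = set(ranked[:ssd_slots])
    return {extent: ("ssd" if extent in hot else "hdd") for extent in access_counts}


plan = plan_tiers({"extent-a": 5, "extent-b": 50, "extent-c": 1, "extent-d": 20}, ssd_slots=2)
```

Rerunning the plan after each measurement window is what makes the tiering dynamic: extents that cool off fall back to HDD and free SSD capacity for data that has become hot.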
Microsoft ReFS provides higher levels of resiliency and scalability by delivering improved ways to recover from logical file corruption and hardware failure. With Windows Server 2016, ReFS becomes the default recommended file system for Hyper-V. Enhancements include significantly faster creation of fixed-size VHDX files, quicker merging of Hyper-V checkpoints (snapshots) and Hyper-V Replica performance improvements.
Hypervisor vendors have delivered features to address both internal and external storage requirements. This provides customers with the choice to build virtualization products in either hyper-converged or traditional configurations, depending on their skill set or scalability and performance requirements.
Windows Server 2016 also adds features to support local storage. Storage Spaces Direct, for instance, allows Windows servers to be used to build scalable storage appliances, which lets Microsoft Hyper-V compete in the hyper-converged marketplace. In addition, the requirement in Windows Server 2012 and earlier that clustered Storage Spaces configurations have access to shared storage (over a SAS fabric, for example) has been replaced with the ability to connect servers over the IP network using SMB and SMB Direct (with RDMA).
VMware has always worked well with external storage. The ESXi hypervisor supports Fibre Channel, Fibre Channel over Ethernet (FCoE), iSCSI and NFS protocols, but not SMB. And, over successive releases, the company has relaxed many of the support restrictions around LUN numbers and sizes. LUNs now scale to 62 TB, for example.
To improve I/O performance and functionality, VMware has introduced many new features in the form of APIs. VAAI (vStorage APIs for Array Integration), added in ESXi 4.1, offloads some of the more intensive I/O tasks to external storage. This includes the ability to have the array zero out large ranges of blocks rather than writing zeros from the host. Bulk data copying can also be performed by the storage array, reducing the time taken to clone VMs, and VAAI's Atomic Test & Set feature improves sub-LUN locking of VM objects.
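The locking idea behind Atomic Test & Set can be illustrated with a small compare-and-swap model. This is a conceptual sketch only: the `LockRecord` and `Lun` classes and the host names are hypothetical, and the Python mutex merely stands in for the atomicity a storage array would provide.

```python
import threading
from typing import Iterable, Optional


class LockRecord:
    """One per-file lock record on a shared LUN."""

    def __init__(self) -> None:
        self._owner: Optional[str] = None
        self._mutex = threading.Lock()  # stands in for array-side atomicity

    def atomic_test_and_set(self, expected: Optional[str], new: Optional[str]) -> bool:
        """Set the owner to `new` only if it currently equals `expected`."""
        with self._mutex:
            if self._owner != expected:
                return False
            self._owner = new
            return True


class Lun:
    """A LUN holding several VM files, each with its own lock record."""

    def __init__(self, file_names: Iterable[str]) -> None:
        self.locks = {name: LockRecord() for name in file_names}

    def acquire(self, name: str, host: str) -> bool:
        return self.locks[name].atomic_test_and_set(None, host)

    def release(self, name: str, host: str) -> bool:
        return self.locks[name].atomic_test_and_set(host, None)


lun = Lun(["vm1.vmdk", "vm2.vmdk"])
```

The point of the design is granularity: one host can hold `vm1.vmdk` while another takes `vm2.vmdk` on the same LUN, instead of each metadata update reserving the whole LUN.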
There's also VMware Virtual Volumes (VVOLs), which are a way to overcome some of the issues of using LUN-based storage. A single LUN can hold many VMs, each with the same service level applied to it -- whether it relates to performance, resiliency or availability. With VVOLs, VMware established the concept of a VM object against which service levels can be individually applied. So while VVOLs still use LUNs that map to each component of a VM, such as a VMDK, snapshot, config file or swap file, the specifics of the implementation are hidden from the VMware administrator. VVOLs, delivered as part of VASA (vSphere APIs for Storage Awareness), also allow an external storage array to export capability details to the hypervisor over an out-of-band -- typically, IP-based -- connection.
Despite some notable exceptions (Hewlett Packard Enterprise 3PAR, for instance), vendors have been slow to support VVOLs, probably because there are significant engineering challenges involved. With the need to support tens of thousands of LUNs and allow the hypervisor to drive storage allocations, most of the heavy lifting falls on the storage array vendor. Adoption may also be held back by storage teams, which typically don't like to delegate LUN creation to other IT teams.
Microsoft bases its support for external storage on Windows Server compatibility with the FC, FCoE, iSCSI and SMB protocols. Although Windows Server supports NFS, NFS shares can't be used for Hyper-V VMs.
Data services are a key part of virtual server offerings. These address the need to apply data placement and optimization decisions to a VM, so as to offload some processor-intensive tasks or implement data protection. Both VMware and Microsoft have focused on bringing these features into their virtualization platforms. Features include support for block-level backup, which allows VMs to be backed up while remaining online and running. There are also data optimization services, such as compression and dedupe support. Data placement, meanwhile, allows a VM to be created on the storage pool that matches the service policy assigned to the VM when it is defined.
SMB as an external storage protocol for Hyper-V allows admins to use Windows Server running as a Scale-Out File Server for shared storage. This can be extended to third-party vendor products as well, although the only company to have taken that approach so far is Violin Memory with Windows Flash Array.
Windows Server implements external array offload functions through a feature called ODX (offloaded data transfers). ODX reduces network traffic and offloads replication functions to the array, when available. While Microsoft provides little information on ODX support, it appears both Nimble Storage and NetApp support it in their products.
Both VMware and Microsoft offer a range of data management features for storage in virtualized environments.
VMware takes a policy-based approach with Storage Policy-Based Management (SPBM) for applying data management standards to both internal and external storage capacity. Features such as Storage DRS let admins move VMs around infrastructure to balance storage performance and capacity. Storage vMotion, triggered by an administrator or directly through Storage DRS, acts as the data mover. With VADP (vStorage APIs for Data Protection), VMs can be protected and backed up even while running by taking data directly from the hypervisor.
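Policy-based placement boils down to matching what a VM requires against what each storage pool advertises. The sketch below shows that matching step under simplifying assumptions; the `Datastore` and `StoragePolicy` types, the capability names and the `compatible` function are hypothetical and do not reflect the SPBM API.

```python
from dataclasses import dataclass
from typing import FrozenSet, List


@dataclass(frozen=True)
class Datastore:
    name: str
    capabilities: FrozenSet[str]  # what the pool advertises, e.g. "flash"
    free_gb: int


@dataclass(frozen=True)
class StoragePolicy:
    name: str
    required: FrozenSet[str]  # capabilities the VM must get


def compatible(policy: StoragePolicy, datastores: List[Datastore], size_gb: int) -> List[Datastore]:
    """Return datastores that satisfy the policy and have room, emptiest first."""
    matches = [
        d for d in datastores
        if policy.required <= d.capabilities and d.free_gb >= size_gb
    ]
    return sorted(matches, key=lambda d: -d.free_gb)


stores = [
    Datastore("gold-1", frozenset({"flash", "replication"}), 500),
    Datastore("bulk-1", frozenset({"dedupe"}), 4000),
    Datastore("gold-2", frozenset({"flash", "replication", "dedupe"}), 900),
]
tier1 = StoragePolicy("tier1", frozenset({"flash", "replication"}))
```

Because the policy travels with the VM, the same check can be repeated later to flag a VM that has drifted onto storage that no longer meets its requirements.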
Windows Hyper-V implements storage policies through features such as Storage Quality of Service, which lets you manage I/O throughput at the virtual hard disk level. Storage QoS is expanded in Windows Server 2016 to allow admins to set policies on Scale-Out File Server and apply them to Hyper-V virtual disks. This effectively means you can centralize policy setting at the storage level rather than on each Hyper-V instance. There's also Microsoft's Live Migration facility, for dynamically moving VMs between storage platforms while in use.
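One common way to enforce an I/O ceiling per virtual disk is a token bucket. The following is a minimal sketch of that general technique, not Microsoft's implementation; the `IopsLimiter` class is hypothetical, and the caller passes in the clock value so the behavior is easy to reason about.

```python
class IopsLimiter:
    """Token bucket that admits at most max_iops operations per second."""

    def __init__(self, max_iops: int) -> None:
        self.max_iops = max_iops
        self.tokens = float(max_iops)  # start with a full one-second budget
        self.last = 0.0

    def allow(self, now: float) -> bool:
        """Return True if an I/O arriving at time `now` (seconds) may proceed."""
        elapsed = max(0.0, now - self.last)
        self.last = now
        # Refill in proportion to elapsed time, capped at one second of budget.
        self.tokens = min(float(self.max_iops), self.tokens + elapsed * self.max_iops)
        if self.tokens >= 1.0:
            self.tokens -= 1.0
            return True
        return False


limiter = IopsLimiter(max_iops=100)
burst = sum(limiter.allow(0.0) for _ in range(250))  # 250 requests at t = 0
later = sum(limiter.allow(0.5) for _ in range(80))   # 80 more at t = 0.5 s
```

Keeping one limiter per virtual hard disk and defining the limits centrally mirrors the idea of setting the policy once at the storage level rather than on every host.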
With the release of Windows Server 2016, Microsoft will also offer a native changed block API for taking backups as an alternative to the Volume Shadow Copy Service (VSS). Windows Server 2016 also has better support for data deduplication, which should improve the performance of VMs running on internal Hyper-V storage.
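Changed block tracking is what makes incremental backups of a running VM cheap: only blocks written since the last backup are copied. Here is a small model of the idea, with hypothetical names throughout (`TrackedDisk`, `incremental_backup`, `restore`); it is not the Windows Server API.

```python
from typing import Dict, List


class TrackedDisk:
    """A virtual disk that remembers which blocks changed since the last backup."""

    def __init__(self, block_count: int, block_size: int = 4096) -> None:
        self.blocks: List[bytes] = [bytes(block_size)] * block_count
        self.changed = set()

    def write(self, index: int, data: bytes) -> None:
        self.blocks[index] = data
        self.changed.add(index)  # record the change for the next backup

    def full_backup(self) -> List[bytes]:
        self.changed.clear()
        return list(self.blocks)

    def incremental_backup(self) -> Dict[int, bytes]:
        """Copy only the blocks written since the previous backup."""
        delta = {i: self.blocks[i] for i in sorted(self.changed)}
        self.changed.clear()
        return delta


def restore(full: List[bytes], deltas: List[Dict[int, bytes]]) -> List[bytes]:
    """Rebuild the disk image by replaying deltas over the full backup in order."""
    image = list(full)
    for delta in deltas:
        for index, data in delta.items():
            image[index] = data
    return image


disk = TrackedDisk(block_count=8, block_size=4)
base = disk.full_backup()
disk.write(2, b"AAAA")
disk.write(5, b"BBBB")
first = disk.incremental_backup()
disk.write(2, b"CCCC")
second = disk.incremental_backup()
```

The backup application never has to scan the whole disk to find differences, which is the main saving compared with comparing snapshots after the fact.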
Containers, especially as popularized by Docker, are a new way to virtualize applications that complements the current use of VMs. Whereas VMs use virtual objects to represent physical disks (like VMDKs and VHDs), a container effectively exists as a set of processes running on a host machine. In the case of Docker, storage is provided in two ways: by exporting a host file system as a mount point into the container itself or by using a volume plug-in.
Docker first previewed volume plug-ins in release 1.8.0, and has extended them ever since. These allow external storage vendors to write plug-in code that automates the creation and mapping of a LUN or file system to a Docker container. The plug-in API offers basic functionality -- such as create, remove, mount and unmount -- to ensure external storage is visible to the container.
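The create, remove, mount and unmount operations map onto a handful of JSON request handlers that a plug-in exposes to the Docker daemon. The sketch below models that request flow in memory, assuming the `/VolumeDriver.*` endpoint names and `Err`/`Mountpoint` response fields of Docker's volume plug-in protocol. It omits the HTTP and socket layer entirely, and the `ExampleVolumeDriver` class and its mount root are hypothetical.

```python
import json


class ExampleVolumeDriver:
    """In-memory stand-in for a storage vendor's volume plug-in."""

    ROOT = "/mnt/example-volumes"  # hypothetical mount root on the host

    def __init__(self):
        self.mounts = {}  # volume name -> number of containers using it

    def handle(self, endpoint, body="{}"):
        """Dispatch one plug-in request and return the JSON response text."""
        request = json.loads(body)
        routes = {
            "/Plugin.Activate": self.activate,
            "/VolumeDriver.Create": self.create,
            "/VolumeDriver.Remove": self.remove,
            "/VolumeDriver.Mount": self.mount,
            "/VolumeDriver.Unmount": self.unmount,
        }
        if endpoint not in routes:
            return json.dumps({"Err": "unsupported endpoint: " + endpoint})
        return json.dumps(routes[endpoint](request))

    def activate(self, request):
        return {"Implements": ["VolumeDriver"]}

    def create(self, request):
        # A real plug-in would provision a LUN or file system here.
        self.mounts.setdefault(request["Name"], 0)
        return {"Err": ""}

    def remove(self, request):
        name = request["Name"]
        if self.mounts.get(name, 0) > 0:
            return {"Err": "volume is still mounted: " + name}
        self.mounts.pop(name, None)
        return {"Err": ""}

    def mount(self, request):
        name = request["Name"]
        if name not in self.mounts:
            return {"Err": "no such volume: " + name}
        self.mounts[name] += 1
        return {"Mountpoint": self.ROOT + "/" + name, "Err": ""}

    def unmount(self, request):
        name = request["Name"]
        if self.mounts.get(name, 0) == 0:
            return {"Err": "volume is not mounted: " + name}
        self.mounts[name] -= 1
        return {"Err": ""}


driver = ExampleVolumeDriver()
```

A vendor plug-in would replace the bodies of `create` and `mount` with calls to its own array to provision and attach storage, which is where the automation described above comes from.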
The implementation of container storage is still in the early stages of development. Connectivity is dependent on the host running the container, so issues of data portability remain, although companies such as ClusterHQ with Flocker are looking to fix this. Nonetheless, many new startups are coming to market to deliver storage both for and within containers. These are a mix of hardware and software offerings that apply policies, such as QoS, at the container level.