Microsoft Storage Spaces takes 'Direct' aim at storage clusters

Windows Server 2016 release makes Microsoft Storage Spaces Direct available to data centers. Highlights include heightened data reduction, HCI, replication and tiering.

Microsoft introduced its vision of software-defined storage, dubbed Windows Storage Spaces, in Windows Server 2012 R2. The long-awaited launch in September of Windows Server 2016 extends the concept with Microsoft's Storage Spaces Direct, which allows administrators to create scalable high-availability clusters using local direct-attached server storage.

Storage Spaces Direct (S2D) is included as part of the Windows Server 2016 Datacenter edition. The inaugural Storage Spaces rollout used Windows Server software to apply disk mirroring and RAID 5-style parity. To augment that scheme, the new version adds enhanced data reduction, native replication and additional tiering to nonvolatile memory express (NVMe) flash cards.

Storage Spaces Direct also takes Microsoft into converged or hyper-converged infrastructure (HCI) with Hyper-V virtualization.

Windows S2D pools server storage and presents it as virtualized disk. Capacity gets aggregated from internal server disk or direct-attached storage housed in a JBOD enclosure. According to Microsoft, Storage Spaces Direct lets customers start with a four-node minimum and build petabyte-scale storage clusters up to 16 nodes and more than 400 drives.

"Microsoft is targeting entities that need large quantities of virtual machines. Maybe those companies are toying with going to all-flash, but they don't like the cost. Storage Spaces Direct acknowledges all the media types and lets you tweak any tier up and down for performance," said Trenton Baker, vice president of marketing at RAID Inc., a maker of high-density Windows-based storage servers and storage subsystems based in Andover, Mass.

The Microsoft Storage Spaces Direct distributed file system eliminates the need for shared SAS fabric. It turns the network into a high-performance communication layer, knitting server nodes together with Remote Direct Memory Access-capable adapters and a Server Message Block 3.0 converged Ethernet interface.

The RDMA offload engine in S2D allocates idle Intel Xeon processor cores as a software-defined networking switch. Users add servers or disk as they need more storage or I/O performance.

"The emphasis with S2D is taking any kind of direct-attached storage and software-defining it," said Greg Schulz, a Microsoft MVP and senior analyst for IT infrastructure firm Server StorageIO and UnlimitedIO LLC in Stillwater, Minn.

"Microsoft's SMB Direct storage software bus is the enabling technology for all the features. It's a block-like, low-latency layer running over converged Ethernet for you to scale out software-defined storage," Schulz added.

Storage resilience comes from erasure coding, parity combo

While Microsoft Storage Spaces moved data between a hot flash tier and cold tier of rotating disk, Storage Spaces Direct injects a third tier of NVMe flash. Windows Server recognizes an NVMe device upon its insertion, automatically tuning it as write-back cache. SATA solid-state drives are earmarked as a performance tier, augmented by a capacity tier of spinning disk for inactive data. The Windows kernel manager orchestrates and promotes data movement between cold, hot and warm tiers.
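
To illustrate what a write-back cache in front of a slower tier does, here is a minimal Python sketch. It is a toy model, not Microsoft's implementation: the `WriteBackCache` class, its block-count limit and the "destage when full" policy are all assumptions made for illustration. Writes are acknowledged from the fast tier, then flushed to the slow tier in address order.

```python
class WriteBackCache:
    """Toy write-back cache: a fast tier absorbs writes, a slow tier holds them later."""

    def __init__(self, capacity_blocks):
        if capacity_blocks < 1:
            raise ValueError("capacity_blocks must be at least 1")
        self.capacity_blocks = capacity_blocks  # hypothetical cache size, in blocks
        self.dirty = {}        # address -> data held on the fast (cache) tier
        self.backing = {}      # address -> data held on the slow (capacity) tier
        self.destage_log = []  # order in which addresses reached the slow tier

    def write(self, address, data):
        """Accept the write in the cache; destage once the cache is full."""
        self.dirty[address] = data
        if len(self.dirty) >= self.capacity_blocks:
            self.destage()

    def destage(self):
        """Flush dirty blocks to the slow tier in ascending address order."""
        for address in sorted(self.dirty):
            self.backing[address] = self.dirty[address]
            self.destage_log.append(address)
        self.dirty.clear()

    def read(self, address):
        """Serve the newest copy: cache first, then the slow tier."""
        if address in self.dirty:
            return self.dirty[address]
        return self.backing[address]


cache = WriteBackCache(capacity_blocks=3)
for address, data in [(7, b"g"), (2, b"b"), (5, b"e")]:
    cache.write(address, data)  # the third write fills the cache and triggers a destage

cache.write(1, b"a")            # stays in the cache until the next destage
print(cache.destage_log)
print(cache.read(5), cache.read(1))
```

Sorting addresses before flushing is what turns scattered incoming writes into a sequential pass over the slower media.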

The inclusion of NVMe storage gives a jolt to capacity and performance, said Baker, formerly a vice president of business development at Microsoft OEM partner DataON Storage.

"Your writes will go to NVMe first to speed everything up and then go to the mirrored tier, which will line up the data to write sequentially to your lowest tier," Baker said. "The slow tier is a parity tier similar to RAID 5 or RAID 6, except it [provides] greater volume, since the data is no longer being mirrored."

The multiple tiers are underpinned by Microsoft's multiresilient virtual disk feature. For fault tolerance, part of a virtual disk can be set up as a mirror and part can be set up as erasure coding. Microsoft S2D's Resilient File System (ReFS) manages tiering and replaces NTFS as the default file system. ReFS thin provisioning and drive rebalancing help reclaim excess storage capacity.
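
The trade-off between mirroring and erasure coding can be shown with the simplest form of parity, a single XOR block per stripe. The Python sketch below is illustrative only: the stripe layout (three data blocks plus one parity block), the sample data and the function names are assumptions, and S2D's actual erasure codes are more sophisticated than plain XOR.

```python
from functools import reduce


def xor_blocks(blocks):
    """XOR a list of equal-length byte blocks together, column by column."""
    if not blocks:
        raise ValueError("need at least one block")
    if len({len(block) for block in blocks}) != 1:
        raise ValueError("all blocks must have the same length")
    return bytes(reduce(lambda a, b: a ^ b, column) for column in zip(*blocks))


def rebuild_lost_block(surviving_blocks, parity):
    """Recover the single missing data block from the survivors plus the parity block."""
    return xor_blocks(list(surviving_blocks) + [parity])


def overhead_ratio(raw_units, data_units):
    """Raw capacity consumed per unit of user data."""
    return raw_units / data_units


# Hypothetical stripe: three data blocks protected by one parity block.
data = [b"hot!", b"warm", b"cold"]
parity = xor_blocks(data)

# Lose the middle block, then rebuild it from what is left.
recovered = rebuild_lost_block([data[0], data[2]], parity)
print(recovered == data[1])

# Capacity cost of each scheme for the same amount of user data.
print(overhead_ratio(raw_units=4, data_units=3))  # 3 data + 1 parity
print(overhead_ratio(raw_units=3, data_units=1))  # three-way mirror
```

The parity stripe survives one lost block while consuming far less raw capacity than keeping three full copies, which is why a mirror suits the write-heavy portion of a volume and parity suits the colder bulk.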

Storage Spaces Direct pools local storage and makes a copy of data available to all nodes in a cluster. Since each node runs as a fault domain, it has visibility into disks residing on its companion nodes. In the event a disk fails, data is replicated to other disks to ensure the cluster always has at least three copies available.
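
A rough Python sketch of that idea follows. It treats each node as a fault domain, places three copies of a block on distinct nodes and restores the third copy after a failure. The round-robin placement rule, the node names and both function names are hypothetical; Microsoft's actual placement logic is not described in this article.

```python
def place_copies(block_id, nodes, copies=3):
    """Choose `copies` distinct nodes for a block, rotating the start by block id."""
    if len(nodes) < copies:
        raise ValueError("not enough nodes to keep every copy in its own fault domain")
    start = block_id % len(nodes)
    return [nodes[(start + offset) % len(nodes)] for offset in range(copies)]


def repair_after_failure(placement, failed_node, nodes, copies=3):
    """Drop the failed node, then re-replicate onto healthy nodes until `copies` exist."""
    survivors = [node for node in placement if node != failed_node]
    for candidate in nodes:
        if len(survivors) >= copies:
            break
        if candidate != failed_node and candidate not in survivors:
            survivors.append(candidate)
    if len(survivors) < copies:
        raise RuntimeError("too few healthy nodes to restore full redundancy")
    return survivors


# Hypothetical four-node cluster.
cluster = ["node1", "node2", "node3", "node4"]

placement = place_copies(block_id=0, nodes=cluster)
repaired = repair_after_failure(placement, failed_node="node2", nodes=cluster)
print(placement)
print(repaired)
```

With only three healthy nodes left, a second failure would make it impossible to keep three copies in separate fault domains, which is one reason cluster size matters for resilience.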

"The other thing that's new in S2D is you can replicate your storage between two nodes. With Storage Spaces, you had to have third-party software to do replication," Schulz said.

Microsoft S2D signals move into hyper-converged storage

Windows Server 2016 also marks Microsoft's foray into hyper-convergence, mainly for building private cloud storage. The converged option layers a Microsoft Scale-Out File Server (SOFS) atop the Windows Server storage stack to serve as an active-active cluster. Storage and compute reside as separate clusters.

SOFS provides a target data store for VM files. SOFS accesses the network via SMB 3 to deliver continuous availability and failover to Hyper-V, SQL Server and other application servers.

Hyper-converged systems present all resources -- compute, networking, storage and virtualization -- as an integrated system on a single server footprint. In Microsoft's hyper-converged scenario, Storage Spaces Direct storage components share a cluster with Hyper-V compute resources. Hyper-V VMs place their files in Microsoft clustered storage volumes.

Microsoft is just starting to formalize hyper-converged OEM partnerships with hardware vendors, similar to what VMware has done with its Virtual SAN hyper-converged software. Lenovo last month rolled out a suite of offerings that package Windows Server 2016 and Microsoft Azure Stack on Lenovo rack servers.

Hewlett Packard Enterprise (HPE), Micron Technology and Mellanox Technologies revealed a reference architecture for Storage Spaces Direct at Microsoft's Ignite conference in September. The vendors tested a 12-node cluster of HPE ProLiant DL380 Gen9 servers that were outfitted with Micron NVMe flash drives and connected by Mellanox 100 Gigabit Ethernet networking gear. HPE is spearheading marketing of the recommended configuration, starting with reference blueprints for Microsoft SQL Server 2016.

Much like Dell with its Nutanix OEM deal, Microsoft is treading on territory served by some of its partners. HyperGrid Inc., for example, sells hyper-converged systems built for Hyper-V and Windows storage. Formerly known as Gridstore, the vendor changed its name in July.

Kelly Murphy, HyperGrid's CTO and one of Gridstore's founders, said he sees Microsoft's emphasis on hyper-convergence as a nod to his company's success.

"I think it sharpens our value proposition. If you think about how we might end up competing with Microsoft, you have to remember that Microsoft is still going to be the jack of all trades," Murphy said. "At HyperGrid, we are focused on one thing: delivering scalable infrastructure blocks to enterprises, and doing it fast and doing it seamlessly."

Murphy said HyperGrid plans to re-engineer its HCI appliances to exploit some of the features inherent in Windows S2D, particularly Nano Server, a highly condensed version of the full Windows Server operating-system software.

"Nano Server strips off layers of sediment that were originally included to support legacy applications and legacy interfaces. The Nano Server is about one-tenth the size of the traditional Windows Server OS," Murphy said. "We will now be able to build appliances that are specifically built and preintegrated with a complete stack. It makes it much easier for us to help customers stand up hyper-converged infrastructure."

Taking stock of the Microsoft Storage Spaces tool set

Windows S2D contains a number of other features that should interest storage and data center administrators:

Containers. Windows Server 2016 offers two options for deploying persistent storage containers: either as separate instances that share host resources, or as isolated Hyper-V containers encapsulated as discrete VMs.

Data deduplication. Microsoft redesigned its algorithm to run multiple threads in parallel and multiple I/O queues per volume. The previous version capped recommended deduplication volumes at 10 TB. S2D bumps support to 64 TB volumes.

Microsoft Hyper-V. Hyper-V management includes host resource protection, which prevents a single VM from hogging storage and other resources in ways that could destabilize the cluster. Hyper-V shielded VMs are encrypted to prevent unauthorized access to their data.

Storage Quality of Service. This feature in S2D allows data centers to configure centralized management policies on a Scale-Out File Server and assign the policies to multiple virtual disks or Hyper-V instances.

Storage Replica. This facilitates synchronous block-level replication among clusters for off-site disaster recovery, enabling a failover cluster to stretch across geographic sites to safeguard file-system data.
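
For readers unfamiliar with how deduplication saves space, the Python sketch below shows the general technique mentioned in the list above: split data into chunks, hash the chunks in parallel and store each unique chunk once. It is not Microsoft's algorithm. The fixed 4 KB chunk size, the SHA-256 fingerprint, the worker count and the function names are all assumptions chosen to keep the example short.

```python
import hashlib
from concurrent.futures import ThreadPoolExecutor

CHUNK_SIZE = 4096  # hypothetical fixed chunk size, in bytes


def split_chunks(data, size=CHUNK_SIZE):
    """Cut the payload into fixed-size chunks (the last one may be shorter)."""
    return [data[i:i + size] for i in range(0, len(data), size)]


def fingerprint(chunk):
    """Content hash used to recognize identical chunks."""
    return hashlib.sha256(chunk).hexdigest()


def deduplicate(data, workers=4):
    """Return (store, recipe): unique chunks by hash, plus the hash order to rebuild."""
    chunks = split_chunks(data)
    # Hash chunks on several threads; map() preserves the original chunk order.
    with ThreadPoolExecutor(max_workers=workers) as pool:
        recipe = list(pool.map(fingerprint, chunks))
    store = {}
    for digest, chunk in zip(recipe, chunks):
        store.setdefault(digest, chunk)  # keep only the first copy of each chunk
    return store, recipe


def restore(store, recipe):
    """Rebuild the original payload from the chunk store and the recipe."""
    return b"".join(store[digest] for digest in recipe)


# Hypothetical payload: three identical chunks followed by one different chunk.
payload = b"A" * CHUNK_SIZE * 3 + b"B" * CHUNK_SIZE
store, recipe = deduplicate(payload)

stored_bytes = sum(len(chunk) for chunk in store.values())
print(len(recipe), len(store), stored_bytes, len(payload))
print(restore(store, recipe) == payload)
```

Because each chunk is fingerprinted independently, the hashing step spreads naturally across threads, which is the general reason a multithreaded design can handle larger volumes.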
