Published: 04 May 2012
Direct-attached storage may seem passé, but it’s making a comeback and gaining widespread interest.
Direct-attached storage (DAS) is storage installed inside a server, or in an external cabinet that’s connected directly to that server. It’s essentially captive to a particular server, so the server doesn’t need to traverse a network to read and write data.
DAS has been criticized as an inefficient way to connect storage to a server and as an obstacle to the data protection process. Storage that’s locally attached can’t be shared, so one server can run out of disk capacity while others have plenty to spare. Without shared storage, there’s no way to balance capacity demands.
DAS can also complicate the data protection process because each server has to be backed up individually and its data copied across the network. The alternative is to give each server its own locally attached tape device and backup application, which adds even more complexity to the backup process.
Shared storage in the form of a storage-area network (SAN) or network-attached storage (NAS) device was supposed to address these issues and thus hasten the extinction of DAS. But DAS is still a common method of attaching storage to a server; in fact, it’s enjoyed something of a comeback in recent years. The resurgence reached new heights this year when EMC announced a PCI Express (PCIe)-based solid-state storage product designed to enable its networked storage systems to store some data locally on the server.
SAN and NAS underdeliver
One reason DAS continues to live on is that SAN and NAS have largely underdelivered on their promises. SANs were supposed to make it easy to create a global pool of storage that could be dynamically divvied up among servers so that only the capacity actually needed at the time was assigned to a server. For the first eight years or so of the technology’s existence, this capability was largely unavailable, and SAN storage had to be hard partitioned to individual servers. When a server needed more capacity, a new partition had to be allocated to that server and then concatenated into the existing storage pool on the server or, worse, managed separately. The process of adding storage to a server on a SAN was very similar to the prior DAS methodology.
Data protection was also supposed to get a lot easier. The goal was to back up the SAN directly and not have to back up the individual servers. While a few software applications were able to accomplish that feat, all suffered from blindly backing up data and not understanding what that data was. Users quickly realized they needed something called “application awareness” to back up active applications and then perform intelligent restores. As a result, some form of backup software was required on the servers.
Finally, the price of SAN or NAS technology is still significantly higher than DAS. Many users have decided it’s less expensive to inefficiently directly attach storage than to efficiently share it.
To be fair, modern SAN and NAS implementations have addressed the early storage allocation shortcomings with technologies like thin provisioning. However, the time it took to deliver on the allocation promise allowed DAS to build on its foothold in the data center. But the other challenges remain, for the most part.
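The difference between hard partitioning and thin provisioning can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not how any particular array implements it: the `ThinPool` class, its method names and the capacities are all hypothetical.

```python
class ThinPool:
    """Toy thin-provisioned pool: volumes advertise a logical size, but
    physical capacity is consumed only when data is actually written."""

    def __init__(self, physical_gb):
        self.physical_gb = physical_gb
        self.logical = {}   # volume name -> advertised size in GB
        self.used = {}      # volume name -> GB actually written

    def create_volume(self, name, logical_gb):
        # No physical capacity is reserved at creation time.
        self.logical[name] = logical_gb
        self.used[name] = 0

    def write(self, name, gb):
        if self.used[name] + gb > self.logical[name]:
            raise ValueError("write exceeds the volume's logical size")
        if self.free_gb() < gb:
            raise RuntimeError("pool is out of physical capacity")
        self.used[name] += gb

    def free_gb(self):
        return self.physical_gb - sum(self.used.values())


pool = ThinPool(physical_gb=1000)
for server in ("web", "db", "mail"):
    pool.create_volume(server, logical_gb=500)   # 1,500 GB promised in total

pool.write("db", 400)
pool.write("web", 50)
print(pool.free_gb())  # 550
```

Had the same 1,000 GB been hard partitioned into three equal slices of roughly 333 GB, the 400 GB written by the "db" server would not have fit, even though the other two slices sat mostly empty.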
The primary driver for SAN/NAS adoption has been the advent of server and desktop virtualization, since the ability to move virtual server images between physical hosts requires shared storage. Virtualization also makes application-aware, off-host backup viable, because the entire virtual server is a file that can be backed up without interacting with the original physical host. But despite this new and important use case for shared storage, DAS continues to live on in the data center. And its value is increasing.
One of the key reasons for DAS’s continued popularity in the data center is the need for a local boot drive. While most SANs support some form of booting methodology, it still requires specialized host bus adapters (HBAs) and specific support on the SAN storage system. As a result, most physical servers still boot from DAS storage.
Thanks to solid-state drives (SSDs), booting from the local server offers some specific advantages over booting from the SAN. First, servers can now be booted or rebooted in seconds from a local SSD. Second, the SSD can be used as a virtual memory paging area, which is incredibly important in virtual environments. As hosts in these environments get loaded up with virtual machines (VMs), they can quickly run out of RAM and begin to use local storage as a memory paging area. If this local storage is hard disk, performance can degrade substantially. When this local storage is memory based, like flash SSD, the drop in performance is negligible. Using an SSD as a boot drive therefore allows a host to run more virtual machines without the need to purchase expensive RAM.
Extending the SAN with DAS
Solid-state storage also plays another role in the resurgence of DAS adoption: as an extension to the SAN. Leveraging even higher performing PCIe-based solid-state storage, architectures are now developing that allow the tiering or caching of data directly on the server that needs it. PCIe SSDs communicate directly with the CPU over the PCIe bus, so they aren’t bogged down by the SAS or SATA protocols that typical drive form-factor SSDs use. This again makes an ideal virtual memory paging area for RAM-constrained systems, but it’s the tiering or caching use case that’s becoming increasingly interesting.
With this architecture, storage systems can intelligently pre-stage the most active data within the PCIe SSD. Then, when a request for data is made by an application or user, it will be available for high-speed delivery on the PCIe SSD. This means the application or user doesn’t have to wait for the request to travel across the storage network, be accepted and processed by the storage controllers, wait for hard drives to rotate into position and then send the requested data or write acknowledgment all the way back up that infrastructure.
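The read path described above can be sketched as a small cache that sits in front of the shared array. This is a simplified illustration, not any vendor's algorithm: the `LocalReadCache` class is hypothetical, a plain dictionary stands in for the SAN, and least-recently-used eviction is an assumption chosen for brevity.

```python
from collections import OrderedDict


class LocalReadCache:
    """Toy LRU read cache standing in for a server-side PCIe SSD."""

    def __init__(self, capacity_blocks, backend):
        self.capacity = capacity_blocks
        self.backend = backend          # dict standing in for the SAN array
        self.blocks = OrderedDict()     # block id -> data, oldest first
        self.hits = 0
        self.misses = 0

    def prestage(self, block_ids):
        # Copy blocks expected to be hot into the cache ahead of demand.
        for block_id in block_ids:
            self._store(block_id, self.backend[block_id])

    def read(self, block_id):
        if block_id in self.blocks:
            self.hits += 1
            self.blocks.move_to_end(block_id)   # mark as most recently used
            return self.blocks[block_id]
        self.misses += 1                        # would cross the storage network
        data = self.backend[block_id]
        self._store(block_id, data)
        return data

    def _store(self, block_id, data):
        self.blocks[block_id] = data
        self.blocks.move_to_end(block_id)
        if len(self.blocks) > self.capacity:
            self.blocks.popitem(last=False)     # evict least recently used


san = {i: f"block-{i}" for i in range(10)}
cache = LocalReadCache(capacity_blocks=3, backend=san)
cache.prestage([1, 2, 3])
for block_id in (1, 2, 1, 7, 3):
    cache.read(block_id)
print(cache.hits, cache.misses)  # 3 2
```

Every hit is a request served from local flash; only the two misses would have made the round trip across the storage network to the array.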
If successful, this model of storage architecture design would turn the SAN world upside down. Storage on the SAN would become the central repository of information that’s growing cold and the local PCIe SSD DAS would be used for the most active data. The SAN would be used for long-term retention or backup, and the server would be used for active processing. This would lead to SAN storage system designs where capacity is the focus and performance is less important. But the one downside to native PCIe SSDs is that you can’t boot from them, so a local SAS hard drive or even an SSD in a drive form factor would still be required.
Cloud compute infrastructure
Other key drivers for the revival of DAS are the designs of massive storage environments like those of Facebook, Google and others. These systems combine compute and storage on a single server that’s highly networked for communication with the other servers. These systems often have locally attached storage and the ability to access data on other servers. They can even leverage a combination of PCIe SSD and hard disk drive (HDD) for booting. These online providers and Internet technology companies chose this design so they could get incredibly cost-efficient architectures with the ability to scale easily as new servers were added.
This model of DAS converged with compute was thought to be a limited use case, one that only companies with large online apps would deploy. Now, however, thanks again to server virtualization, there’s often a need to build scalable compute and storage infrastructure simultaneously. Vendors like Nutanix offer products that are clusters of servers with internal storage, providing a turnkey, cloud-style compute infrastructure suitable for more traditional data centers.
Server virtualization still needs shared storage to move virtual machine images and provide high availability. These converged architectures automatically copy data to the other nodes in the cluster so that the virtual machines’ images are available to any node in the cluster. This “shared DAS” model provides the simplicity and cost effectiveness of local storage while providing many of the benefits of a SAN.
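The "shared DAS" idea can be illustrated with a toy cluster in which every write lands on the local node and on one neighbor. This is a sketch under stated assumptions and does not represent any vendor's actual placement logic: the `Cluster` class, the ring-order replica choice and the node names are all hypothetical.

```python
class Cluster:
    """Toy 'shared DAS' cluster: each write lands on the local node and on
    replica nodes, so a VM image survives the loss of one node."""

    def __init__(self, node_names, replicas=2):
        self.nodes = {name: {} for name in node_names}  # node -> local disk
        self.replicas = replicas                        # total copies kept

    def write(self, local_node, key, data):
        names = list(self.nodes)
        start = names.index(local_node)
        # Local copy first, then the next nodes in ring order.
        targets = [names[(start + i) % len(names)] for i in range(self.replicas)]
        for name in targets:
            self.nodes[name][key] = data
        return targets

    def fail(self, node):
        del self.nodes[node]

    def read(self, key):
        for name, disk in self.nodes.items():
            if key in disk:
                return name, disk[key]
        raise KeyError(key)


cluster = Cluster(["node-a", "node-b", "node-c"])
print(cluster.write("node-a", "vm42.img", b"disk image"))  # ['node-a', 'node-b']
cluster.fail("node-a")
print(cluster.read("vm42.img"))  # ('node-b', b'disk image')
```

After "node-a" fails, the image is still readable from "node-b", which is the property a hypervisor needs in order to restart the virtual machine elsewhere in the cluster.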
If DAS lives, is SAN dead?
DAS isn’t just living, it’s thriving. There are many storage experts who believe the data center is moving toward a “DAS mostly” environment, as described above, where the SAN would become the long-term repository while truly active data gets stored locally on the server that needs it. The software to manage this movement of data is maturing quickly and will be used to keep active data locally. It will also be able to acknowledge the writing of new data locally and then sync that data to the capacity SAN in the background.
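The acknowledge-locally, sync-later behavior described above is a form of write-back caching, and it can be sketched as follows. The `WriteBackTier` class and its methods are hypothetical, and the "background" sync is an explicit method call here so the example stays deterministic.

```python
class WriteBackTier:
    """Toy write-back tier: writes are acknowledged once they reach local
    storage, and dirty blocks are copied to the SAN later."""

    def __init__(self):
        self.local = {}     # stands in for server-side flash
        self.san = {}       # stands in for the capacity SAN
        self.dirty = set()  # blocks written locally but not yet synced

    def write(self, block_id, data):
        self.local[block_id] = data
        self.dirty.add(block_id)
        return "ack"        # the application does not wait for the SAN

    def sync(self):
        # In a real system this would run in the background.
        synced = len(self.dirty)
        for block_id in sorted(self.dirty):
            self.san[block_id] = self.local[block_id]
        self.dirty.clear()
        return synced


tier = WriteBackTier()
tier.write(1, "a")
tier.write(2, "b")
tier.write(1, "c")          # overwrite before the sync runs
print(len(tier.san))        # 0: nothing has reached the SAN yet
print(tier.sync())          # 2
print(tier.san)             # {1: 'c', 2: 'b'}
```

The trade-off is visible in the sketch: until `sync` runs, the only copy of the new data lives on the server, so any real design has to protect that window against a server failure.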
The drivers for a potential shift to this “DAS mostly” model are the performance demands of the virtual environment and the performance capabilities of solid-state storage. Virtualized hosts need their active data close at hand, and solid-state storage can only deliver its full performance when that data is local and the latency of the storage network is avoided.
Still lots of storage options
As always, there are a lot of potential options for a storage administrator when dealing with storage challenges. The first step is to invest in a performance analysis tool that can help fine-tune the current environment. This maximizes the current investment and allows for an informed decision when selecting what step to take next.
If the network or storage infrastructure can’t be upgraded due to budget or time constraints, then a valid approach is to mix SSD-based DAS with SAN storage. This improves performance by taking the storage network bottleneck out of the path to the most active data, so the SSDs can deliver their full benefit.
If a refresh is in the budget, an investment could be made in storage network infrastructure and a shared storage system, such as an all-flash device, to eliminate storage performance concerns for the foreseeable future. Even with this approach, using SSD DAS as a booting and paging device can complete the storage performance picture.