Enterprise storage comes in many different shapes. Two of the most common -- and long-standing -- ways to add storage capacity to a network are through a storage area network or network-attached storage. There is much to consider when assessing SAN vs. NAS.
A SAN connects servers to storage in a network hardware fabric, usually through a switch that enables many servers to easily access the storage. From a server application and OS standpoint, there is no visible difference between accessing data in a SAN or storage that is directly connected. SANs, like direct-attached storage (DAS), enable block access to data.
NAS is a method of remote file serving. NAS devices access files using a remote protocol, such as SMB or NFS. That device operates as a NAS server with its own file system, handling the file I/O and enabling file sharing and centralized data management.
Use cases for SAN vs. NAS
The SAN vs. NAS decision comes down to the type of data you're storing in your data center. With block I/O, SAN is used; with file I/O, NAS is used. When comparing SAN vs. NAS, keep in mind that NAS turns the file I/O request into block access for the attached storage devices. SANs are the preferred choice for structured data -- data in a relational database. While NAS can handle structured data, it's usually used for unstructured data -- files, email, social media, images, videos, communications and any type of data outside of relational databases.
Object I/O for storage has become more prevalent because of its overwhelming use in cloud storage. As a result, the clear divide between SAN being used with block storage and NAS with file storage is becoming blurred.
As vendors move from block or file to object I/O for their storage needs, users still want to access data in the way they are used to: block storage for SAN or file storage for NAS. Vendors are offering systems with front ends that present a NAS or SAN experience, while the back end is based on object storage.
File vs. block vs. object
File I/O storage reads and writes data in the same manner as the user does on a drive on a computer, using a hierarchical structure, with files inside folders that can be inside more folders. NAS systems commonly use this approach, and it has a number of benefits:
- When used with NFS and SMB -- the most common NAS protocols -- a user can copy and paste files or entire folders.
- The IT department can easily manage these systems.
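The hierarchical model described above can be sketched in a few lines of Python, assuming nothing beyond the standard library: folders nested inside folders, with each file addressed by its full path.

```python
from pathlib import Path
import tempfile

# A minimal sketch of hierarchical file I/O: folders nested inside
# folders, with each file reached by walking a path of names.
root = Path(tempfile.mkdtemp())
(root / "projects" / "reports").mkdir(parents=True)
(root / "projects" / "reports" / "q3.txt").write_text("quarterly numbers")

# Reading the file back requires the same hierarchy of folder names.
content = (root / "projects" / "reports" / "q3.txt").read_text()
print(content)  # quarterly numbers
```

This is exactly the access pattern NFS and SMB present to clients: the remote share is mounted and then navigated like any local folder tree.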
Block I/O storage treats each file or folder as various blocks of smaller bits of data and distributes multiple copies of each block across the drives and devices in a SAN system. The benefits of this approach include the following:
- Greater data reliability. Data can still be accessed if one drive or several drives fail.
- Faster access. Files can be reassembled from the blocks closest to the user and don't need to pass through a hierarchy of folders.
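A toy illustration of the block approach, with a deliberately tiny block size, shows how a file becomes a sequence of independently addressable chunks that can later be reassembled. The block size and helper names here are illustrative, not any particular vendor's implementation.

```python
# A simplified sketch of block I/O: a file's bytes are split into
# fixed-size blocks, each stored and addressed independently.
BLOCK_SIZE = 4  # tiny for illustration; real systems use e.g. 4 KiB

def to_blocks(data: bytes, size: int = BLOCK_SIZE) -> list[bytes]:
    return [data[i:i + size] for i in range(0, len(data), size)]

def from_blocks(blocks: list[bytes]) -> bytes:
    return b"".join(blocks)

blocks = to_blocks(b"hello block storage")
# The SAN tracks which device holds each block; the host sees only a
# linear address space of blocks, not files inside folders.
assert from_blocks(blocks) == b"hello block storage"
print(len(blocks))  # 5
```

Because each block is addressed on its own, the storage system is free to place copies of a block on different drives, which is what makes the reliability and locality benefits above possible.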
Object I/O storage treats each file as one object, similar to file I/O, and doesn't have a hierarchy of nested folders like block I/O. With object storage, all files or objects are put into a single, enormous data pool or flat database. Files are found based on the metadata that is already associated with the file or added by the object storage OS.
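The flat pool and metadata-driven lookup can be sketched with a plain dictionary, assuming made-up object IDs and metadata fields purely for illustration: there is no folder path, only object identifiers and searchable metadata.

```python
# A toy sketch of object storage: one flat pool keyed by object ID,
# with retrieval driven by metadata rather than a folder hierarchy.
object_pool = {
    "obj-001": {"data": b"...", "meta": {"type": "video", "project": "launch"}},
    "obj-002": {"data": b"...", "meta": {"type": "image", "project": "launch"}},
    "obj-003": {"data": b"...", "meta": {"type": "image", "project": "archive"}},
}

def find(pool, **criteria):
    """Return IDs of objects whose metadata matches every criterion."""
    return [oid for oid, obj in pool.items()
            if all(obj["meta"].get(k) == v for k, v in criteria.items())]

print(find(object_pool, type="image", project="launch"))  # ['obj-002']
```

Note what is absent: there is no path like /projects/launch/ to traverse. Every object sits at the same level, and the metadata does the work a folder hierarchy would otherwise do.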
Object storage has been the slowest of the three methods and is mainly used for cloud file storage. But recent advances in the way metadata is accessed and increased use of flash drives have narrowed the speed gap among object, file and block storage.
Differences between SAN and NAS
The main difference between SAN and NAS is how each type of storage appears to the user.
A NAS system or device is attached to a network via a standard Ethernet connection, so it appears to users like any other network-connected device. The user connects to the NAS to access files on it. The NAS device has an OS that manages the writing and reading of data the user's computer requests.
Once it has been mounted on a user's computer, a SAN will appear as a local drive. That means it will function as a local drive, and the OS on the user's computer will handle the commands to read or write data. This enables the user to treat it like any other local drive, including the ability to install software on it.
What is NAS?
A NAS system can be one NAS server, or a set of drives or servers in a single device. This gives the NAS system a direct connection to the network, generally using an Ethernet cable connected to an Ethernet switch.
NAS pros and cons
Ease of use is an advantage of NAS. Metadata in a NAS system is hierarchical and readable. With a simple file system browser, users can see file names and organize files into named folders.
With NAS, users can collaborate and share data, regardless of where they're located. NAS makes it easy to access files and folders from any network-connected device.
NAS also provides high capacity at a lower cost than SAN. NAS devices consolidate storage in one place and support data management and protection tasks, such as archiving, backup and cloud storage. And NAS can handle unstructured data.
Admins can outfit NAS appliances with more or larger disks to expand storage capacity. This approach is referred to as scale-up NAS. They also can be clustered for scale-out storage. Higher-end NAS devices can accommodate enough disks to support RAID.
NAS enables Portable Operating System Interface (POSIX)-compliant file access, facilitating centrally managed security and file access and ensuring multiple applications can share a scale-out NAS device without one application overwriting a file that another is using.
NAS isn't fast enough to meet the needs of high-performance applications. It can slow even further if too many users overwhelm a system with simultaneous requests. The use of flash storage in newer NAS systems, either in conjunction with HDDs or as an all-flash system, alleviates the speed problem.
Scalability issues can arise with NAS. Adding too many NAS devices can lead to NAS sprawl, especially if all the devices must be managed separately. Clustered, or scale-out, NAS was devised to mitigate that problem.
Data integrity is an issue, because file systems store metadata and file content across a logical or physical disk volume. If the file server loses power, the system must run a file system check, or fsck, to validate the state of the data. Depending on the NAS system, the delay an fsck involves can be significant.
NAS' use of RAID can also be problematic, because RAID is reaching scalability limits. Rebuild times can take days for large drives, a situation that gets worse as multi-terabyte-capacity drives become common.
What is SAN?
A SAN is a pool of drives, devices or servers connected by a network fabric, such as iSCSI or Fibre Channel (FC).
Ethernet and FC fabric have competed on speed for years. However, fabric has always had the advantage, because its more direct connection doesn't have to go through the TCP/IP handling of an Ethernet connection. When raw data rates are equal, fabric retains the I/O speed advantage, because the data is touched less as it travels between storage and the user.
SAN pros and cons
SANs treat raw storage as a pool of resources that IT can centrally manage and allocate when it's needed. Because SAN connects over the network fabric, data transfers and access times using a SAN are faster than NAS, all things being equal in the SAN vs. NAS equation.
SAN systems are highly scalable; capacity can be added as required. Other reasons for deploying SANs include continuous availability and resilience. Highly available SANs are designed to have no single point of failure, starting with highly available SAN disk storage arrays and switches with redundant critical components and redundant connections to the SAN.
Cost and complexity are the main disadvantages of SANs. The hardware is expensive, and building and managing a SAN requires specialized knowledge and skills.
When assessing SAN vs. NAS, SAN is more complex, with dedicated cabling -- usually FC, but Ethernet can be used -- as well as dedicated switches and storage hardware. FC was developed specifically for storage, because Ethernet wasn't reliable enough to transmit block data before advances were made to the protocol. But FC SANs require specialized expertise and dedicated connectivity.
While SANs are highly scalable, scaling a SAN array vertically is limited. Once the scale-up limit is reached, it's necessary to move to a higher-performance array or add multiple storage arrays. An increasing number of SAN disk arrays avoid this problem by supporting horizontal scale-out where storage nodes are added that scale capacity and performance simultaneously.
How DAS fits in
DAS is a dedicated server or storage device not connected to a network. The simplest DAS is a computer's hard drive. To access files on DAS, a user must have access to the physical storage.
DAS can outperform NAS, particularly for compute-intensive programs. However, with DAS, the storage on each device must be managed separately, making system management more complex. DAS systems generally don't offer advanced storage management features, such as replication, snapshots and thin provisioning, that are common in SAN and NAS.
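One of the management features named above, snapshots, can be sketched in miniature. This is a hedged toy model, not any vendor's implementation: the snapshot records the volume's current block mapping without copying data, so later writes to the live volume leave the snapshot's view intact.

```python
# A toy sketch of snapshot semantics on a block volume: the snapshot
# captures the block map at a point in time; overwrites afterward do
# not disturb what the snapshot sees.
class Volume:
    def __init__(self, blocks):
        self.blocks = dict(blocks)   # block number -> data

    def snapshot(self):
        # Copy only the mapping, not the data blocks themselves.
        return dict(self.blocks)

    def write(self, blkno, data):
        # The live volume moves on; existing snapshots keep the old view.
        self.blocks[blkno] = data

vol = Volume({0: b"AAAA", 1: b"BBBB"})
snap = vol.snapshot()
vol.write(1, b"CCCC")
print(vol.blocks[1], snap[1])  # b'CCCC' b'BBBB'
```

SAN and NAS arrays implement this kind of point-in-time view in the storage system itself; with DAS, any equivalent capability has to be provided per device, which is part of what makes DAS management more complex.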
DAS also doesn't enable shared storage among multiple users. And because only one host accesses a DAS device, only a portion of the available storage is used.
The rise of unified storage
The emergence of unified storage has provided the flexibility to run block or file storage on the same array. These multiprotocol systems consolidate SAN block-based data and NAS file-based data on one storage platform. Customers can start with either SAN or NAS and add support and connectivity later. Or, they can buy a storage array that supports both SAN and NAS.
Unified storage can use file protocols, such as SMB and NFS, along with block protocols, such as FC and iSCSI. One advantage of these systems is they require less hardware than traditional storage. And newer unified storage offerings are incorporating cloud storage and storage virtualization.
The NVMe advantage
Much of today's action and excitement comes from extending the NVMe protocol over a storage fabric.
The NVMe protocol is the fastest way to connect a flash storage device to a computer's motherboard, communicating via the Peripheral Component Interconnect Express bus. It greatly outperforms an SSD connected via SATA. Imagine if you could extend that speedy NVMe connection across the fabric that knits together a SAN system.
NVMe can't be used to transfer data between a remote end user and the storage array, so a messaging layer must be used. This makes NVMe seem more like an Ethernet-connected NAS system, which uses Ethernet's TCP/IP protocol to handle data movement. But developers of NVMe over fabrics (NVMe-oF) are working on using remote direct memory access (RDMA) to ensure the messaging layer has less impact on speed.