Storage.com

distributed file system (DFS)

By Ryan Arel

What is a distributed file system (DFS)?

A distributed file system (DFS) is a file system that enables clients to access file storage from multiple hosts through a computer network as if the user were accessing local storage. Files are spread across multiple storage servers and multiple locations, which enables users to share data and storage resources. A DFS can be designed so geographically distributed users, such as remote workers and distributed teams, can access and share files remotely as if they were stored locally.

How a DFS works

A DFS clusters together multiple storage nodes and logically distributes data sets across multiple nodes that each have their own computing power and storage. The data on a DFS can reside on various types of storage devices, such as solid-state drives and hard disk drives.

Data sets are replicated onto multiple servers, which provides redundancy and keeps data highly available. The DFS runs on a collection of servers, mainframes or a cloud environment, connected over a local area network (LAN), so multiple users can access and store unstructured data. If organizations need to scale out their infrastructure, they can add more storage nodes to the DFS.
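
The replication idea above can be illustrated with a minimal sketch. It assumes a simple hash-based placement rule, which is only one of many possible schemes and is not taken from any particular DFS product. The function name `place_replicas`, the node names and the default replica count are all hypothetical.

```python
import hashlib


def place_replicas(file_path, nodes, replicas=2):
    """Pick `replicas` distinct storage nodes for a file (illustrative only).

    The file path is hashed to choose a starting node, and the following
    nodes in sorted order hold the additional copies.
    """
    if replicas > len(nodes):
        raise ValueError("not enough nodes for the requested replica count")
    ordered = sorted(nodes)  # sort so the input order does not matter
    digest = hashlib.sha256(file_path.encode("utf-8")).digest()
    start = int.from_bytes(digest[:8], "big") % len(ordered)
    # Walk the node list from the starting position, wrapping around.
    return [ordered[(start + i) % len(ordered)] for i in range(replicas)]


if __name__ == "__main__":
    cluster = ["node-a", "node-b", "node-c"]  # hypothetical node names
    print(place_replicas("/projects/report.docx", cluster))
    # Scaling out: the same rule works after another node joins the cluster.
    print(place_replicas("/projects/report.docx", cluster + ["node-d"]))
```

Because each file lands on more than one node, losing a single node still leaves a copy available. A production DFS would also have to weigh capacity, failure domains and the cost of moving data when nodes are added, which this sketch ignores.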

Clients access data on a DFS using namespaces. A namespace is the shared group of networked storage on a DFS root, and organizations can group shared folders into these logical namespaces. Namespaces present files to users as one shared folder with multiple subfolders. When a user requests a file, the DFS serves the first available copy of the file.
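
As a rough illustration of how a namespace maps one logical folder tree onto several servers, the sketch below resolves a logical path to the first available copy. The class names `FolderTarget` and `Namespace`, the server names and the forward-slash path style are assumptions made for this example, not part of any specific DFS implementation.

```python
from dataclasses import dataclass, field


@dataclass
class FolderTarget:
    """One physical copy of a shared folder (hypothetical model)."""
    server: str
    share: str
    available: bool = True


@dataclass
class Namespace:
    """Maps logical folder names under a root to one or more targets."""
    root: str
    folders: dict = field(default_factory=dict)

    def add_target(self, folder, target):
        self.folders.setdefault(folder, []).append(target)

    def resolve(self, logical_path):
        """Return the physical path of the first available copy."""
        prefix = self.root + "/"
        if not logical_path.startswith(prefix):
            raise ValueError("path is outside this namespace")
        folder, _, rest = logical_path[len(prefix):].partition("/")
        for target in self.folders.get(folder, []):
            if target.available:
                base = f"//{target.server}/{target.share}"
                return f"{base}/{rest}" if rest else base
        raise LookupError("no available copy for " + logical_path)


if __name__ == "__main__":
    ns = Namespace(root="//corp/public")
    ns.add_target("docs", FolderTarget("fs1", "docs", available=False))
    ns.add_target("docs", FolderTarget("fs2", "docs"))
    print(ns.resolve("//corp/public/docs/report.txt"))
```

In this example the user only ever sees the logical path under `//corp/public`. The first target is marked unavailable, so the request is served from the second server without the user needing to know which one responded.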

There are two types of namespaces:

  1. Standalone DFS namespaces. A standalone or independent DFS namespace has just one host server. Standalone namespaces do not use Active Directory (AD). In a standalone namespace, the configuration data for the DFS is stored on the host server's registry. A standalone namespace is often used in environments that only need one server.
  2. Domain-based DFS namespaces. Domain-based DFS namespaces integrate and store the DFS configuration in AD. Domain-based namespaces have multiple host servers, and the DFS topology data is stored in AD. Domain-based namespaces are commonly used in environments that require higher availability.
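
The difference between the two namespace types can be summarized in a small data model. This is a sketch based only on the descriptions above: the class `NamespaceConfig`, its field names and the host names are hypothetical and do not correspond to a real administration API.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class NamespaceConfig:
    """Illustrative description of a DFS namespace (not a real API)."""
    name: str
    kind: str     # "standalone" or "domain-based"
    hosts: tuple  # names of the servers that host the namespace

    def __post_init__(self):
        if self.kind not in ("standalone", "domain-based"):
            raise ValueError("kind must be 'standalone' or 'domain-based'")
        if not self.hosts:
            raise ValueError("a namespace needs at least one host server")
        # A standalone namespace has just one host server.
        if self.kind == "standalone" and len(self.hosts) != 1:
            raise ValueError("a standalone namespace has exactly one host")

    @property
    def config_store(self):
        """Where the namespace configuration is kept, per the text above."""
        if self.kind == "standalone":
            return "host server registry"
        return "Active Directory"


if __name__ == "__main__":
    small = NamespaceConfig("public", "standalone", ("fs1",))
    large = NamespaceConfig("public", "domain-based", ("fs1", "fs2"))
    print(small.config_store, "|", large.config_store)
```

The validation captures why domain-based namespaces suit environments that need higher availability: they can list more than one host server, while a standalone namespace depends on a single host.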

Advantages and disadvantages of a DFS

A DFS provides organizations with a scalable system to manage unstructured data remotely. It can enable organizations to reuse legacy storage and so save on the cost of storage devices and hardware. A DFS also improves the availability of data through replication.

However, security measures need to be in place to protect storage nodes. In addition, there is a risk of data loss when data is replicated across storage nodes. It can also be complicated to reconfigure a DFS should an organization replace storage hardware on any of the DFS nodes.

Features of a DFS

Organizations use a DFS for features such as scalability, security and remote access to data. Features of a DFS include the following:

Implementations of a DFS

A DFS uses file sharing protocols. These protocols enable users to access file servers over the DFS as if they were local storage.

Protocols a DFS can use include the following:

Open source distributed file systems include the following:

Vendors that offer DFS products

Various storage vendors offer DFS products and capabilities for unstructured data applications and workloads.

Vendors with DFS products include the following:

22 Nov 2022

All Rights Reserved, Copyright 2000 - 2024, TechTarget