
File virtualization strategies for managing unstructured data

File virtualization approaches vary depending on your IT infrastructure and how much unstructured data is managed. Learn four ways to virtualize file access.

What you'll learn from this tip: Discover four file virtualization strategies to help you manage unstructured data.

There are four main strategies for virtualizing file access: file system virtualization, clustered file systems, clustered NAS and NAS gateways. Each file virtualization approach is aimed at helping IT shops rein in unstructured or file-based data that often lives in a wide variety of locations throughout a system. The challenge is intimidating, and the technology surrounding access to unstructured data hasn't changed much in the last 15 years.

But big changes are happening now. NAS systems are moving toward more scalable, multinode scale-out architectures with global namespace support. NAS behemoth NetApp's incorporation of technology acquired from Spinnaker into its Ontap 8 release, which lets customers build multinode NetApp clusters, is indicative of the change.

File system virtualization products are complementing traditional scale-up and next-generation scale-out NAS systems to provide a global namespace across heterogeneous file stores in the enterprise. While they're currently mostly deployed for the purpose of data mobility and storage tiering, they're likely to play a significant role in providing an enterprise-wide, global unified namespace for all unstructured data.

Here are the four file virtualization strategies you can choose from today:

1. File system virtualization (aggregation) is one way of virtualizing file access. At a high level, file system virtualization aggregates individual file systems into a pool that's accessed by clients as a single unit. In other words, clients see a single large namespace without being aware of the underlying file stores. The underlying file store could be a single NAS, or a mesh of various file servers and NAS systems. File system virtualization products address two main problems: they give users a single virtual file store, and they offer storage management capabilities such as nondisruptive data migration and file-path persistency while files are moved between different physical file stores.

One of the great benefits of file system virtualization is that it can be deployed in existing environments without having to rip out existing servers and NAS storage. On the downside, file system aggregation doesn't address the problem of having to manage each file store individually.

2. Clustered file systems are another way of virtualizing file access. Clustered file systems are part of next-generation NAS systems designed to overcome the limitations of traditional scale-up NAS. They're usually composed of block-based storage nodes, typically starting with three nodes and scaling to petabytes of file storage simply by adding nodes. The clustered file system glues the nodes together by presenting a single file system with a single global namespace to clients. NAS offerings based on clustered file systems include FalconStor Software Inc.'s HyperFS, Hewlett-Packard (HP) Co.'s StorageWorks X9000 Network Storage Systems, IBM's Scale Out Network Attached Storage (SONAS), Oracle Corp.'s Sun Storage 7000 Unified Series, Quantum Corp.'s StorNext and Symantec Corp.'s FileStore, as well as systems from Isilon Systems Inc. and Panasas Inc.

3. Clustered NAS is a third way of virtualizing file access. Clustered NAS architectures share many of the benefits of clustered file system-based NAS. Instead of running a single file system that spreads across all nodes, clustered NAS systems run complete file systems on each node, aggregating them under a single root and presenting them as a single global namespace to connected clients. In a sense, clustered NAS is a combination of a scale-out, multinode storage architecture and file system aggregation. Instead of aggregating file systems of heterogeneous file stores, they aggregate file systems on native storage nodes. The BlueArc Corp. Titan and Mercury series of scale-out NAS systems are prime examples of clustered NAS systems.

4. NAS gateways can also be viewed as file system virtualization devices. Sitting in front of block-based storage, they provide NFS and CIFS access to the storage behind them. Offered by most NAS vendors, NAS gateways usually let you bring third-party, block-based storage into the NAS and, if supported by the NAS vendor, into the global namespace.
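
To make the global namespace idea behind the first strategy more concrete, here is a minimal Python sketch of file system aggregation. It is only an illustration under simplifying assumptions, not how any vendor's product works: the names FileStore, GlobalNamespace, publish and migrate are hypothetical, and each file store is modeled as an in-memory dictionary rather than a real file server or NAS. The point it shows is that clients address files by a virtual path, so a file can move between physical stores without that path changing.

```python
from dataclasses import dataclass, field


@dataclass
class FileStore:
    """Hypothetical physical file store (file server or NAS), modeled as a dict."""
    name: str
    files: dict = field(default_factory=dict)  # physical path -> file contents


class GlobalNamespace:
    """Maps persistent virtual paths to (store name, physical path) locations."""

    def __init__(self):
        self._stores = {}     # store name -> FileStore
        self._locations = {}  # virtual path -> (store name, physical path)

    def add_store(self, store):
        self._stores[store.name] = store

    def publish(self, virtual_path, store_name, physical_path):
        """Expose an existing physical file under a virtual path."""
        if physical_path not in self._stores[store_name].files:
            raise FileNotFoundError(physical_path)
        self._locations[virtual_path] = (store_name, physical_path)

    def locate(self, virtual_path):
        """Return where the file currently lives (hidden from normal clients)."""
        return self._locations[virtual_path]

    def read(self, virtual_path):
        """Clients read by virtual path only."""
        store_name, physical_path = self._locations[virtual_path]
        return self._stores[store_name].files[physical_path]

    def migrate(self, virtual_path, target_store, target_path):
        """Move a file to another store; the virtual path stays the same."""
        source = self._locations[virtual_path]
        if source == (target_store, target_path):
            return  # already in place, nothing to move
        source_store, source_path = source
        data = self._stores[source_store].files[source_path]
        # Copy first, then repoint the namespace, then remove the old copy.
        self._stores[target_store].files[target_path] = data
        self._locations[virtual_path] = (target_store, target_path)
        del self._stores[source_store].files[source_path]

    def list_all(self):
        """One sorted listing across every underlying store."""
        return sorted(self._locations)


def demo():
    filer = FileStore("legacy-filer", {"/vol1/q3.xls": b"q3 numbers"})
    nas = FileStore("scale-out-nas", {"/archive/readme.txt": b"hello"})
    ns = GlobalNamespace()
    ns.add_store(filer)
    ns.add_store(nas)
    ns.publish("/corp/finance/q3.xls", "legacy-filer", "/vol1/q3.xls")
    ns.publish("/corp/docs/readme.txt", "scale-out-nas", "/archive/readme.txt")

    before = ns.read("/corp/finance/q3.xls")
    ns.migrate("/corp/finance/q3.xls", "scale-out-nas", "/tier2/q3.xls")
    after = ns.read("/corp/finance/q3.xls")
    return ns, before, after
```

Calling demo() reads the same virtual path before and after the migration and gets the same contents back, even though the file has moved from one store to the other. Note that the sketch still contains two separate stores that would each need to be administered on their own, which is the management drawback of aggregation noted above.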

NAS systems and gateways based on clustered file system or clustered NAS architectures are next-generation NAS systems and won't integrate with existing legacy file stores; they usually replace them or run in parallel with them. This makes them more difficult to deploy as well as more expensive than file system virtualization products. However, the benefit of managing a single NAS, rather than many small data silos that are simply aggregated by a file system virtualization product into a single namespace, more often than not justifies the additional effort and cost.

BIO: Jacob Gsoedl is a freelance writer and a corporate director for business systems. He can be reached at
