The latest storage media range from high-performance, low-latency storage class memory (SCM) and NVMe flash SSDs to lower-performance, read-optimized 3D QLC flash SSDs and -- of course -- higher-density HDDs. The opportunity to take advantage of these different performance and price points has once again sparked interest in storage tiering strategy.
There are, arguably, two types of storage tiering: intrasystem and intersystem. Intrasystem tiering enables different classes of SCM, flash SSDs and even HDDs to be efficiently used within the same storage system. Intersystem tiering is more complicated, especially when the systems involved are made up of different types of storage, vendors, data centers, technologies and clouds. But intersystem tiering can potentially have much lower costs and more flexibility.
Intrasystem storage tiering strategy
Intrasystem tiering became popular when flash SSDs were introduced into the storage ecosystem. Flash SSDs are high-performance and low-latency storage media -- typically, multiple orders of magnitude better than HDDs. They were also quite expensive when they first came out. Tiering was a way to get SSD performance for hot data and move it to HDDs as it cooled off.
These systems were called hybrid storage. But, as SSD pricing declined precipitously, the cost difference between flash SSDs and HDDs narrowed to a point where the flash SSD performance advantages overwhelmed the minor increases in cost, and SSDs became more predominant.
Dropping tiering in favor of all-flash made sense when a storage system had a single flash SSD type. But now there are SSDs with a range of performance, capacity, wear life and latency capabilities, and there's a new class of SCM SSD possibilities. Tiered storage makes sense again, even when there are no HDDs in the system.
Best practices to optimize intrasystem storage tiering strategy include:
- Ensure tiering is policy-driven.
- Target the hottest data on the fastest drives, such as SCM and NVMe flash SSDs.
- Move data to slower, lower-cost, higher-capacity 3D triple-level cell SSDs as it ages and as time since last access or last modification increases.
- If there are more than two tiers, move data a second time to slower, lower-cost, higher-capacity, read-optimized 3D quad-level cell SSDs based on the same criteria.
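The practices above can be sketched as a small policy table that maps idle time to a tier. This is a minimal illustration, not any vendor's implementation: the tier names, the 7-day and 90-day thresholds, and the `pick_tier` function are all assumptions made up for the example.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class TierRule:
    name: str             # hypothetical tier label
    max_idle_days: float  # data idle up to this many days stays on this tier


# Illustrative policy only; thresholds are assumptions, not recommendations.
POLICY = [
    TierRule("scm_nvme", 7),            # hottest data on the fastest drives
    TierRule("tlc_ssd", 90),            # cooling data on 3D TLC SSDs
    TierRule("qlc_ssd", float("inf")),  # coldest data on read-optimized 3D QLC SSDs
]


def pick_tier(days_since_access: float, days_since_modify: float,
              policy=POLICY) -> str:
    """Return the first tier whose idle threshold covers the data."""
    # Idle time is measured from the most recent activity of either kind.
    idle = min(days_since_access, days_since_modify)
    for rule in policy:
        if idle <= rule.max_idle_days:
            return rule.name
    return policy[-1].name


if __name__ == "__main__":
    print(pick_tier(1, 30))     # recently read, so it stays on the fast tier
    print(pick_tier(30, 45))    # idle for a month
    print(pick_tier(400, 400))  # idle for more than a year
```

Keeping the thresholds in data rather than in code is what makes the tiering policy-driven: adding a tier or changing a cutoff means editing the table, not the logic.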
Intersystem storage tiering strategy
Intersystem storage tiering historically hasn't been widely used because of the labor-intensive, manual processes involved when moving data from one system to the other. Interest has spiked, however, with the advent of the cloud, big data and data lake analytics. Past methodologies, such as hierarchical storage management (HSM), and manual file copying utilities, such as rsync and Robocopy, aren't going to cut it.
HSM is based on stubs. When data is moved from one system to another, it leaves a small stub in place of the original data. When applications or users access the data, they're actually accessing the stub, which goes and retrieves the data, rehydrating it to the original storage.
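The stub mechanism can be modeled in a few lines. The sketch below is a toy, in-memory illustration under stated assumptions: `Stub`, `HsmStore`, `migrate` and `read` are hypothetical names, and plain dictionaries stand in for the primary and secondary storage systems.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Stub:
    target: str  # where the data was moved on secondary storage


class HsmStore:
    """Toy model of stub-based HSM (hypothetical, not a real product API)."""

    def __init__(self):
        self.primary = {}    # path -> bytes or Stub
        self.secondary = {}  # path -> bytes
        self.recalls = 0     # number of rehydrations performed

    def migrate(self, path: str) -> None:
        # Move the data to secondary storage and leave a stub behind.
        self.secondary[path] = self.primary[path]
        self.primary[path] = Stub(target=path)

    def read(self, path: str) -> bytes:
        entry = self.primary[path]
        if isinstance(entry, Stub):
            # Rehydrate: copy the data back to the original storage.
            # A KeyError here models a stub that can no longer find its data.
            data = self.secondary[entry.target]
            self.primary[path] = data
            self.recalls += 1
            return data
        return entry


if __name__ == "__main__":
    store = HsmStore()
    store.primary["/projects/report.doc"] = b"quarterly numbers"
    store.migrate("/projects/report.doc")
    print(store.read("/projects/report.doc"))  # triggers a recall
    print(store.recalls)
```

Note that every read of migrated data copies it back to primary storage, which is the behavior that makes the stub approach costly when the secondary tier charges for data leaving it.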
There are, however, problems with HSM, including increased cloud storage costs. It's relatively cheap to store data in the cloud, but there are egress fees when copying data out, as happens whenever HSM rehydrates it. This approach is also brittle. If the data is moved from the secondary storage to another storage system or cloud, the HSM stub breaks because it can't find the data.
Better intersystem tiering is available today. Some products -- such as Dell EMC's ClarityNow, Hammerspace, Komprise and StrongBox Data Solutions' StrongLink -- mount the primary storage system with admin privileges. This enables the tiering software to read all of the data. It then copies it out based on policies to one or more secondary or tertiary storage systems, including cloud and tape. Policies allow the original data to be deleted from the original storage, while a global namespace enables direct instant access to the data where it resides instead of rehydration to the original storage.
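To illustrate how a global namespace differs from stubs, here is a minimal sketch in which a catalog records where each logical path currently lives and reads are served from that location. It is not how any of the named products work internally; `GlobalNamespace`, its methods and the backend names are hypothetical, and dictionaries stand in for real storage systems.

```python
class GlobalNamespace:
    """Toy catalog mapping logical paths to the backend that holds them."""

    def __init__(self, backends):
        self.backends = backends  # backend name -> {path: bytes}
        self.location = {}        # logical path -> backend name

    def write(self, path, data, backend):
        self.backends[backend][path] = data
        self.location[path] = backend

    def tier(self, path, target, delete_source=True):
        # Copy the data to the target, optionally delete the original,
        # then update the catalog so later reads go to the new location.
        source = self.location[path]
        if source == target:
            return
        self.backends[target][path] = self.backends[source][path]
        if delete_source:
            del self.backends[source][path]
        self.location[path] = target

    def read(self, path):
        # Served directly from wherever the data resides; nothing is
        # copied back to the original storage.
        return self.backends[self.location[path]][path]


if __name__ == "__main__":
    ns = GlobalNamespace({"primary": {}, "cloud": {}, "tape": {}})
    ns.write("/projects/run1.csv", b"results", "primary")
    ns.tier("/projects/run1.csv", "cloud")
    ns.tier("/projects/run1.csv", "tape")  # a second move does not break access
    print(ns.read("/projects/run1.csv"), ns.location["/projects/run1.csv"])
```

Because the catalog is updated on every move, data can be moved repeatedly between tiers without losing track of it.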
Other products, such as InfiniteIO, sit in front of the storage, looking like a switch. Data is moved from one storage system to another or to a cloud via policy. This type of tiering is primarily used for unstructured data, which represents more than 80% of stored data.
Single-vendor intersystem storage tiering is useful if all data storage is through that vendor. In these cases, the tiering operates somewhat similarly to intrasystem tiering. Data is replicated between systems or to a virtual storage appliance (VSA) operating on premises or in the cloud. Once replicated, data is deleted from the original source based on policy. The target system or VSA then uses lower-cost block, file, object or cloud object storage.
Best practices to optimize intersystem storage tiering strategy include:
- Ensure tiering is policy-driven.
- Target cool and cold data to lower-cost, lower-performing storage, such as object, distributed file, cloud and tape library storage.
- Move data based on age and time since last accessed or modified.
- If there are more than two tiers, move data again to slower, lower-cost storage, such as cold or archive storage, cold or archive cloud storage, and tape library storage.
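As a rough sketch of the "age and time since last accessed or modified" criterion, the function below walks a directory tree and lists files that have been idle longer than a threshold. The function name `find_cold_files` and the 90-day threshold in the usage lines are assumptions for illustration. Access times are only as reliable as the file system makes them; mounts using options such as `noatime` or `relatime` do not update them on every read.

```python
import os
import time


def find_cold_files(root, min_idle_days, now=None):
    """Return sorted paths under root idle for more than min_idle_days."""
    now = time.time() if now is None else now
    cutoff = now - min_idle_days * 86400  # seconds per day
    cold = []
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            path = os.path.join(dirpath, name)
            st = os.stat(path)
            # A file is cold only if both its last access and its last
            # modification are older than the cutoff.
            if max(st.st_atime, st.st_mtime) < cutoff:
                cold.append(path)
    return sorted(cold)


if __name__ == "__main__":
    # Hypothetical usage: list candidates for a lower-cost tier.
    for path in find_cold_files(".", min_idle_days=90):
        print(path)
```

A tiering policy would feed a list like this to whatever mechanism actually moves the data; the scan itself only identifies candidates.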