This article can also be found in the Premium Editorial Download "Storage magazine: Optimizing your enterprise database storage."


I'm often asked what the value of storage virtualization is and what the best ways to use it are. Without a doubt, in a well-thought-out implementation, virtual storage provides many cost-saving benefits. In fact, there's practically no way to avoid it because RAID virtualization is so common in disk subsystems. But unplanned virtualization can spread data over many disks in an array, and in the process, unwittingly introduce major performance problems.


Planning for virtualization
Structure disk allocations to workloads.
Limiting RAID choices to 1 and 0+1 makes the process of mixing applications across disks much easier.
In transaction processing environments, RAID 0+1 has the major advantage of spreading out the I/Os over a number of disk drives.
Make sure that the array drives have similar specifications so as not to create a bottleneck.
RAID 5's read/modify/write penalty impacts the performance of all the applications storing data on drives in the array.
Distribute high-I/O applications across all the disks in the array.

The three main benefits of virtualization are data redundancy, increased flexibility in using storage address space and better performance for applications that tend to have I/O bottlenecks.

Virtualization, of course, can be done at multiple layers in the storage area network (SAN): the volume manager, HBA RAID controller, network device (appliance) and subsystem controller. The performance-improving suggestions I make in this article should be applied at the virtualization layer closest to the disk drives.

Redundant data protection in the form of mirroring (RAID 1) or RAID 0+1 is widely available and should be incorporated into all system storage. In my view, the cost advantages of RAID 5 are not significant enough to outweigh its write performance penalties and degraded-mode operation. The choice between RAID 1 and 0+1 depends on the capacity, scaling and performance requirements of the application. If in doubt, use RAID 1 to simplify configuration and troubleshooting.
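
To put a rough number on the RAID 5 write penalty, the sketch below counts back-end disk I/Os for a given host workload. It assumes the textbook small-write costs (two disk writes for a mirrored write, four disk operations for a RAID 5 read/modify/write) and ignores controller caching and full-stripe writes. The workload figures and the function name `backend_ios` are hypothetical, chosen only for illustration.

```python
# Back-end disk I/Os generated by each host write.
# Assumption: textbook small-write costs, no cache and no full-stripe writes.
WRITE_COST = {
    "RAID 1": 2,    # write both mirror copies
    "RAID 0+1": 2,  # write both mirror copies
    "RAID 5": 4,    # read data, read parity, write data, write parity
}


def backend_ios(raid_level, host_iops, write_pct):
    """Return the disk I/Os per second needed to serve host_iops.

    write_pct is the percentage of host I/Os that are writes (0-100).
    Each host read is counted as one disk read.
    """
    writes = host_iops * write_pct / 100
    reads = host_iops - writes
    return reads + writes * WRITE_COST[raid_level]


# Hypothetical workload: 1,000 host I/Os per second, 40% of them writes.
for level in WRITE_COST:
    print(level, backend_ios(level, 1000, 40))
```

Under these assumptions, the same host workload needs 1,400 disk I/Os per second on a mirrored layout and 2,200 on RAID 5, and that extra load lands on every application sharing the RAID 5 drives.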

RAID 0+1 arrays can afford to have multiple disk drives fail, as long as they are not part of the same mirror-stripe. This is a big advantage over RAID 5 and RAID 1, where the loss of more than one disk drive results in a loss of data.

Storage address spaces
In addition to mirroring and striping, virtualization can also subdivide and concatenate storage address spaces. From the perspective of address space manipulation, virtualization can make all disk storage work like putty that can be merged in an endless variety of ways. However, taking this approach to storage is unlikely to result in optimal designs.

It's important to always keep in mind that disk drives are electro-mechanical devices whose performance is constrained by the rotation speed of the media and the time it takes to move the read/write heads over the media. The performance and cost differences between 5400 RPM ATA disk drives and 10,000 RPM SCSI disk drives can be enormous. As a best virtualization practice, make sure that the drives that form an array have similar specifications, so as not to create a bottleneck.
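
A back-of-the-envelope sketch of why a mismatched drive becomes the bottleneck follows. It estimates random I/Os per second from average seek time plus average rotational latency (half a revolution), and assumes that I/O is spread evenly across a striped array so the slowest member caps the whole set. The seek times used here are illustrative assumptions, not figures from any drive specification, and the function names are hypothetical.

```python
def drive_iops(rpm, avg_seek_ms):
    """Estimate random I/Os per second for a single drive."""
    # On average the head waits half a revolution for the data to come around.
    rotational_latency_ms = 0.5 * 60_000 / rpm
    return 1000 / (avg_seek_ms + rotational_latency_ms)


def striped_array_iops(drives):
    """Estimate random I/Os per second for an evenly loaded striped array.

    drives is a list of (rpm, avg_seek_ms) tuples. With an even spread of
    I/O, every drive receives the same share, so the slowest drive saturates
    first and limits the array.
    """
    per_drive = [drive_iops(rpm, seek) for rpm, seek in drives]
    return len(per_drive) * min(per_drive)


# Assumed seek times: 5.0 ms for the 10,000 RPM drive, 9.0 ms for the 5400 RPM drive.
matched = [(10_000, 5.0)] * 4
mixed = [(10_000, 5.0)] * 3 + [(5_400, 9.0)]

print(round(striped_array_iops(matched)))
print(round(striped_array_iops(mixed)))
```

With these assumed seek times, four matched 10,000 RPM drives come out at about 500 I/Os per second, while replacing just one of them with a 5400 RPM drive pulls the estimate for the whole array down to roughly 275.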

Additionally, you need to know how disk subdivisions are allocated to applications and systems. For example, consider a scenario where 14 different file systems and/or databases are using the storage resources of 16 disk drives managed by a storage virtualization product. There are many ways this storage could be allocated, but one simple "cloud" method is to assign units of storage uniformly on a first-come, first-served, round-robin basis. "Array distribution diagram 1" (Allocating disks: there's a wrong way ...) shows a collection of disks as you might find within a disk subsystem, where each of the 16 disks has been subdivided into five equal extents (partitions). As each application comes online, an array of extents is allocated to it using either RAID 1 or RAID 0+1.

According to the first diagram, array 1 runs on disks 1 through 4, array 2 runs on disks 5 through 8, array 3 runs on disks 9 through 12 and so on. The other dimension to analyze is the distribution of applications (represented by arrays) on each disk. For instance, disk 12 has arrays 3, 9 and 13, and disk 13 has arrays 4, 9 and 14.
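
Both views of an allocation, array to disks and disk to arrays, can be sketched in a few lines. The example below is a hypothetical model, not a reconstruction of the diagram: it assumes every one of the 14 arrays is exactly four disks wide and that extents are handed out on consecutive disks, wrapping around. The diagram's later arrays evidently differ in width or placement, so the per-disk results will not match it. The function names are invented for illustration.

```python
from collections import defaultdict


def allocate_round_robin(num_disks, extents_per_disk, widths):
    """Give each array one extent on each of `width` consecutive disks.

    Allocation continues from where the previous array stopped and wraps
    around. Returns {array_number: [disk_number, ...]}, numbered from 1.
    """
    used = [0] * num_disks  # extents already handed out on each disk
    layout = {}
    next_disk = 0
    for array_no, width in enumerate(widths, start=1):
        disks = []
        for _ in range(width):
            if used[next_disk] >= extents_per_disk:
                raise ValueError("no free extent on disk %d" % (next_disk + 1))
            used[next_disk] += 1
            disks.append(next_disk + 1)
            next_disk = (next_disk + 1) % num_disks
        layout[array_no] = disks
    return layout


def arrays_on_each_disk(layout):
    """Invert the layout: {disk_number: [array_number, ...]}."""
    per_disk = defaultdict(list)
    for array_no, disks in layout.items():
        for disk in disks:
            per_disk[disk].append(array_no)
    return dict(per_disk)


# Assumed: 16 disks, five extents each, 14 arrays that are all four disks wide.
layout = allocate_round_robin(16, 5, [4] * 14)
per_disk = arrays_on_each_disk(layout)
print(layout[3])     # disks used by array 3
print(per_disk[12])  # arrays sharing disk 12 in this hypothetical layout
```

The second view is the one that matters for performance planning, because it shows which applications end up competing for the same spindle.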

Array distribution diagrams (images not reproduced): "Allocating disks: there's a wrong way ..." (top) and "... and a right way" (bottom)

At first glance, the top disk subsystem looks well set up, since each array is spread over several disks. However, if arrays 3, 6, 9 and 12, for example, are dedicated to applications with heavy workloads, then many disks each carry two I/O-intensive applications that compete for the same drive. A better technique is shown at the bottom, where each disk has its "bottom" extent reserved for a single heavy-duty application and the remaining extents for lighter-duty ones.
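
A check like this can be automated. The sketch below flags every disk that hosts more than one heavy-workload array. The two eight-disk layouts are small hypothetical examples made up for illustration; they are not the layouts from the diagrams, and the function names are likewise invented.

```python
def heavy_arrays_per_disk(layout, heavy):
    """Map each disk to the heavy-workload arrays that have an extent on it."""
    per_disk = {}
    for array_no, disks in layout.items():
        if array_no in heavy:
            for disk in disks:
                per_disk.setdefault(disk, []).append(array_no)
    return per_disk


def contended_disks(layout, heavy):
    """Return only the disks that carry more than one heavy array."""
    return {
        disk: arrays
        for disk, arrays in heavy_arrays_per_disk(layout, heavy).items()
        if len(arrays) > 1
    }


heavy = {3, 6, 9, 12}  # arrays assumed to serve heavy workloads

# Hypothetical unplanned layout: heavy arrays overlap on every disk.
naive = {
    3: [1, 2, 3, 4],
    6: [3, 4, 5, 6],
    9: [5, 6, 7, 8],
    12: [7, 8, 1, 2],
}

# Hypothetical planned layout: each disk holds one heavy array at most,
# and the light arrays (1 and 2) fill the remaining extents.
planned = {
    3: [1, 2],
    6: [3, 4],
    9: [5, 6],
    12: [7, 8],
    1: [1, 2, 3, 4],
    2: [5, 6, 7, 8],
}

print(contended_disks(naive, heavy))
print(contended_disks(planned, heavy))
```

In the unplanned layout every disk is reported as contended, while the planned layout reports none, which is the property the bottom diagram is designed to guarantee.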

This was first published in November 2002
