This article can also be found in the Premium Editorial Download "Storage magazine: Optimizing your enterprise database storage."
I'm often asked what the value of storage virtualization is and what the best ways to use it are. Without a doubt, in a well-thought-out implementation, virtual storage provides many cost-saving benefits. In fact, there's practically no way to avoid it because RAID virtualization is so common in disk subsystems. But unplanned virtualization can spread data over many disks in an array, and in the process, unwittingly introduce major performance problems.
Planning for virtualization
The three main benefits of virtualization are data redundancy, increased flexibility in using storage address space and improved performance for applications that tend to have I/O bottlenecks.
Virtualization, of course, can be done in multiple layers in the storage area network (SAN): the volume manager, HBA RAID controller, network device (appliance) and subsystem controller. The performance-improving suggestions I make in this article should be applied at the virtualization layer closest to the disk drives.
Redundant data protection in the form of mirroring (RAID 1) or RAID 0+1 is widely available and should be incorporated into all system storage. The cost advantages of RAID 5 are generally not considered significant enough to outweigh its write performance penalties and degraded-mode operation. The choice between RAID 1 and 0+1 depends on the capacity, scaling and performance requirements of the application. If in doubt, use RAID 1 to simplify configuration and troubleshooting.
RAID 0+1 arrays can survive multiple disk drive failures, as long as the failed drives are not part of the same mirror-stripe. This is a big advantage over RAID 5 and a two-disk RAID 1 mirror, where the loss of more than one disk drive results in a loss of data.
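As a rough illustration of the failure-tolerance point, the sketch below models a mirrored-and-striped array as a set of two-disk mirrored pairs and checks whether every pair still has a working copy after some disks fail. The pair layout and the function name `survives` are assumptions made for illustration only. Products differ in how they group mirrors and stripes, and that grouping determines which multi-disk failures are survivable.

```python
def survives(mirror_pairs, failed):
    """Return True if every mirrored pair still has at least one working disk.

    mirror_pairs: list of (disk_a, disk_b) tuples (hypothetical layout).
    failed: set of failed disk numbers.
    """
    return all(a not in failed or b not in failed for a, b in mirror_pairs)


# Hypothetical 8-disk array: four mirrored pairs striped together.
pairs = [(1, 2), (3, 4), (5, 6), (7, 8)]

print(survives(pairs, {1, 4, 6}))  # three failures, each in a different pair
print(survives(pairs, {1, 2}))     # two failures in the same pair
```

Under this assumed layout, the first call reports that the array survives and the second reports data loss, because both copies in one pair are gone.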
Storage address spaces
In addition to mirroring and striping, virtualization can also subdivide and concatenate storage address spaces. From the perspective of address space manipulation, virtualization can make all disk storage work like putty that can be merged in an endless variety of ways. However, taking this approach to storage is unlikely to result in optimal designs.
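To make the idea of concatenating address spaces concrete, here is a minimal sketch of how a logical block number might be mapped onto a volume built from several extents. The extent tuples, the block counts and the function name `locate` are hypothetical and not drawn from any particular virtualization product.

```python
def locate(extents, logical_block):
    """Map a logical block number to (disk, physical block) in a concatenated volume.

    extents: list of (disk, start_block, length) tuples, in concatenation order.
    """
    if logical_block < 0:
        raise ValueError("logical block must be non-negative")
    offset = logical_block
    for disk, start, length in extents:
        if offset < length:
            return disk, start + offset
        offset -= length  # skip past this extent and try the next one
    raise ValueError("logical block is beyond the end of the volume")


# Hypothetical volume: 1,000 blocks from disk 3 followed by 500 blocks from disk 7.
volume = [(3, 2000, 1000), (7, 0, 500)]

print(locate(volume, 999))   # last block of the first extent
print(locate(volume, 1000))  # first block of the second extent
```

The application sees one contiguous 1,500-block address space, while consecutive logical blocks 999 and 1000 actually live on two different disk drives.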
It's important to always keep in mind that disk drives are electro-mechanical devices that are performance-constrained by the rotation speed of the media and the time it takes to move the read/write heads over the media. The performance and cost differences between 5400 RPM ATA disk drives and 10,000 RPM SCSI disk drives can be enormous. As a best virtualization practice, make sure that the drives that form an array have similar specifications, so as not to create a bottleneck.
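The effect of rotation speed can be estimated with simple arithmetic: on average, the drive waits half a revolution for the requested sector to arrive under the head. The sketch below computes that average rotational latency for the two drive speeds mentioned above. It deliberately ignores seek time, transfer time and caching, so it shows only one component of the difference.

```python
def avg_rotational_latency_ms(rpm):
    """Average rotational latency: the time for half a revolution, in milliseconds."""
    ms_per_revolution = 60_000 / rpm
    return ms_per_revolution / 2


for rpm in (5400, 10_000):
    print(f"{rpm} RPM: {avg_rotational_latency_ms(rpm):.2f} ms")
```

This works out to roughly 5.56 ms for the 5400 RPM drive and 3.00 ms for the 10,000 RPM drive, which is why one slower drive in an array can hold back the faster ones.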
Additionally, you need to know how disk subdivisions are allocated to applications and systems. For example, consider a scenario where 14 different file systems and/or databases are using the storage resources of 16 disk drives that are managed by a storage virtualization product. There are many ways this storage could be allocated, but one simple "cloud" method is to assign units of storage uniformly on a first-come, first-served, round-robin basis. The "Array distribution diagram 1" (Allocating disks: there's the wrong way ...) shows a collection of disks as you might find within a disk subsystem where all 16 disks have been subdivided into five equal extents (partitions). As each application comes online, an array of extents is allocated to it using either RAID 1 or RAID 0+1.
According to the first diagram, array 1 runs on disks 1 through 4, array 2 runs on disks 5 through 8, array 3 runs on disks 9 through 12 and so on. The other dimension to analyze is the distribution of applications (represented by arrays) on each disk. For instance, disk 12 has arrays 3, 9 and 13 and disk 13 has arrays 4, 9 and 14.
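The first-come, first-served, round-robin allocation described above can be sketched in a few lines. The example below is a hypothetical model, not a reconstruction of the diagram: it assumes every array needs four extents, one per disk, which is not how the diagram's arrays are sized. The function name `allocate_round_robin` and its parameters are likewise invented for illustration.

```python
from collections import defaultdict


def allocate_round_robin(num_disks, extents_per_disk, array_widths):
    """Assign extents to arrays in arrival order, walking the disks round robin.

    array_widths: number of extents (one per disk) each array needs, in arrival order.
    Returns {disk number: [array numbers that have an extent on that disk]}.
    """
    arrays_on_disk = defaultdict(list)
    next_disk = 0
    for array_id, width in enumerate(array_widths, start=1):
        if width > num_disks:
            raise ValueError(f"array {array_id} is wider than the disk pool")
        for _ in range(width):
            disk = next_disk % num_disks + 1
            if len(arrays_on_disk[disk]) >= extents_per_disk:
                raise ValueError(f"disk {disk} has no free extents")
            arrays_on_disk[disk].append(array_id)
            next_disk += 1
    return dict(arrays_on_disk)


# Hypothetical workload: 14 arrays of four extents each on 16 disks
# with five extents per disk (the widths are NOT taken from the diagram).
layout = allocate_round_robin(16, 5, [4] * 14)
for disk in sorted(layout):
    print(f"disk {disk:2d}: arrays {layout[disk]}")
```

Even in this simplified run, every disk ends up serving three or four different arrays, so the applications behind those arrays share the same read/write heads.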
This was first published in November 2002