Feature

Get top performance from database storage

Ezine

This article can also be found in the Premium Editorial Download "Storage magazine: Optimizing your enterprise database storage."

Download it now to read this article plus other related content.

Enter virtualization
Despite virtualization's steady march into storage systems, most vendors and users think that some control is still needed to fully optimize database access. Oracle's Hoke says, "One of the big bad things that can happen with virtualization is that you can have two tables related in your business application, with a high level of traffic in certain rows of those tables."

One Way to Dramatically Increase Database Performance

    Requires Free Membership to View

Question: "Have you come across some hard numbers and performance figures for the use of RAID (at any level) in the database environment?"

Securing PDAs with sensitive data - or access to it - should be taken seriously. Basically, you need to apply the same rigor to PDA security that you would to any other enterprise-class device, with a clear security policy and a uniform suite of security software. Consider the following:

Chris Poelker: Database application performance shouldn't only be looked at from an I/O standpoint, but first and foremost from a design perspective. For example, at one customer site, they were using SQL server in a three-tier Web application, where the Web pages were built dynamically from the database. The design used active server page scripts in the application tier to query the database for results and the results were then sent to the Web server to build the dynamic pages.

By moving to SQL - stored procedures at the database level, and tuning the indexing methods - and a slight schema redesign-performance improved by over 800%, all without even touching the back end I/O. Never throw hardware at a poor design as the return on investment is never as good as taking the time to rethink the original design.

Here's the some things to do for good I/O performance:

1) Use a lot of spindles. If you go to and look at the specifications on the designs to achieve maximum database performance, you'll notice they use hundreds of drives. By a lot of spindles, I don't mean large RAID sets. You should concatenate many smaller RAID sets into a single LUN either by using host software or LUN virtualization within the storage hardware, if it has that capability.

2) Use low latency paths to storage. Fibre Channel uses a packet-based network, which allows for a full duplex backend. On a SCSI bus, only one thing can happen at a time. Using dual-pathed Fibre disks, you can have a read command on one side and a write command on the other side of the disk being queued at the same time.

3) Spread the I/O load as much as possible. Use many HBAs in the server that are load balancing your I/O across all the paths. Tools like Veritas' Volume Manager or Solstice Disk Suite can help do this by creating a single LUN from many spindles on each path. By using 2Gb paths and four HBAs, you can get close to 800MB/sec to the LUN.

4) Use the proper RAID level for the log and database. Best practice here is log files should be on RAID 0+1 (RAID 10) and databases on RAID5 (RAID 0+1 can be used on the database for better performance if your budget allows for it).

Performance. Today's hard drives can achieve a sustained throughput of between 30MB/s and 50MB/s and 120 to 140 IOPS (I/Os per second) per spindle. The performance depends on the drive electronics, onboard cache size, rotation speed (15,000 RPM is better than 7,200 RPM of course) and maximum seek latency. If you partition your RAID set into small partitions, and locate your data on the outer cylinders, your performance will go up substantially. Doing this reduces the seek time for a particular partition. Now just add up the number of spindles to figure out your maximum IOPS. Figure 1680 IOPS using 14 spindles on one HBA, or 3360 IOPS using 2 HBAs to 28 disks doing 64,000 sequential I/O(random I/O will be less). Changing the block size of the transfer will either increase MB per second (throughput) or IOPS. Smaller blocks make IOPS go up, larger blocks make throughput go up.

These numbers are for direct I/O, meaning no cache in front of the drives. Adding cache can increase performance dramatically. I have seen 33,000 IOPS on a single LUN during 100% cache hit. So, if you want the best I/O performance available, use a solid-state disk.

Christopher Poelker is a storage architect at Hitachi Data Systems. He answers readers' questions on "Ask the Expert" at
SearchStorage.com.

According to Hoke, a good database administrator will try to spread those tables across multiple devices, so he doesn't create hotspots, and also for availability. However, with virtualization, those two tables may be inadvertently placed on the same disk, because virtualization masks physical locations. "Virtualization is great for provisioning storage because it's easy to move data and add disks. But with a database, because of performance issues, there are potential drawbacks," Hoke says.

CreekPath's Booth agrees: "File placement is still absolutely important with a SAN, although disks have gotten faster. It still comes down to laying down data on a spinning disk, and not putting redo logs and indexes on the same physical drives. That's why it's important for tools to see all the way down to the spindle level, so that when they're laying down data it's not stepping on its own toes."

Marty Ward, director of product marketing at Veritas concurs, although he doesn't think administrators need to be directly concerned with where data is, "DBAs don't need to be considered about which spindle disk data lies on. We take care of that for them. Our logical volumes allow them to move data around on-the- fly to adjust hotspots, even while running." Other vendors are also announcing tools in this area.

Durnford sees things the same way. "Using JBOD, you're much more sensitive to placements [hotspots]. However, using storage arrays, we use their technology to move storage around while running to eliminate hotspots, so that the server doesn't really need to know the physical storage location," he says. Durnford uses built-in tools on his Hitachi, IBM, and EMC arrays to deal with hotspots on his volumes.

SAN issues
"In the old days of NFS [network file systems], due to lack of availability capabilities for Oracle, our customers saw storage corruption," says Oracle's Hoke. Because of those issues, Oracle's validation program for network-attached storage arose out of concern about the nature of NFS," he says.

In a SAN, storage just looks like a SCSI disk-reducing a number of the headaches that came with putting databases on network-attached storage, since the interface is identical to that of SCSI. Unlike NAS - with storage networks Oracle has left it to their storage partners to qualify HBAs - switches and front-end storage FC adapters. Hoke says that with a SAN, "We have not identified anything needing validation."

However, it still takes some work to get SANs performing right for databases. DBA Eisenschmidt - despite running a SAN for his applications and file systems -is skeptical of using SANs for his database storage. "Our SAN, like many I've seen, is optimized for storage capacity and file serving. While Oracle uses Xiotech's Magnitude in their test labs, I was pretty unhappy with the transaction performance we were able to get out of it."

CreekPath's Booth says, "The biggest issue with databases on a SAN is maintaining consistent throughput and access on a database while running other application sharing the same SAN."

Caribou Lake's Leo agrees, saying, "SANs often save an enterprise money by sharing I/O capacity across departments. But sharing I/O resources between the user files of one department and the DBMS storage for another is awful. The database management system's I/O characteristics are completely different than the user data files, and cache coherency becomes an issue."

Royal Bank of Canada's Durnford has opted for separate storage networks to solve some of these problems. He says, "For tape, we're looking at a separate tape SAN. The issue there is database traffic is very time-sensitive, where tape traffic is very backup-oriented. Though you might think it's time shifted, we've found there's overlap."

The other issue most DBAs mention is the complexity of managing storage networks, which are typically managed by separate storage infrastructure staff. "A lot of progress has been made in the last couple years, but most system administrators want SANs that are easier to administrate. I want a more reliable SAN whose caching I'm not afraid to turn on," Eisenshmidt says.

Leo says that when you put a database and storage together, "You realize that your SAN management team must work closely with the DBMS team. We have seen situations where SAN managers have decided to change the SAN configuration because it seems wasteful or odd, not realizing how carefully tuned the setup was."

CreekPath's Booth also agrees, saying that while careful tuning of how storage is configured can eliminate sharing issues, "Both DBAs and storage administrators need to be on the same page."

Despite the issues, some users have had great success running SANs with their databases. John Brooks, vice president of storage engineering at Cleveland, OH-based National City Bank, runs an 18TB SAN, which is based on Brocade switches with IBM and Hitachi arrays. "We have no issues with running on a SAN. Not even performance issues. It just seems to run," he says. Asked if he does anything special for databases, he says "For SQL and Oracle databases, we throw it on the SAN." Brooks attributes the lack of issues to the relationship he has with his storage vendors. He says, "We do a lot of research, primarily in the interoperability area that might account for the lack of issues. We work very closely with our storage vendors to ensure certification and supportability of the environment, in case issues do arise."

Tools and techniques
Because SANs' new role as the primary storage for databases, there's now a need for improved tools and software to manage the combination of database applications and storage. "It would be nice if vendors could provide some sort of management tool to track why a SAN partition was configured in a certain way, thus preventing the loss of historically significant details when SAN management changes hands," says DBA Leo.

"If I could have anything from a SAN to make it better for a database, I'd ask the vendors to write better drivers. Most of the drivers I've seen and worked with just tell the OS the drive is gone when the SAN disconnects, instead of perhaps telling it to wait or trying to free the process," Eisenshmidt says.

Some vendors seem to be listening and are increasingly focusing their efforts on getting their database to run better on SANs, and their storage and infrastructure to better support databases. Take, for example, Oracle's Hardware-Assisted Resilient Data (HARD) initiative (see "Oracle's storage push"). Oracle's Hoke explains, "We're working with hardware partners to verify that when Oracle data is transferred to a disk it can be read. This sounds simple, but with storage management and storage virtualization in storage networks, there are lots of cooks in the broth."

Part of Oracle's HARD initiative, EMC's Oracle Double Checksum tool offers the ability to verify Oracle data as it is written to arrays in microcode, ensuring that corrupted data is never written to disk. Competitor Hitachi Data Systems has also announced support for the initiative, with a microcode-based verification of Oracle data in their Lightning 9900V product.

EMC also offers its Symmetrix Optimizer, which makes recommendations to administrators on where to rebalance data-a must for administrators looking for hot spindles. EMC has a close partnership with Oracle, including a dedicated engineering team at Oracle Redwood Shores. Chuck Hollis, EMC's vice president of products and markets says, "We have 12 to 15 engineers inside Oracle, some dedicated to interoperability, and the rest to feature exploitation such as our Double Checksum product and integration between Oracle and our replication products."

Veritas also offers a host of products designed to work with databases and SANs. The company's Database Edition products for Oracle9i, IBM DB2, and Microsoft Exchange, including mirroring, backup, replication and clustering features designed for those databases. According to Veritas' Ward, an upcoming release of the company's storage resource management product, "will map data all the way from a database environment in Oracle or Exchange, through files, all the way down to disks to show you which spindles and extents the data is written to."

BMC offers its PATROL for Storage Management product. BMC's Rogers says, "When you're working with Oracle or Sybase or Microsoft SQL server and even Exchange, we can expose that from a database perspective. For example; we can expose an Oracle instance that is on a Solaris box and attached to a SAN, and see where it is spread about. We can expose that through a topology manager-all of the connections, single points of failure, and the way the database is laid out on disk."

Despite complexities of managing databases running on storage networks, the benefits are clear. For large databases, there's really no better option that a SAN because of the level of performance, consistency, redundancy, and scalability that a storage network provides. Storage and database administrators alike are finding that the only way to support their mission critical applications is to deploy their databases on a storage network. With so much at stake, storage and database vendors will continue to work more in tandem to make their products work better together (see "Vendor independence: Are the trade-offs worth it?").

This was first published in November 2002

There are Comments. Add yours.

 
TIP: Want to include a code block in your comment? Use <pre> or <code> tags around the desired text. Ex: <code>insert code</code>

REGISTER or login:

Forgot Password?
By submitting you agree to receive email from TechTarget and its partners. If you reside outside of the United States, you consent to having your personal data transferred to and processed in the United States. Privacy
Sort by: OldestNewest

Forgot Password?

No problem! Submit your e-mail address below. We'll send you an email containing your password.

Your password has been sent to: