ILM, ISI keys to real-time data warehousing

Expanding volumes of data are a challenge to those trying to implement real-time data warehouses and business intelligence systems. ILM and ISI help make these systems manageable.

What you will learn from this tip: Ever-expanding volumes of data make it harder than ever to "find the needle in the haystack." This is a particular challenge for those trying to implement real-time data warehouses and business intelligence systems. But analysts say that emerging technologies such as intelligent search and indexing (ISI), together with information lifecycle management (ILM), should make it possible to master this challenge.

With more and more organizations seeking to provide on-demand data warehousing services and real-time business intelligence (BI), storage organizations may find themselves in a squeeze between the need to beef up technology and the need to control costs.

Phil Russom, an analyst at the Data Warehousing Institute, says many companies are drowning in so much data that even though SAN and NAS systems are relatively inexpensive, it is growing more difficult for them to keep up. Russom says that one element of a possible solution to this problem is better management of the complete information lifecycle. "Many companies do a poor job of deciding what data to keep and what data to get rid of," he says.

"Storage arrays are huge, and we have tons of them, so finding something is like finding a needle in a haystack," says Steve Duplessie, founder and analyst at the Enterprise Strategy Group. And, it's even worse when you aren't sure what you are looking for, as is often the case with BI searches. Duplessie says if we knew how to index everything in our environment appropriately, and could even look into the content itself, we could more easily answer most BI questions.

There are also opportunities to use technology more appropriately. Russom says "spinning disk media" is the most expensive way to store data when compared with optical media and even traditional magnetic tape. Because of that cost, some organizations are now reemphasizing the importance of these alternatives to spinning disk. "They have terabytes and petabytes of data that they simply can't maintain just on magnetic disk," he says.

Indeed, he notes that storage doesn't have to be live to be useful -- it can be maintained in "nearline" devices such as optical storage and offline on tape. The conundrum, though, is that for all practical purposes, if you want to access information it needs to be on disk. A solution, says Russom, seems to be emerging with indexing capabilities that make optical disk and tape behave something like a relational database. Some systems (Russom cites the example of FileNet) even let you query offline tape. "With these systems you won't get answers instantly -- it may require an administrator to mount a tape similar to the process with old IBM 360 systems -- but you will be able to find and access low-value data relatively quickly," he says.
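
As a rough illustration of the idea Russom describes, the sketch below keeps a small relational catalog on disk that records where each data set lives, so that nearline and offline media can be queried like a database even though the data itself is not on disk. It is a minimal sketch under stated assumptions, not a description of how FileNet or any other product works: the table layout, tier names, volume labels and the `make_catalog` and `locate` helpers are all hypothetical.

```python
import sqlite3


def make_catalog():
    """Build an in-memory catalog of data sets and the media they live on."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE catalog ("
        "name TEXT PRIMARY KEY, tier TEXT, volume TEXT, year INTEGER)"
    )
    # Hypothetical sample entries, for illustration only.
    rows = [
        ("orders_2004", "disk", "SAN-LUN-07", 2004),
        ("orders_2002", "optical", "OPT-0113", 2002),
        ("orders_2001", "tape", "TAPE-0042", 2001),
        ("claims_2001", "tape", "TAPE-0057", 2001),
    ]
    conn.executemany("INSERT INTO catalog VALUES (?, ?, ?, ?)", rows)
    conn.commit()
    return conn


def locate(conn, year):
    """Return (name, tier, volume, needs_mount) for data sets from one year.

    needs_mount is True for tape, where an administrator may have to
    mount the volume before the data can be read.
    """
    cursor = conn.execute(
        "SELECT name, tier, volume FROM catalog WHERE year = ? ORDER BY name",
        (year,),
    )
    return [(name, tier, volume, tier == "tape") for name, tier, volume in cursor]


catalog = make_catalog()
for entry in locate(catalog, 2001):
    print(entry)
```

The query itself is answered from the catalog on disk; only the retrieval of the data it points to has to wait for a tape to be mounted.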

Duplessie also sees merit in solutions that can streamline on-demand data warehousing services and real-time BI through better search capabilities. He says he is especially enamored with ISI. "Regardless of the application, as unstructured data continues to grow unabated, finding things is becoming more and more impossible," he says.
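
To make the notion of content-level indexing a little more concrete, here is a minimal sketch of an inverted index, a common building block of text search, which maps each word to the documents that contain it. The article does not describe how any ISI product is built, so this is only an assumption-laden illustration: the `build_index` and `search` functions and the sample file names are hypothetical.

```python
import re
from collections import defaultdict


def build_index(documents):
    """Map each lowercase word to the set of document IDs containing it."""
    index = defaultdict(set)
    for doc_id, text in documents.items():
        for word in re.findall(r"[a-z0-9]+", text.lower()):
            index[word].add(doc_id)
    return index


def search(index, query):
    """Return the IDs of documents that contain every word in the query."""
    words = re.findall(r"[a-z0-9]+", query.lower())
    if not words:
        return set()
    # Start from the matches for the first word, then narrow with the rest.
    result = set(index.get(words[0], set()))
    for word in words[1:]:
        result &= index.get(word, set())
    return result


# Hypothetical sample content, for illustration only.
documents = {
    "array1/report.txt": "Quarterly sales report for the northeast region",
    "array2/memo.txt": "Memo on sales targets and storage budget",
    "array3/log.txt": "Backup log for tape library",
}
index = build_index(documents)
print(sorted(search(index, "sales")))
```

Once the index exists, a lookup touches only the index rather than scanning every file on every array, which is what makes the "needle in a haystack" search tractable.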

But don't hold your breath. "We want to be 'on-demand,' but we simply can't be with today's technology," says Duplessie. "Storage devices and the infrastructure need to get much, much smarter about what the content is, where it is and what it might be relevant to."

About the author: Alan Earls is a freelance writer in Franklin, Mass.
