It is about time that we, as an industry, learn how to save data properly. The fault lies not with you, the customer, but with how most data management tools log what is being saved to disk and tape, and when it was saved. It has also become an increasingly pressing issue for customers who need to track where specific files and volumes were last saved and to locate that data quickly.
Today, that can end up being a monumental task, especially if you need to track down the version of the data that was originally corrupted in order to recover core business data. But this may be changing.
A warning: The rest of this column looks at some early-stage technology that could really help the situation described above.
If you are looking for a lot of vendors who can do this today, you will need to wait, but it is coming quickly from several innovative startups. This year might be the year that data management takes on a new dimension: something I will refer to as "rapid recovery" technology, or forensic data.
This technology seeks to solve a problem that has been around for a number of years: namely, organizing how data is saved, tracked and preserved in a way that makes it easy to access.
Why is this important? Business continuity and compliance. In fact, compliance with a plethora of government regulations has changed IT priorities about the tagging of data so you know what the data is, who owns it, and most importantly,
For example, in the course of doing a backup, let's say that you discover the database or file you are backing up has been corrupted, and you need to roll back to a previous, uncorrupted version. You end up having to remount the tape, search for the data, shift the data back onto the server and reboot the application.
With rapid recovery, all data changes are documented as they are written to disk or tape, and the time of each change is recorded as well. At least, that appears to be the promise. When corruption is identified, the data can be rolled back to the time just before it became corrupted.
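To make the idea concrete, here is a minimal sketch of a time-stamped change journal with rollback. It is an illustration only, not any vendor's implementation: the names `Change`, `JournaledVolume`, `write` and `view_before` are hypothetical, and it assumes block-level tracking with writes recorded in time order.

```python
from dataclasses import dataclass, field


@dataclass
class Change:
    timestamp: float  # when the write was recorded
    block: int        # which block was written (block-level tracking is an assumption)
    data: bytes       # the contents written


@dataclass
class JournaledVolume:
    # Hypothetical in-memory journal; a real product would persist this.
    journal: list = field(default_factory=list)

    def write(self, timestamp, block, data):
        # Every change is documented along with its time.
        # Assumes callers record writes in non-decreasing time order.
        self.journal.append(Change(timestamp, block, data))

    def view_before(self, timestamp):
        """Rebuild the volume as it stood just before `timestamp`."""
        blocks = {}
        for change in self.journal:
            if change.timestamp >= timestamp:
                break  # stop at the first change at or after the cutoff
            blocks[change.block] = change.data
        return blocks


vol = JournaledVolume()
vol.write(100.0, 0, b"orders-v1")
vol.write(105.0, 1, b"index-v1")
vol.write(110.0, 0, b"orders-v2")
vol.write(120.0, 0, b"CORRUPT")  # corruption identified at time 120

recovered = vol.view_before(120.0)
```

Here `recovered` holds the last good contents of each block, with no tape to remount or search. The recovery point can be any recorded instant rather than a scheduled copy.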
Traditional data protection software might be able to provide a number of views of the data -- either snapshots or point-in-time copies -- but not to the level of detail or time increments that this technology could potentially deliver.
To clarify, this "rapid recovery" method will not replace traditional backup and recovery software, but will complement it where the pain threshold warrants this kind of protection. The vendors offering products here include a number of startups, such as Revivio and FilesX, as well as data protection market player StorageTek.
That is not to say other vendors won't consider incorporating this kind of technology over the next 18-24 months; expect it to be a competitive feature for the next three years. This is also not the same core technology as the emerging datastore and consolidated backup appliances, although both could take advantage of this technology longer term.
What should you consider when evaluating this technology in the short- and medium-term? Here are a number of features to think about:
- Where does the rapid recovery software reside? Today, this software is most likely found on a dedicated storage appliance or server connected to the storage network that monitors the traditional data protection software. It will also likely be found in hosts and storage arrays longer term as it takes off.
- What kinds of data are supported? Mileage will vary on what kinds of data are tracked through these management tools: it could be database, volume, file, or block.
- How is the data tracked? Some of the products on the market today use a journal approach to store metadata; others use simple directories; and a smaller but growing group uses actual databases. (The use of actual databases is the preferred way, longer term, to allow the export of information to third-party products.) Time measurements also vary from product to product.
- How well is this integrated with other storage management/data protection products? Again, mileage varies here, but the goal should be to make sure the product you consider deploying has strong integration with both the existing big-vendor backup and restore products and the new class of disk-to-disk data protection platforms already offered.
- How efficient is this technology? The idea is to reduce the number of copies needed for each data set. The technology should be significantly more efficient than traditional data protection approaches. A select few of these platforms also offer compression techniques to reduce the overall footprint of the data. Choose vendors who can guarantee no data corruption when using compression.
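As a rough illustration of the database approach to tracking mentioned in the list above, the sketch below keeps change metadata in a SQLite table, looks up where a file was last saved, and exports the records as CSV for a third-party tool. The schema, the function names (`open_catalog`, `record_save`, `last_saved`, `export_csv`) and the media labels are all hypothetical; none of them describes an actual product.

```python
import csv
import io
import sqlite3


def open_catalog():
    # In-memory database for illustration; the schema is an assumption.
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE changes ("
        " path TEXT NOT NULL,"         # file or volume that was saved
        " saved_at INTEGER NOT NULL,"  # time of the save, seconds since epoch
        " media TEXT NOT NULL)"        # hypothetical disk or tape label
    )
    return conn


def record_save(conn, path, saved_at, media):
    conn.execute(
        "INSERT INTO changes (path, saved_at, media) VALUES (?, ?, ?)",
        (path, saved_at, media),
    )


def last_saved(conn, path, before=None):
    """Return (saved_at, media) for the newest save of `path`, optionally before a cutoff."""
    query = "SELECT saved_at, media FROM changes WHERE path = ?"
    params = [path]
    if before is not None:
        query += " AND saved_at < ?"
        params.append(before)
    query += " ORDER BY saved_at DESC LIMIT 1"
    return conn.execute(query, params).fetchone()


def export_csv(conn):
    # Plain CSV is one simple way to hand metadata to another product.
    out = io.StringIO()
    writer = csv.writer(out, lineterminator="\n")
    writer.writerow(["path", "saved_at", "media"])
    writer.writerows(
        conn.execute("SELECT path, saved_at, media FROM changes ORDER BY saved_at")
    )
    return out.getvalue()


catalog = open_catalog()
record_save(catalog, "/db/orders.dat", 1000, "tape-017")
record_save(catalog, "/db/customers.dat", 1500, "tape-017")
record_save(catalog, "/db/orders.dat", 2000, "disk-pool-a")

latest = last_saved(catalog, "/db/orders.dat")
prior = last_saved(catalog, "/db/orders.dat", before=2000)
report = export_csv(catalog)
```

Because the metadata sits in ordinary tables, a question such as "where was this file last saved before the corruption?" becomes a single query. That is part of why a database-backed catalog is easier to integrate with other tools than a proprietary journal or directory layout.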
About the author: Jamie Gruener is a senior analyst, focused on the server and storage markets for the Yankee Group, an industry analyst firm in Boston, Mass. Jamie's coverage area includes storage management, storage best practices, storage systems, storage networking and server technologies. Jamie answers reader questions related to storage management issues for SearchStorage.com.
This was first published in June 2003