First, I need to define constraints before we dig into the meat. What I consider a small to medium-sized business (SMB) is a company that would have a problem justifying a $50,000 purchase for a product that performs one migration and then sits unused for 3 to 5 years until the next one, or that has one or two IT people doing the work, or that thinks a SAN is just a typo for the SAN-D you’d find at a beach. I know IBM, Sun, Symantec et al. have migration services, but I’m looking at the smaller business space, where people need to store more on budgets that were small to begin with.
We’ve recently upgraded our SAN infrastructure and while our data migration chores aren’t all that intense, I’d still prefer that a computer did it. I’ve built some tools to handle my cleanup work (I’ll share them as soon as some bugs are worked out) but only because I couldn’t easily buy something to do the same or better. Now I’ll admit that sometimes I can be blind or ignorant (or both), but I’ve noticed a HUGE gap in the availability of migration tools for the lower end of the SMB spectrum. With me being a part of The Matrix like I am, or akin to Mr. Universe from Serenity, one would think I’d have caught a whiff of something significant.
With all the hoopla about the structured and unstructured data needs of SMBs and vendors tripping over themselves to create products for this space (HP’s new MSA and NetApp’s StoreVault come to mind), it’s hard for me to swallow that data migration for that very same SMB space wouldn’t be just as hot.
If you sense some irritation on my part, you’re spot on. While it may not directly affect me, it most certainly does affect those who rely on my expertise. I’d like to be able to say, “Hey, just get Product X from Vendor Y and it will take care of at least 60% of your migration woes.”
Bigger irritation: From my admittedly unscientific data gathering, I’ve found that when I ask SMBs about migration, the most popular answer is backup software. “Well, we were planning on just restoring the server to a folder on the array. Why? Is that bad?”
Most times the answer is yes, it is bad. In my presentation at Storage Decisions I talked about what’s actually stored on your centralized storage and file shares, and most of it is stuff that shouldn’t be there, like MP3s, AVIs, etc. By restoring backups to the new array, they would simply be carrying forward a wasteful, and potentially harmful, data storage practice.
Furthermore, if you subscribe to the numbers from major analyst houses, you wouldn’t be able to restore 30% to 40% of the data you had on backups anyway, so relying on them for migration could be an exercise in futility. My own unscientific polling has turned up a similar share of people who say, “We can never pull back anything we really need in a pinch, unless you count the i386 directory.”
I’m sure there are a couple of you out there thinking, “This guy is losing it. Why can’t he just create a filter for his backup software and have it selectively restore said data, or, better still, create a simple shell script (bash, csh, PowerShell, VBScript, JScript, insert the rest here) with some regular expressions to handle that work? He could even stick something into Visual Basic or managed code (ducks the rotten tomatoes) that does the work!”
While I (or the people I work with) could certainly do all those things (with the help of Google and copious amounts of coffee), we aren’t the targets I’m talking about here. The people I’m talking about, who rely on the expertise of people like me or of consultants, cannot, or in some cases will not, do the same, for their own reasons. Homegrown apps also require deep knowledge of the environment you are crafting them for, and you can see how that becomes a difficult proposition for one person once you are supporting 15 or 20 different small environments simultaneously.
Because of my own issues dealing with not only illegal or unwanted data but also depreciated legacy data that can be deleted rather than dragged along through every iteration of a data storage infrastructure, I have come up with a simple and reliable data classification scheme. I should be able to ring up vendor A or B and say, “Give me something that will take my criteria and email me when it’s done moving and verifying my stuff, but DOESN’T cost $50,000.” There are cheap tools out there that will transfer files and NTFS permissions, even shares and share permissions. But, while very helpful, they don’t address the issue of the data itself.
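To make “take my criteria, move it, verify it” a little more concrete, here is a minimal Python sketch of that kind of criteria-driven migration. It is only an illustration under stated assumptions: the extension list, the five-year age cutoff, and the function names `classify` and `migrate` are hypothetical and are not the classification scheme mentioned above, which isn’t spelled out here. It copies file contents and timestamps but does not carry over NTFS permissions or shares, and it returns a report rather than sending an email.

```python
import hashlib
import shutil
import time
from pathlib import Path

# Hypothetical criteria, chosen purely for illustration.
UNWANTED_EXTENSIONS = {".mp3", ".avi"}
MAX_AGE_DAYS = 5 * 365  # assumption: untouched for ~5 years counts as stale


def classify(path, now=None):
    """Return 'unwanted', 'stale', or 'migrate' for a single file."""
    now = time.time() if now is None else now
    if path.suffix.lower() in UNWANTED_EXTENSIONS:
        return "unwanted"
    age_days = (now - path.stat().st_mtime) / 86400
    if age_days > MAX_AGE_DAYS:
        return "stale"
    return "migrate"


def sha256(path):
    """Hash a file in 1 MiB chunks so large files don't fill memory."""
    digest = hashlib.sha256()
    with path.open("rb") as handle:
        for chunk in iter(lambda: handle.read(1 << 20), b""):
            digest.update(chunk)
    return digest.hexdigest()


def migrate(source, target):
    """Copy files classified as 'migrate' and verify each copy by checksum.

    Returns a dict mapping each category to a list of relative paths.
    """
    source, target = Path(source), Path(target)
    report = {"migrate": [], "unwanted": [], "stale": [], "failed": []}
    for path in sorted(source.rglob("*")):
        if not path.is_file():
            continue
        relative = path.relative_to(source)
        category = classify(path)
        if category == "migrate":
            destination = target / relative
            destination.parent.mkdir(parents=True, exist_ok=True)
            # copy2 keeps timestamps; it does NOT preserve NTFS ACLs.
            shutil.copy2(path, destination)
            if sha256(path) != sha256(destination):
                category = "failed"
        report[category].append(relative.as_posix())
    return report
```

Calling `migrate("/old/share", "/new/array/share")` (both paths are placeholders) would leave the MP3s and stale files behind and hand back a report of what was moved, skipped, or failed verification. Even a toy like this shows where the real effort goes: the copy loop is trivial, while choosing criteria that are safe for a given environment is the part that needs knowledge of that environment.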
I don’t live in an ivory tower, nor am I a strict idealist, but this topic does aggravate me. If companies can figure out how to turn one reasonably powered computer into 20 adequately powered ones for free, or have my economy car listen and respond to me intelligently when I want to listen to the Black Eyed Peas, why can’t I have a data migration tool that doesn’t cost as much as an SMB’s quarterly gross income?