External storage might make sense for Hadoop
This article is part of the February 2014 Vol. 12 No. 12 issue of Storage magazine
Using Hadoop to drive big data analytics doesn't necessarily mean building clusters of distributed storage; good old external storage might be a better choice. The original architectural design for Hadoop used relatively cheap commodity servers and their local storage in a scale-out fashion. Hadoop's original goal was to enable cost-effective exploitation of data that previously wasn't viable to work with. We've all heard about big data volume, variety, velocity and a dozen other "v" words used to describe these previously hard-to-handle data sets. Given such a broad definition, most businesses can point to some kind of big data they'd like to exploit. Big data is growing bigger every day, and storage vendors with their relatively expensive SAN and network-attached storage (NAS) systems are starting to work themselves into the big data party. They can't simply leave all that data to server vendors filling boxes with commodity disk drives. Even if Hadoop adoption is just in its early stages, the competition and confusing ...
Features in this issue
This "Sweet 16" roster of storage products represents the leading technical innovation of the past year.
Don't make your DR planning process harder than it is by trying to do too much or cutting corners. Careful planning is key to a successful recovery.
There are two sides to the big data story: analytics using vast numbers of small files, and dealing with storage for really big files.
Our latest survey charts the storage architecture alternatives readers are using in their storage shops.
Columns in this issue
Cloud closures, flash-in-the-pan solid-state vendors … storage might seem a little more dangerous these days, but it just might be innovation at work.
Filling drives with helium doesn't advance the art of hard disk design; it just makes it possible to stuff more old tech into a new package.
There aren't many reasons not to virtualize your servers, but there are plenty of compelling data protection reasons to virtualize them all.
Using Hadoop to drive big data analytics doesn't necessarily mean building clusters of distributed storage; a good old array might be a better choice.