Editor's note
Hadoop technology has been discussed hand in hand with big data for some time now, but IT professionals still don't know the full extent of what the technology can do or how to use it.
The open source Hadoop framework is modeled on Google's MapReduce programming model and can process large data sets at a granular level. It offers analytics at a low cost and high speed that some analysts say can't be achieved any other way. Essential to the effectiveness of Hadoop is the Hadoop Distributed File System (HDFS), which enables parallel processing by spreading data across the nodes of a cluster and provides fault tolerance.
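To make the MapReduce idea concrete, here is a minimal single-process sketch in Python of the map, shuffle and reduce phases applied to a word count. It is not Hadoop code and uses no Hadoop API. The function names (`map_phase`, `shuffle`, `reduce_phase`, `word_count`) and the sample `blocks` are hypothetical; each string stands in for a block of data that a real cluster would store on a different node.

```python
from collections import defaultdict

def map_phase(block):
    """Emit a (word, 1) pair for every word in one block of text."""
    return [(word.lower(), 1) for word in block.split()]

def shuffle(mapped_pairs):
    """Group all emitted values by key, as happens between map and reduce."""
    grouped = defaultdict(list)
    for key, value in mapped_pairs:
        grouped[key].append(value)
    return grouped

def reduce_phase(grouped):
    """Sum the counts collected for each word."""
    return {key: sum(values) for key, values in grouped.items()}

def word_count(blocks):
    # In a real cluster each block would be mapped on the node that stores it;
    # in this sketch the blocks are simply processed one after another.
    mapped = []
    for block in blocks:
        mapped.extend(map_phase(block))
    return reduce_phase(shuffle(mapped))

# Hypothetical sample data: each string stands in for one block.
blocks = ["big data needs storage", "Hadoop stores big data"]
counts = word_count(blocks)
print(counts["big"], counts["data"], counts["hadoop"])  # prints: 2 2 1
```

Because each call to `map_phase` depends only on its own block, the map step is the part that a cluster can run on many nodes at once.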
However, HDFS is the source of one of the main issues users see with Hadoop technology: expanded capacity requirements, because Hadoop stores three copies of each piece of data by default in case a DataNode fails or is taken offline. The NameNode, which controls how data is copied and distributed, is a separate concern: in early Hadoop versions, it is a single point of failure. Other complaints point to the complexity of Hadoop's Java-based framework.
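The capacity cost of replication is simple arithmetic, sketched below in Python. The replication factor of three comes from the text above. The function names and the 100 TB figure are illustrative assumptions, and the sketch ignores real-world overheads such as temporary job output and free-space headroom.

```python
def raw_capacity_needed(logical_tb, replication_factor=3):
    """Raw disk capacity (TB) needed to hold the data plus all replicas."""
    return logical_tb * replication_factor

def usable_capacity(raw_tb, replication_factor=3):
    """Logical data (TB) that fits on a given amount of raw disk."""
    return raw_tb / replication_factor

# Illustrative figure: 100 TB of data to analyze.
print(raw_capacity_needed(100))  # 300
print(usable_capacity(300))      # 100.0
```

In other words, under these assumptions a cluster sized for 100 TB of data needs roughly 300 TB of raw disk.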
Despite the hurdles with Hadoop technology, analysts and users say the benefits are worth it. To help you determine that for yourself, this guide will walk you through the basics of what Hadoop technology can achieve, lay out the main concerns about the technology, and outline how it works with storage and the cloud.
1. Dealing with Hadoop pain points
Despite its popularity, criticism of Hadoop ranges from the requirement for a specialized skill set to several single points of failure in the Hadoop cluster. In the following links, you'll find explanations of these and other Hadoop issues, and learn how to confront them.
-
Article
Hadoop technology creates problems for big data analytics
Described as cutting-edge, hot, niche and hard to use, Hadoop, like all celebrities, has its shining moments and dismal displays.
-
Article
Anticipating the results of an HDFS infrastructure
Analyst John Webster details issues with Hadoop technology and what users can expect from Hadoop Version 2.0.
-
Article
Dealing with problems in Hadoop and MapReduce
Big data users are facing challenges when using Hadoop and its MapReduce programming model. But taking some good first steps can help avoid problems.
-
Article
Why Hadoop isn't essential to big data
Expert Jon Toigo explains why Hadoop technology and big data are frequently used together, but argues that Hadoop has a number of drawbacks.
2. Understanding Hadoop technology and storage
Because Hadoop stores three copies of each piece of data, storage in a Hadoop cluster must be able to accommodate a large number of files. Traditional storage systems may not always be suited to the Hadoop architecture. The links below explain how Hadoop clusters and HDFS work with various storage systems, including network-attached storage (NAS), storage area networks (SANs) and object storage.
-
Article
The effect of Hadoop technology on storage
Storage Switzerland analyst Colm Keegan explains how to determine whether a SAN or NAS should be used as primary storage with Hadoop.
-
Article
Can Hadoop technology be used with shared storage?
Storage expert John Webster discusses three ways to use shared storage with Hadoop technology in this Ask the Expert answer.
-
Article
Benefits and challenges when using Hadoop clusters
Brien Posey explains how Hadoop clusters can be extremely beneficial for processing large amounts of unstructured data, but they aren't ideal for all environments.
3. How Hadoop technology works with the cloud
Hadoop can be useful for analytics across cloud storage because of its parallel-processing capability. Because Hadoop can process data across many servers, large amounts of data stored in the cloud can be searched and analyzed at high speeds. From the links below, you'll learn how using Hadoop in the cloud works and how cloud storage can help address some common Hadoop problems.
4. Experts discuss Hadoop technology
Now that you have a better understanding of how Hadoop technology works with big data, watch the videos below to get experts' takes on how well Hadoop works and the best ways to use it.
-
Video
Webster on storage for a Hadoop Cluster
Of all the ways to handle the storage requirements of big data analytics, Hadoop technology is receiving the most attention. Find out why.
-
Video
Understanding HDFS and NameNode
Learn why you should be on the lookout for issues with NameNode and HDFS when using Hadoop technology for big data storage.
-
Video
Benefits of using a Hadoop architecture in big data environments
In a video interview, TechTarget's Wayne Eckerson discusses the benefits and challenges of deploying Hadoop-based systems in big data environments.
-
Video
Storage alternatives for a Hadoop infrastructure
Hadoop storage systems traditionally call for the use of embedded direct-attached storage (DAS) for hardware-based storage within the Hadoop MapReduce framework. But alternatives exist.