NetApp and other storage vendors have introduced products that incorporate Hadoop; are these single-use systems or can they be used for other apps (databases, user shares and so on)?
This is a complex question, and vendors are taking several approaches to it. Because Hadoop is rapidly gaining popularity as an analytics platform, especially in enterprise data centers, many storage and storage-related vendors are climbing on the bandwagon with ways to improve Hadoop in production data center environments. One approach is to use shared storage as a large repository of Hadoop data for data protection, archive, security and data governance purposes. Another is to attach high-performance storage directly to Hadoop data nodes so that storage capacity can scale without adding more data nodes to the cluster, while each data node maintains its required performance characteristics. Independent software vendors are addressing the problem of moving data into and out of Hadoop quickly and from multiple data sources. A growing list of systems and storage vendors is incorporating Hadoop into preconfigured products. The intent is to give users an easier way to benefit from Hadoop-based analytics without the do-it-yourself aspect of early Hadoop implementations. There are also ways to incorporate Hadoop within a storage architecture, most likely as a distributed storage cluster. Expect this sphere of vendor offerings and technologies to grow.
This was first published in November 2012