When I talk to people and companies who are just starting out in big data, I usually hear something about Hadoop. I’ll hear things like “If we are going to get into big data, we’ll need to implement Hadoop” or “we don’t have any Hadoop experience so we’ll need to gain that skill before we get into big data.”
At some point in the recent past, Hadoop became synonymous with big data, which is a bit disconcerting. Hadoop is not a requirement for big data, nor is it a required skill for anyone trying to break into big data. Sure, Hadoop is very helpful in combining and analyzing large data sets, but it isn’t something that you must have to ‘do’ big data.
When I hear people talk about the need to learn or implement Hadoop before they can do anything related to big data, I always tell them to ignore Hadoop, at least for now. Take on some small data analysis projects before implementing new systems. Make sure your strategy is sound first, then worry about how to implement that strategy.
Now, you may think that I dislike Hadoop. I’m actually a huge fan of the Hadoop platform and believe that it should be at the top of the list of platforms for every organization to consider. Hadoop is a major component of most big data initiatives, which is one of the drivers behind people automatically thinking of Hadoop when they think of big data.
Hadoop has become so popular because it provides an effective and efficient architecture to store all types of data, scales very easily, and allows queries and analysis to be performed on that data. It solves many of the problems organizations face when working with large data sets. Hadoop and the tools built around it provide functionality to address data integration, data visualization, in-memory analytics, interactive analytics and in-database queries.
Hadoop gives an organization a great platform on which to build big data processes and analytical approaches. There’s a great deal of value to be found in using Hadoop. In fact, there’s so much value that companies like SAS have built in-memory analytics functionality that works with Hadoop, making use of the data and infrastructure that many organizations already have in place.
While you don’t need Hadoop for big data, it’s a great fit for big data. If you want to get into very large data sets and use cutting-edge platforms and systems to analyze your data, Hadoop will give you the underlying platform for your big data initiatives.
This post is brought to you by SAS.