The world of big data has been growing over the last few years. Everywhere you look today you see people talking about big data and discussing how their organization is doing (or planning to do) big data.
While I’m all for organizations adopting big data and rolling out big data initiatives, I tend to cringe a bit when I learn that many organizations are spending money on big data projects without taking the time to develop a strategic plan for how they will approach their projects.
As part of that strategic planning process, organizations need to consider how they will collect, store, analyze and use that data. What technology platform will they use? Will they try to use their existing infrastructure or build/buy new infrastructure? Will they hire a host of data scientists or train their existing staff members to take on the role of data scientist?
These types of questions are important. Don’t get me wrong: they are often asked but, in my experience, rarely as part of the strategic planning process. Additionally, the discussions around technology and platforms need to go deeper than the surface-level conversations that tend to happen.
The real discussions need to be about which aspects of the technology your organization will use and how you will use them. Take Hadoop as an example: you’ll need to understand how the various parts of the Hadoop ecosystem will fit into your organization’s infrastructure and into your data center. Do you have the capabilities and skill sets to implement a full Hadoop stack? Do you have the skill sets to integrate your existing systems with Hadoop so that all issues regarding data quality, data federation and data visualization are addressed before you commit large outlays of money?
That said, there is a case for starting small. There’s no reason an organization can’t spend a small amount of money to experiment and understand how best to ‘do’ big data. You can start with a small data set, some simple infrastructure and some basic principles to see what you can find. This experimentation can help organizations work through the various ‘gotchas’ that can exist, as well as better understand the software, hardware and people requirements for larger, more complicated problems.
Prior to spending a large amount of money on big data projects, organizations need to take a step back and consider big data like they would any other large undertaking. Get a feel for the costs and requirements of undertaking big data projects. Get a good, strong strategic plan in place for your big data projects and then, and only then, start your implementations.
This post is brought to you by SAS.