The world of big data has been growing over the last few years. Everywhere you look today you see people talking about big data and discussing how their organization is doing (or planning to do) big data.
While I’m all for organizations adopting big data and rolling out big data initiatives, I tend to cringe a bit when I learn that many organizations are spending money on big data projects without taking the time to develop a strategic plan for how they will approach their projects.
As part of that strategic planning process, organizations need to consider how they will collect, store, analyze and use that data. What technology platform will they use? Will they try to use their existing infrastructure or build/buy new infrastructure? Will they hire a host of data scientists or train their existing staff members to take on the role of data scientist?
These types of questions are important. Don’t get me wrong: they are often asked but, in my experience, rarely as part of the strategic planning process. Additionally, the discussions around technology and platforms need to go deeper than the surface-level discussions that tend to happen.
For example, saying that your company will use Hadoop for your big data projects isn’t enough. Hadoop is much more than just one platform; it is an ecosystem.
The real discussions that need to be held are around what aspects of technology your organization will use and how you will use them. In the example of Hadoop, you’ll need to understand how various aspects of the Hadoop ecosystem will fit into your organization’s infrastructure and into your data center. Do you have the capabilities and skill sets to implement a full stack of Hadoop? Do you have the skill sets to integrate your existing systems with Hadoop to ensure that all issues regarding data quality, data federation and data visualization are addressed before you commit large outlays of money?
With all of that said, there is something to be said for starting small. There’s no reason that an organization can’t spend a small amount of money to experiment and understand how to best ‘do’ big data. There’s no reason you can’t start with a small data set and some simple infrastructure and some basic principles to see what you can find. This experimentation can help organizations work through the various ‘gotchas’ that can exist as well as better understand the software, hardware and people requirements for larger, more complicated problems.
Prior to spending a large amount of money on big data projects, organizations need to take a step back and consider big data like they would any other large undertaking. Get a feel for the costs and requirements of undertaking big data projects. Get a good, strong strategic plan in place for your big data projects and then, and only then, start your implementations.
This post is brought to you by SAS.
2 responses to “Before Jumping into Big Data, Know Where and How You’re Jumping”
Nice post, but I am not a fan of the term ‘big data project.’ Think that Amazon, Apple, Facebook, Google, Netflix, and others view Big Data this way?
I’m sure they don’t think of big data projects…in fact, they most likely view big data as just something that they do as part of their operations.
That said, most organizations aren’t Apple, Amazon, Facebook, Google or Netflix. Many organizations must view big data as a ‘project’ initially since they really don’t know what they are doing or how to get started. In my experience, using the term ‘big data project’ has helped organizations get their processes lined up to begin working with big data.