Big Data Starts with Data Management

Over on the Obsessive-Compulsive Data Quality blog, Jim Harris recently wrote:

 While organizations of all sizes are rightfully excited about the business potential of using big data, this excitement needs to be balanced by acknowledging the business risks associated with not governing the ways big data is used.

Well said.

Many organizations have been caught up in the ‘hype’ of big data. The good news is that the hype is generally driven by real-world success at organizations using big data. That said, there are risks involved in big data projects.

There are risks on the input side (the data that you use) and risks on the output side (the conclusions you draw when you don’t understand the context of the data you are analyzing). To be successful ‘doing’ big data, organizations need to understand both the inputs and the outputs. Starting with data management will help mitigate these risks, since a good data management approach allows organizations to keep data quality in mind from the beginning of a big data project.

Starting with data management and data governance helps you understand and ‘control’ your data and reduce risks from the outset. Additionally, governance allows you to manage multiple aspects of your data, including how and when data is collected, who has access to it and how it is archived.

When approaching big data projects, consultants and vendors will cover a lot of ground. They’ll talk about the value big data can bring. They’ll talk about systems and analytical approaches. Some may talk about statistics and visualizations. Before you dive too deeply into any of these necessary topics, make sure to ask these same folks what they are proposing for data management and data governance.

This post was written as part of the IBM for Midsize Business program, which provides midsize businesses with the tools, expertise and solutions they need to become engines of a smarter planet. I’ve been compensated to contribute to this program, but the opinions expressed in this post are my own and don’t necessarily represent IBM’s positions, strategies or opinions.
