In what might be the best-titled article I’ve read in a while, Vince Kellen writes about the dangers of confirmation bias (or ‘finding what we want to find’) when working with big data.
In the article, titled “Big Data and the Mirror of Erised”, Kellen writes:
Being human beings with a tendency to confirm what we so want to have happen or to relive what felt so good in the past, managers often drift into self-sealing and circular analysis that at first doesn’t seem so wrong. Big data has to poke through the subtle and instinctual responses of data denial.
The ability to ‘poke through the subtle and instinctual responses of data denial’ is one of the differences between ‘good’ data science and ‘poor’ data science. The ability to show people why their biases are wrong differentiates a good data scientist from a poor one.
Fighting confirmation bias in data science is an ongoing issue that must be addressed in every aspect of data analysis. To make matters worse, just about every person within an organization has some sort of bias that can affect data analysis within the organization.
Kellen writes that “much stands in the way between data and a good decision, most of that being human nature. If organizational leaders can develop a collective, emerging, and self-guiding process for analyzing data, anything is possible.”
Now…I’m not sure what a “collective, emerging, and self-guiding process” is exactly, but it sounds good. I think what he’s arguing is that if organizations open up their data to everyone within the organization and then allow that data to speak for itself as much as possible, the possibility of confirmation bias can be reduced.
Kellen provides some interesting ideas for fighting the bias issue. I highly recommend that you jump over to read his entire article.