More data means better decisions and smarter, more efficient processes. Right?
Isn’t that the point?
It’s an intriguing premise, and one most people seem to believe: just combine massive quantities of data with the requisite analytics and computing power, and you get the right solution to the problem. Or the best answer to the question.
Microsoft Research’s Kate Crawford refers to this as big data fundamentalism: the unquestioning faith that more data automatically means better decisions.
In Why Big Data is Not Truth, Quentin Hardy describes it this way. “The term [big data] suggests assembling many facts to create greater, previously unseen truths. It suggests the certainty of math. That promise of certainty has been a hallmark of the technology industry for decades. With big data there are even more hazards, some human and some inherent in the technology.”
Put another way, “Big data can reduce anything to a single number, but you shouldn’t be fooled by the appearance of exactitude.” It’s just one of the concerns that Gary Marcus and Ernest Davis identify in Eight (No, Nine!) Problems With Big Data.
Take the Google Flu Trends tool, which the authors of The Parable of Google Flu: Traps in Big Data Analysis have likened to a Dewey Defeats Truman moment. Drawing on the nearly half a billion searches Google processes each day, the tool claimed to have discovered a close relationship between flu-related searches and actual occurrences of the flu.
According to Google, “By counting how often we see these search queries, we can estimate how much flu is circulating in different countries and regions around the world.” Well, it turns out that it’s one thing to identify correlations in big data. It’s another altogether to use those correlations to make accurate predictions or determine causes. Google Flu Trends dramatically overestimated flu cases when compared to actual flu data.
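To see how a model built purely on correlation can go wrong, here is a minimal sketch with made-up numbers (not Google's actual method or data): a least-squares fit of flu cases against search volume works fine while the two move together, then badly overestimates the moment searches spike for a reason unrelated to illness, such as news coverage.

```python
# Hypothetical illustration with invented numbers: fit cases to search
# volume, then predict during a media-driven search spike.

def fit_line(xs, ys):
    """Ordinary least squares for y = a*x + b."""
    n = len(xs)
    mx = sum(xs) / n
    my = sum(ys) / n
    a = sum((x - mx) * (y - my) for x, y in zip(xs, ys)) / \
        sum((x - mx) ** 2 for x in xs)
    b = my - a * mx
    return a, b

# Training period: searches and reported cases happen to move together.
searches = [100, 120, 150, 200, 250]
cases = [10, 12, 15, 20, 25]

a, b = fit_line(searches, cases)

# A news story triggers 600 flu searches, but actual cases stay near 30.
predicted = a * 600 + b
actual = 30
print(f"predicted {predicted:.0f} cases, actual {actual}")
```

The correlation in the training window is real, but the model has no way to know that search volume can decouple from illness, which is the gap human judgment has to fill.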
Don’t misunderstand: there are many problems and applications for which big data can and does provide the right solution. And the list is growing as our analytics and experience with big data evolve.
But there are also problems that just seem to defy big data computing solutions. For these problems, the key to better decision-making may just lie in the human element. In other words, the solutions to these problems require people to discern meaning and provide essential insight, and to recognize what the data is or is not telling them.
At least that’s the conclusion of data mining innovator Shyam Sankar. Named among CNN’s top 10 Thinkers for 2013, he argues that there are problems for which you can’t just algorithmically data mine your way to a solution. Problems like adaptive adversaries and unrepeatable problems buried in big data, where solutions require a blending of human insight and creativity with sophisticated algorithms and raw computing power. For solutions to these problems, he advocates a form of symbiotic relationship between human and computer.
Check out Shyam Sankar’s popular TED Talk, The Rise of Human-Computer Cooperation, for more on the value of human insight.
If you’re intrigued by the possibilities and the challenges of big data, join us Wednesday evening, May 14, for our Evening Forum program Big Data, Small Things, Predictive Analytics.