I like data as much as the next guy, but I am troubled by all the empty promises of big data. Data is good, but too much of it is unwieldy, and it becomes hard to separate the signal from the noise. No matter how sophisticated our data-sifting mechanisms become, it is still ugly. What worries me even more is our inherent tendency to see patterns in tea leaves, which could lead to hypotheses and theories built on spurious patterns in Big Data. Nassim Nicholas Taleb, in his book Antifragile, talks about this problem:
In business and economic decision making, reliance on data causes severe side effects – data is now plentiful thanks to connectivity, and the proportion of spuriousness in the data increases as one gets more immersed in it. A very rarely discussed property of data: it is toxic in large quantities, even in moderate quantities.
… noise and randomness can also use and take advantage of you, particularly when totally unnatural, as with the data you get on the Web or through the media.
The more frequently you look at data, the more noise you are disproportionally likely to get (rather than the valuable part, called the signal); hence the higher the noise-to-signal ratio. And there is a confusion which is not psychological at all, but inherent in the data itself.
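To make the quoted point about observation frequency concrete, here is a minimal sketch under a toy model that is my own assumption, not something taken from the passage: a quantity that drifts at a steady rate `mu` per year while being jostled by random noise with yearly volatility `sigma`. Over an interval of length `dt`, the expected change (the signal) shrinks in proportion to `dt`, while the typical random wobble (the noise) shrinks only in proportion to the square root of `dt`. The function name `noise_to_signal` and the parameter values are hypothetical and purely illustrative.

```python
import math


def noise_to_signal(mu, sigma, dt):
    """Noise-to-signal ratio over an observation interval dt (in years).

    Assumed toy model (random walk with drift): the change over dt has
    mean mu * dt (signal) and standard deviation sigma * sqrt(dt) (noise).
    """
    signal = mu * dt
    noise = sigma * math.sqrt(dt)
    return noise / signal


# Illustrative, made-up parameters: 10% yearly drift, 15% yearly volatility.
mu, sigma = 0.10, 0.15

# Observation intervals expressed in years (252 trading days, 8-hour days).
intervals = {
    "yearly": 1.0,
    "monthly": 1 / 12,
    "daily": 1 / 252,
    "hourly": 1 / (252 * 8),
}

ratios = {name: noise_to_signal(mu, sigma, dt) for name, dt in intervals.items()}

for name, ratio in ratios.items():
    print(f"{name:8s} noise-to-signal = {ratio:6.1f}")
```

In this model the ratio equals `sigma / (mu * sqrt(dt))`, so it grows steadily as the interval between looks shrinks: the same underlying process looks far noisier when checked hourly than when checked yearly.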
In the world of Twitter, with Google capturing every search, we are fooling ourselves into believing that more data is better; it is actually worse, given how frequently we observe it. Maybe if we spread out the time variable, it could reveal some interesting dynamics. Anyway, it has been a lot of fun to read Antifragile over the last couple of days, and I am fortunate to be in a place that is tranquil and peaceful, where inspiration is easy: all I have to do is look out the window or walk out the door. I am in our summer house in Thingvellir, surrounded by mountains, lakes, glaciers and just pure nature, Icelandic style.
Here is a troubling consequence of big data: Surveillance