There’s been talk for a while. I’ve left clues around the office and said too much on occasion. Suspicions have been raised. Finally, I’m willing to admit there’s something going on, or at least I can no longer ignore the frisson of excitement that tickles me when I find myself near a data centre. There she is, all cooled and sparse. I can’t keep this to myself any longer.
The country of origin of a widget, how many miles from your premises it starts its journey, how long it takes to unwrap and discard the packaging (de-trash is the catchy insider term) and the failure rate of that widget. I used to write databases for a living and this was the sort of data I thought about, looked at, managed and compared. It wasn’t big data as we know it now, but it was tens of millions of pieces of information.
Now I reflect on what we could have deduced and changed if we’d been able to overlay additional data, for example the weather data from the factories of origin on specific shipment days. Did humidity play a part in the failure of a widget? Could we have advised a factory owner that he needed dehumidifiers on from May to September?
That was millions of pieces of data. Our capacity in technical disciplines is now advanced enough to reveal far more than millions or billions of pieces of data in a single encounter. Medical imaging, for instance, or science like the Large Hadron Collider, presents data sets so vast that only now do we have the capacity to receive and process the output of this sort of research. These are worthwhile endeavours with societal and global benefits, but the proliferation of mobile devices with GPS and social apps has also created billions of co-ordinates and interactions that form a big-data set of commercial and governmental interest.
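To make the humidity question concrete, here is a minimal sketch of what overlaying weather data on shipment data might look like. Everything in it is invented for illustration: the factory names, dates, figures, the 70% humidity threshold and the `failure_rate_by_humidity` helper are all assumptions, not data or code from any real system.

```python
from collections import defaultdict

# Hypothetical shipment records: (factory, ship_date, units_shipped, units_failed)
shipments = [
    ("factory_a", "2012-06-01", 1000, 30),
    ("factory_a", "2012-11-01", 1000, 10),
    ("factory_b", "2012-07-15", 500, 20),
    ("factory_b", "2012-12-15", 500, 5),
]

# Hypothetical weather overlay: relative humidity (%) at each factory on each shipment day
humidity = {
    ("factory_a", "2012-06-01"): 85,
    ("factory_a", "2012-11-01"): 45,
    ("factory_b", "2012-07-15"): 90,
    ("factory_b", "2012-12-15"): 40,
}

HUMID_THRESHOLD = 70  # assumed cut-off between "humid" and "dry" days


def failure_rate_by_humidity(shipments, humidity, threshold=HUMID_THRESHOLD):
    """Join shipments to weather readings and return the failure rate per humidity band."""
    totals = defaultdict(lambda: [0, 0])  # band -> [units shipped, units failed]
    for factory, day, shipped, failed in shipments:
        reading = humidity.get((factory, day))
        if reading is None:
            continue  # no weather reading for this shipment, so leave it out
        band = "humid" if reading >= threshold else "dry"
        totals[band][0] += shipped
        totals[band][1] += failed
    return {band: failed / shipped for band, (shipped, failed) in totals.items()}


rates = failure_rate_by_humidity(shipments, humidity)
for band, rate in sorted(rates.items()):
    print(f"{band}: {rate:.1%} of widgets failed")
```

A gap between the two bands would only be a prompt to look closer, not proof that humidity causes failures; a real analysis would need far more shipments and a check for other seasonal factors.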
This ranges from analysis of when people are likely to Tweet, to what they are likely to post about on Facebook after 10pm, to cumulative data on clusters of a specific demographic and the associated impact on public transport. With access and input to big-data, litter will be cleared and traffic lights will change. Venture funds are already limbering up, and fortunes will be made as old hands and sharp start-ups figure out how to analyse buying habits and trends, guiding brands to an eager buyer, place or demographic.
“Listening to the data is important… but so is experience and intuition. After all, what is intuition at its best but large amounts of data of all kinds filtered through a human brain rather than a math model?”
– Steve Lohr
Administrations globally have taken heed of The World Economic Forum’s view that big-data is a new asset in economic terms, not unlike gas or oil. No doubt there will be skirmishes about who owns data and who gets access. Hopefully, information socialism will win out, and as with Open-Source, sense will prevail; big-data will be as transparent and as accessible as the internet itself.
On TWiT last week Leo Laporte remarked that facts ‘have been commoditised’, a comment that illustrates how we as a society no longer value facts as we once did; rather, it is the interpretation of many related and unrelated facts that is becoming valuable. As the big-data industry matures we will undoubtedly see stumbles by companies and people that we’ll call the Apophenians. Misinterpretation will be widespread for a while, as commercial pressures push sellers and buyers of analysis to find something sellable. Perhaps the best apprenticeship for the commercial big-data industry is to watch and follow the greatest exponents of epidemiology, who have been doing this sort of thing for quite a while.
A big-data project under way in Helsinki and the surrounding region, the Helsinki Region Infoshare (HRI), is a particularly encouraging sign of openness in government giving way to an open society, with the bonus of benefit to an entrepreneurial community. One of the stated aims of the HRI is:
Get started with big-data
- Big data : Wikipedia
- Data Catalog at World Bank
- Fingal County Council Open Data
- Oireachtas Bills 1997-2010 Data Set at opendata.ie
- UK Internet Access Data Set @ data.gov.uk
- Big Data Conferences & Events