Mangling a title there from Lance Armstrong, who, it may be remembered, suggested in his book “It’s Not About the Bike” that it wasn’t the equipment he was pedalling that made him so successful. The title’s implication, of course, is that it was Lance’s special qualities, rather than his ride’s. That book was published in the pre-Oprah interview days. Post-Oprah we learned exactly what some of those special qualities were, but the point remains the same. When we see something new and exceptional, we look at the bits. The equipment. The wheels, the levers, the chains and pulleys, the big data centres, the huge power supplies, the endless racks of hard disks.
There’s some fun in that, the technical puzzle, for certain. Even in something as simple as a bicycle. Over the years the solution to the bicycle puzzle has largely settled on the diamond-shaped safety frame with rear-wheel drive, but it wasn’t always so clear.
For big data, the puzzle is more complicated and the approaches more varied. The first time I ventured into a large cloud-computing company’s datacentre, I felt like Keanu Reeves’ character Neo when he woke up in the battery farm in The Matrix.
I remain interested in the “how”. I spent a bit of time building big-data cloud infrastructure to better understand it too. I can report, with some relief, that there are no slime-covered humans in pods powering Amazon Web Services datacentres. In the management offices one cannot be so certain.
Beyond the “how”, the question of how it all fits together, is the perennially larger question: why?
Answering the “why?” is what will lead us to understanding where this is all going. That’s interesting. We know why bicycles ended up how they did. That diamond-shaped frame consisting of interlocking triangles is both strong and light. It takes multiple failures to collapse a frame. The curvature of the forks helps the bicycle steering self-centre. The pneumatic tyres use the whole volume of enclosed air for suspension instead of just the tiny patch in contact with the road. It took a while to get there, but the bicycle has evolved to fit its purpose.
So what is the purpose of Big Data*? If you are looking for a commercial answer, rather than an academic response, it is simple. Profit. Google reputedly owns the biggest commercial data store. It makes a useful target for assessing this claim. Google are the Big Data kings. Google’s ability to analyse all those emails, YouTube comments and views, G+ posts, likes, Picasa albums and, of course, web searches currently brings in an estimated $5 billion of revenue. Every month. Some of that is from actual things Google makes, but very little. Most is from advertising revenue, brought in by Google’s immense data store and superior analytics. That’s why they collect data and analyse it. They sell the results pixel by pixel as Google Ads to advertisers hungry to target their marketing dollars accurately. It’s clearly a good business.
If you are good at it.
The road to Google’s dominance in the internet advertising space is littered with the corpses of those who tried and failed. They may have been good at one thing or another – search, photos, free email maybe – you can probably think of the names if you are older than 30. But they were not as good at pulling it all together, analysing it, and creating something of compelling value. Google’s big data was bigger data, their analysis was better analysis, their vision both wider and longer range.
The “why” then seems clear: commercial gain. Google can make $5bn a month selling pixels. A kind of meta-business, making money not by making things, but by trading the bits describing the things that others make. Those little groups of bits, arranged as a text or a banner ad, twinkling at you from your screen or phone, are Google’s gold. The more people browse, the more web pages are displayed. Google attempts to have some of its treasured bits, those revenue-attracting Google Ads, on as many of those pages as possible. Their business scales as the internet scales. They don’t even need to buy their data. We offer it to them in blissful collusion – free email! free photo storage! free maps! We provide the data. Google’s analytics make it gold.
You may not think that you too are in the business of selling bits. You might make teapots. Or cars. Or write books. Or help businesses with their technology problems. But some part of your business is selling bits. That teapot maker has a website. A car’s brand image may be won and lost in the opinions and reviews online. Authors may sell their tomes online as eBooks, but the physical books too are promoted via fan sites, online reviews, and word-of-email. The gentleman helping businesses with their technology problems? Well, his email to his clients is both a proposed solution and an advert for future business. It is hard to imagine any commercial activity that does not involve the trading of electronic bits at some level.
Whenever bits are traded, there is a trail and a store. Like energy, bits are never really destroyed. They move somewhere else, are reformed, placed together with other bits. They are retained. They become a source, one that can be analysed.
This is big data. Everyone has it. Everyone contributes to it. It never shrinks.
The winners, commercially speaking, will be those that can use it. We’re probably closer to the Penny Farthing stage of development in big data right now. The successful uses, the best designs, are still being discovered.
It’s not about the bits. The systems, the applications, the wheels and pulleys, the big piles of disks.
It’s about the bits.