Daniel Johnson makes remarkably accurate Olympic medal predictions. But he doesn’t look at individual athletes or their events. The Colorado College economics professor considers just a handful of economic variables to come up with his prognostications.
The result: Over the past five Olympics, from the 2000 Summer Games in Sydney through the 2008 Summer Games in Beijing, Johnson’s model demonstrated 94% accuracy between predicted and actual national medal counts. For gold medal wins, the correlation is 87%.
His forecast model predicts a country’s Olympic performance using per-capita income (the economic output per person), the nation’s population, its political structure, its climate and the home-field advantage for hosting the Games or living nearby. “It’s just pure economics,” Johnson says. “I know nothing about the athletes. And even if I did, I didn’t include it.”
– Forbes, see also Daniel Johnson’s own website
Which countries tend to do best in the Winter Olympics? The ones with large populations, cold winters, and wealth. Nothing surprising there.
And yet, the strength of these connections and the accuracy of Johnson’s predictions are impressive. It is perhaps surprising that this accuracy is achieved without any data on the athletes.
This hints at a revolution still in its infancy, and one with great promise: Uncover surprising, far from intuitive and yet important connections, using statistics, vast amounts of data, and modest computer power.
The recipe is simple: Take data on a large variety of apparently unrelated factors. Run a statistical analysis. Find significant yet surprising connections. And then look at the possible mechanisms behind these connections.
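As a rough sketch of this recipe, and not Johnson’s actual model, here is one way to do the "run a statistical analysis" step in Python: compute the correlation between every pair of factors and list the strongest ones first. The function names and the toy numbers are made up for illustration, and a real analysis would also need a significance test and far more data.

```python
from itertools import combinations
from math import sqrt


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / sqrt(var_x * var_y)


def rank_connections(table):
    """Correlate every pair of columns and sort by strength, strongest first."""
    pairs = [
        (a, b, pearson(table[a], table[b]))
        for a, b in combinations(table, 2)
    ]
    return sorted(pairs, key=lambda p: abs(p[2]), reverse=True)


# Made-up toy data, for illustration only (not real Olympic figures).
toy = {
    "income": [10, 20, 30, 40, 50],
    "winter_cold": [5, 1, 4, 2, 3],
    "medals": [2, 4, 7, 8, 11],
}

for a, b, r in rank_connections(toy):
    print(f"{a} vs {b}: r = {r:+.2f}")
```

The last step of the recipe, looking for the mechanism behind a strong connection, is not something the code can do. A high correlation only tells you where to look.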
Several years ago, I found a website using this approach. They collected a large data set from visitors answering questions on a wide range of topics, sifted through this data to find surprising correlations, and used these to design new questionnaires. They were then able to predict certain things about their new visitors with great accuracy, based on their answers to just a few questions apparently unrelated to the prediction. I remember being hugely impressed that they figured out my gender with 98% or so certainty, based only on my answers to a relatively brief set of questions seemingly completely unrelated to gender. It was even more impressive since I don’t see myself as stereotypically male. I unfortunately lost track of this site, and haven’t been successful googling it.
When it comes to the upcoming 2010 Winter Olympics, Johnson predicts that Norway comes in third after the US and Canada, with four gold and 26 medals overall. We’ll see in two weeks!
Update: It is now the night of the last day of the Olympics, and Johnson’s predictions were very accurate. Norway came in fourth after Canada, Germany, and the US. They got 9 gold medals, and 23 medals total.
Update 2: Here are the statistics on how many medals different countries received in relation to their population. This is an example of the cultural and resource side of the equation.
1. Norway 4.73 medals per million inhabitants
2. Austria 1.91
3. Sweden 1.17
4. Switzerland 1.15
5. Slovenia 0.97
6. Finland 0.93
7. Latvia 0.89
8. Canada 0.76
9. Estonia 0.74
10. Croatia 0.67
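For anyone who wants to check per-capita figures like the ones above, the arithmetic is just medals divided by population in millions. Here is a minimal sketch. The medal count for Norway is the 23 mentioned in the update, while the population of roughly 4.86 million is my own approximate assumption for 2010 and not a number taken from the list’s source.

```python
def medals_per_million(medals, population):
    """Medals won per million inhabitants."""
    return medals / (population / 1_000_000)


# 23 medals is from the update above; the population is an approximate assumption.
norway = medals_per_million(23, 4_860_000)
print(f"Norway: {norway:.2f} medals per million inhabitants")
```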