This is a guest post by Chad Murphy (@soccermetric)
Most readers of this blog probably already know this, but Simon Gleave (@SimonGleave) and Steve Lawrence (@SteveLawrence_) have been running a competition comparing the accuracy of pre-season EPL predictions from a variety of people. The competition includes over 90 fans, prominent members of the media, and statistical modellers, and the results have been interesting. I wanted to look at one aspect in particular: what types of predictions have been the most successful? Put another way, do statistical models beat the intuition of experts?
Philip Tetlock, a Psychology Professor at the University of Pennsylvania, made news with his book Expert Political Judgment: How Good Is It? How Can We Know? In his work he tracked more than 80,000 predictions over 20 years, and found that when experts predicted various world events, they barely outperformed undergraduates and were only slightly better than random chance. The idea that the “experts” are barely able to beat smart undergraduates was fairly surprising, so I wanted to use Simon’s competition data to re-test Tetlock’s theory with the Premier League. And what better way to do it than to see if members of the media are truly experts?
Below I present a violin plot comparing the three groups. For each entrant it shows the correlation between their pre-season prediction and the table after Week 24, so you can see the overall distribution for each group, where the individual predictions fall, and where they tend to cluster. The vertical line shows the correlation between this year’s results and the simplest possible model: everything is exactly the same as last season. Even in a season as difficult to predict as this one, the goal should be to improve on a simple cut and paste of last year’s table.
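To make the scoring concrete, here is a minimal sketch of how one prediction could be compared with the "same as last season" baseline. The post does not say which correlation coefficient was used, so Spearman's rank correlation is an assumption here, and the six-team league, team letters, and entrant names are all made up for illustration.

```python
def spearman_rho(predicted, actual):
    """Spearman rank correlation between two orderings of the same teams.

    Both arguments are lists of team names, best to worst, with no ties.
    """
    if len(set(predicted)) != len(predicted) or set(predicted) != set(actual):
        raise ValueError("both tables must rank the same teams exactly once")
    n = len(predicted)
    actual_pos = {team: pos for pos, team in enumerate(actual)}
    # Sum of squared differences between predicted and actual positions.
    d2 = sum((pos - actual_pos[team]) ** 2 for pos, team in enumerate(predicted))
    return 1 - 6 * d2 / (n * (n ** 2 - 1))


# Hypothetical data: a six-team league, not the real competition entries.
current_table = ["A", "B", "C", "D", "E", "F"]
last_season = ["B", "A", "C", "E", "D", "F"]
predictions = {
    "model_1": ["A", "B", "C", "D", "F", "E"],
    "fan_1": ["C", "B", "A", "F", "E", "D"],
}

# Baseline: predict that this season finishes exactly like the last one.
baseline = spearman_rho(last_season, current_table)
scores = {name: spearman_rho(p, current_table) for name, p in predictions.items()}
beat_baseline = sorted(name for name, rho in scores.items() if rho > baseline)

print(f"baseline: {baseline:.3f}")
for name, rho in scores.items():
    print(f"{name}: {rho:.3f}")
print("beat the baseline:", beat_baseline)
```

Any entrant whose score lands to the right of the baseline value is an improvement on copying last year's table, which is what the vertical line in the plot marks.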
It turns out that the experts are well behind the statistical modellers, and are statistically tied with fans[1. An unbalanced t-test between fans and media shows the two groups are statistically indistinguishable, while modellers score significantly higher than both groups at p < 0.05]. This is despite the fact that the very worst individual predictions all come from statistical models.
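For readers curious about the footnoted comparison, the sketch below shows one way to compare two groups of unequal size. It assumes "unbalanced t-test" means something like Welch's t-test, which allows unequal group sizes and variances; the exact test used for the post is not stated. The correlation values are invented for illustration and are not the competition data.

```python
from statistics import mean, variance


def welch_t(sample_a, sample_b):
    """Return Welch's t statistic and its approximate degrees of freedom."""
    na, nb = len(sample_a), len(sample_b)
    # Squared standard error of each group mean (sample variance / n).
    se2_a = variance(sample_a) / na
    se2_b = variance(sample_b) / nb
    t = (mean(sample_a) - mean(sample_b)) / (se2_a + se2_b) ** 0.5
    # Welch-Satterthwaite approximation for the degrees of freedom.
    df = (se2_a + se2_b) ** 2 / (se2_a ** 2 / (na - 1) + se2_b ** 2 / (nb - 1))
    return t, df


# Hypothetical per-entrant correlations for two groups of different sizes.
modellers = [0.80, 0.75, 0.85, 0.70, 0.90]
media = [0.60, 0.65, 0.55]

t_stat, dof = welch_t(modellers, media)
print(f"t = {t_stat:.2f}, df = {dof:.2f}")
```

The resulting statistic would then be compared against a t distribution with the computed degrees of freedom to get a p-value, for example with a statistics package.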
Additionally, the plot shows that only 27 people made predictions that are currently an improvement on “everything is the same as last year.” Of those 27 predictions, 24 came from statistical models, 2 from fans, and only 1 from a member of the media. The moral of the story is this: in an unpredictable season, if you want to predict the future, rely on analytics rather than intuition.