Prediction Analysis by Steve Lawrence

With 8 rounds to go in the Premiership and 6 rounds to go in the Eredivisie, this is my penultimate blog for the Scoreboard Journalism Challenge.

Points and place predictions were invited from prominent media, stats modellers, fans and online publishers and the following blog entries have recorded the season’s progress so far:

The Premier League is now at Round 30 and the Eredivisie is at Round 28, so it’s an interesting moment to compare the two leagues.

In earlier posts I’ve argued the case for plotting the coefficient of variation (CV) against the standard deviation (SD) as a way of visualising the shape of each forecaster’s prediction. The scatterplot below shows how the various predictions across both competitions shape up on this basis. The two distinct groupings are simply due to the differing numbers of teams in the two leagues.
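As a rough illustration of the two quantities being plotted, here is a minimal Python sketch. The CV is simply the SD divided by the mean. The four-team points list is hypothetical, and the use of the population SD rather than the sample SD is an assumption, since the posts don't say which was used.

```python
from statistics import mean, pstdev

def shape_of_forecast(points):
    """Return (SD, CV) for one forecaster's predicted points totals."""
    sd = pstdev(points)        # population SD (an assumption, see lead-in)
    cv = sd / mean(points)     # coefficient of variation = SD / mean
    return sd, cv

# Hypothetical four-team forecast, for illustration only
example_forecast = [80, 60, 50, 30]
sd, cv = shape_of_forecast(example_forecast)
print(f"SD = {sd:.2f}, CV = {cv:.3f}")
```

Each forecaster then contributes one (SD, CV) point to the scatterplot.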



If we look in a bit more detail at the Premier League after Round 30 (except for Chelsea & Leicester, on 29 matches), we see that the CV remains at around 0.35, which is similar to last season and has been fairly stable since Round 12. It also implies an end-of-season SD of about 18, which is at the top end of stats model expectations.



In the Eredivisie the story is different: the CV is quite unlike last season’s, but at 0.35 it is close to where it was at Round 12. In this case an end-of-season SD of about 16.5 looks likely, which is similarly at the top end of the stats model expectations.
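One way to see how a CV of 0.35 translates into an end-of-season SD is to multiply it by the expected mean points total per team. The sketch below does that under a stated assumption: a draw rate of 25%, which is a hypothetical round figure and not taken from either league's data. The function name and parameters are likewise illustrative.

```python
def implied_final_sd(cv, matches_per_team, draw_rate=0.25):
    """Estimate the end-of-season SD implied by a given CV.

    draw_rate is an assumed share of drawn matches (hypothetical default).
    """
    # A decisive match shares 3 points between the two teams, a draw shares 2,
    # so the average points per team per match is half of the blended total.
    mean_points_per_match = (3 * (1 - draw_rate) + 2 * draw_rate) / 2
    final_mean = matches_per_team * mean_points_per_match
    return cv * final_mean     # SD = CV x mean

print(round(implied_final_sd(0.35, 38), 1))  # Premier League: 38 matches per team
print(round(implied_final_sd(0.35, 34), 1))  # Eredivisie: 34 matches per team
```

Under that assumed draw rate the results come out at roughly 18.3 and 16.4, in line with the figures of about 18 and 16.5 quoted above.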



In both cases the stats models have undershot the out-turn CV.

Whilst the CV v SD charts give an idea of the shape of the points distributions, they don’t capture the all-important order of the respective tables.

To do that I’ve calculated Pearson’s r for each of the forecasts. Plotting r against the coefficient of variation captures, I feel, both the table order and the shape of the points distribution.
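For readers who want to reproduce the calculation, here is a minimal sketch of Pearson's r between one forecast and the actual table. It assumes r is computed on points totals, team by team in the same order; computing it on league positions instead would be an equally plausible reading. Both points lists are hypothetical.

```python
from math import sqrt

def pearson_r(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / sqrt(var_x * var_y)

# Hypothetical data: the same four teams, listed in the same order
forecast_points = [80, 60, 50, 30]
actual_points = [70, 65, 40, 35]
r = pearson_r(forecast_points, actual_points)
print(f"r = {r:.3f}")
```

Pairing each forecast's r with its CV gives one point per forecaster on the charts that follow.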

On this basis the Premier League looks like this:



and the Eredivisie looks like this:



The progress of the actual league table on the CV v Pearson’s r chart is shown by the yellow data points, where r is always 1. Any forecaster who achieves both r = 1 and the actual CV will have predicted the table exactly as it is.
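The reasoning behind that last statement can be sketched as follows. The symbols are introduced here for illustration: f_i is a forecaster's points for team i, p_i is the actual points, and μ and σ are the mean and SD of the actual points (with σ > 0).

```latex
r = 1 \;\Longrightarrow\; f_i = a\,p_i + b \quad \text{for some } a > 0,
\qquad
\mathrm{CV}_f = \frac{a\,\sigma}{a\,\mu + b}.

\mathrm{CV}_f = \frac{\sigma}{\mu}
\;\Longrightarrow\; a\,\mu = a\,\mu + b
\;\Longrightarrow\; b = 0
\;\Longrightarrow\; f_i = a\,p_i .
```

So a forecast with r = 1 and the actual CV matches the table up to a common scale factor a. On this reading, "exactly" means exact in order and in shape, with a reflecting the matches still to be played when an end-of-season forecast is compared with a part-season table.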

On this basis @Etnar_UK, @stevelawrence_ & @MC_of_A are the front-runners for the Premier League and Emile Schelvis, @stevelawrence_ & @roberttempleman are the front-runners for the Eredivisie.

For an alternative assessment see @jamesgrayson, who provides the definitive ranking, which is regularly posted.

It is very interesting to compare the respective bookmakers’ forecasts. The Premier League bookmaker forecast is looking good, with only 9 other forecasts doing better. Is that because the league is predictable or is it perhaps because, being the league with the most media attention, it is also the most analysed?

The bookmaker forecast for the Eredivisie, on the other hand, is not so accurate, with 21 predictions ahead of it. Is that because the Eredivisie is less predictable or perhaps because it’s less closely analysed?

The distribution of the various forecasts across the two scatterplots seems to be fairly similar, with a similar spectrum of CV and Pearson’s r, and that would seem to indicate that the two leagues are as predictable as each other. In terms of beating the bank, though, the Eredivisie would seem to be the better bet.
