Avoiding rank journalism (1): Good start for Vitesse no surprise based on the fixtures

Now that the weekend’s Eredivisie fixtures have been completed, it is worth returning to the expected points table based on the bookmakers’ assessments to see whether there are any clubs over- or underachieving at this stage.

As explained in this earlier post, I am using a simple expected points model, based on the bookmakers’ odds for each Eredivisie fixture, to assess whether there are any surprises in the current Eredivisie ranking. Below is the ranking from this model with the current Eredivisie table next to it:

The highlights after four matches are as follows:

1. Vitesse should have been expected to currently feature in the top four given their fixtures to date and thus their second place is no surprise and says very little now about whether they will challenge for a Champions League place. The bookmakers regarded Vitesse as a top six side at the beginning of the season and, for the moment, there is no reason to consider them to be better than this. The top four in the expected model at this stage are identical to the top four in the current table but in a different order.

2. Heerenveen, given their tougher start to the season, are doing fine in 13th place. The expected model had them in 16th at this stage, albeit with around an extra point from their first four fixtures.

3. Only two clubs are more than three places removed from where they were expected to be using the odds model. NEC are six places higher than expected in seventh position and Heracles are four places lower in 14th.

4. Zwolle, NAC, Willem II and VVV were all predicted to be in the bottom five at this stage given their initial four fixtures and all four are indeed ranked from 14th to 18th.
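
The post does not spell out how the expected points are calculated, so the following is only a minimal sketch of one common way to do it. It assumes decimal (European) odds, removes the bookmaker's margin by rescaling the implied probabilities so they sum to one, and awards 3 points for a win and 1 for a draw. The function names and the odds in the usage line are hypothetical, not taken from the model used here.

```python
def implied_probabilities(home_odds, draw_odds, away_odds):
    """Convert decimal odds into probabilities that sum to 1.

    Assumption: the bookmaker's margin is removed by simple rescaling.
    """
    raw = [1.0 / home_odds, 1.0 / draw_odds, 1.0 / away_odds]
    total = sum(raw)
    return tuple(p / total for p in raw)


def expected_points(home_odds, draw_odds, away_odds):
    """Return (home, away) expected points for one fixture: 3 per win, 1 per draw."""
    p_home, p_draw, p_away = implied_probabilities(home_odds, draw_odds, away_odds)
    return 3 * p_home + p_draw, 3 * p_away + p_draw


# Hypothetical odds for a single fixture, purely for illustration.
home_xp, away_xp = expected_points(2.0, 4.0, 4.0)
print(f"home: {home_xp:.2f}, away: {away_xp:.2f}")
```

Summing these per-fixture values over a club's matches to date would give that club's expected points total, which can then be sorted into an expected ranking.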


7 Responses to Avoiding rank journalism (1): Good start for Vitesse no surprise based on the fixtures

  1. Ad Poppink says:

    This reasoning is totally flawed, because conclusions are based on comparing expected points with observed ranking. Expected points should be compared with the observed points.
    For example:
    - Vitesse has 48% more points than expected, which is significantly more and can be qualified as overachieving.
    - Heerenveen, you say, is doing just fine because they are 3 places above their expected ranking, which is derived from the expected points (and is therefore less accurate). In fact they have 21% fewer points than expected.

  2. Not quite fair I feel. I have used the expected points to generate an expected ranking. At this stage of the season, comparing points in the way you have will not make sense due to the influence of a single match over such a small sample size. I don’t see an obvious improvement to this in generating an expected ranking but I am certainly open to ideas if you have any.

    We can only really assess whether Vitesse are overachieving once we have more matches. Same goes for Heerenveen of course. I am only referring to their rankings when I do this and not their points – observed or expected – which I have not referred to in the text of the article. I have only included them in the tables for completeness.

  3. Ad Poppink says:

    It’s totally fair. You base your conclusions (i.e. highlights) on derived data instead of the actual data, and derived data is in any case less accurate than the actual data. Doing this would only make sense if you couldn’t obtain the actual data, which is not the case, so there is absolutely no reason for doing so.
    Your argument about a small sample size applies to all the data: the actual points and the ranking are equally affected by the small sample size. So small sample size cannot be an argument in favor of either of them.

  4. Ad Poppink says:

    Let me elaborate with an example of how derived data are less accurate than the actual data.

    Your highlight:
    “Heerenveen…….are doing fine in 13th place. The expected model had them in 16th”.

    If the following games had turned out differently (Heracles-VVV not 1-1 but 2-1, Zwolle-Vitesse not 0-1 but 1-0 and Willem II-NAC not 1-1 but 2-1), then Heerenveen would have been 16th in the ranking and you could not have made the assessment that Heerenveen is doing just fine based on the fact that they are currently 13th. That’s odd, because in both cases Heerenveen performed exactly the same vs NEC, Ajax, Feyenoord and AZ.

    You are improving the assessment of how a team is doing by using bookmakers’ odds to take into account the strength of the opponents met so far. Then you let your assessment of how a team is doing depend on games that team wasn’t in and has no influence over at all.

    The results of Heracles-VVV, Zwolle-Vitesse and Willem II-NAC should not influence the assessment of how Heerenveen has done vs. NEC, Ajax, Feyenoord and AZ.

    If you do not let other games have an influence and use only the actual data, then having 3 points instead of 3.81 cannot be qualified as doing just fine. The proper assessment would be: “Heerenveen is doing slightly less well than expected when the opponents’ strength is taken into account.”

    Using derived data (i.e. expected/observed rank) introduces a random factor (results of games the team wasn’t in) into the assessment of how a team is doing and is therefore less accurate than using only the actual data (expected/observed points).

  5. Thanks for your comments Ad. I think we both agree that this sample size is too small at this early stage of the season. My point was to try to assess whether Vitesse and Heerenveen in particular were surprisingly high or low. If you have a better idea of how to do this, I would be very open to it.

  6. Ad Poppink says:

    The point of the article was avoiding rank journalism (i.e. qualifying teams as over- or underachievers based on ranking). You did this by taking into account the strength of the opponents met. Then you got off track by basing your qualifications on comparing the expected rank with the observed rank, which again is rank journalism, the thing you wanted to avoid in the first place.

    The solution is rather simple I think: just compare the expected points and the observed points to determine if a team is over- or underachieving. Vitesse has 48% more observed points than expected, Heerenveen 21% fewer, Twente 41% more, and PSV is doing as expected with a 0% difference in points. Calculate this for all teams and you can make a ranking based on these percentages. After that you only have to decide at what percentages a team can be qualified as an overachiever, a slight underachiever, doing as expected and so on. This is of course all arbitrary; one can choose whichever qualifications one likes and apply them at whatever percentages one sees fit.

    This way you can also give more detailed explanations of why a team ended where it did after the season is over. For example (I am making these points up of course), let’s say that according to the bookmakers it’s going to end this way after 34 games:
    1. PSV 79 points
    2. Ajax 78 points
    3. Twente 76 points
    At the end of the season things end in reality like this:
    1. Ajax 82 points
    2. Twente 81 points
    3. PSV 77 points

    Qualifying based on rank, Ajax is an overachiever (1st instead of 2nd), Twente is an overachiever (2nd not 3rd) and PSV is an underachiever (3rd not 1st). Qualifying based on comparing expected points with observed points, they all get the same qualification in this case.
    The explanation of why PSV didn’t win the league is the same in both cases: PSV is an underachiever, and Ajax and Twente are overachievers.

    But now let’s change the observed points a bit.
    1. PSV 79 points
    2. Ajax 78 points
    3. Twente 76 points.
    At the end of the season things end in reality like this:
    1. Ajax 82 points,
    2. Twente 81 points
    3. PSV 80 points.

    Based on rank you get the same qualification for the teams and the same explanation of why PSV didn’t win the league. But in this case, if you qualify based on expected/observed points, you have 3 overachievers. Then the explanation would be a different one: PSV did overachieve slightly, but Ajax and Twente were bigger overachievers, and that’s why PSV still didn’t win the league despite overachieving.

    So you will get more differentiated explanations using expected/observed points. Based on rank there is only one qualification possible for a team that ended up lower than expected (underachiever). Based on expected/observed points there are still three qualifications possible (underachiever, overachiever or as expected). And the final explanation of why a team ended in a certain position will also be based on the qualifications of the other teams.

    Sample size is a thing I wouldn’t worry about, because that is a completely different matter. Sample size is only important if you want to say something about the likelihood that a team overachieving now will still be an overachiever at the end of the season. But that’s not the point of the article; the point of the article is qualifying teams as over- or underachieving without basing it on rank.
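
The points-based comparison proposed in this comment can be sketched in a few lines. This is only an illustration: the function names are hypothetical, the 1% tolerance band is an arbitrary assumption (as the comment notes, any thresholds are a matter of choice), and the figures are the made-up end-of-season points from the second example above.

```python
def pct_diff(observed, expected):
    """Percentage by which observed points differ from expected points."""
    return 100.0 * (observed - expected) / expected


def qualify(observed, expected, tolerance_pct=1.0):
    """Label a club; tolerance_pct is an arbitrary, adjustable threshold."""
    diff = pct_diff(observed, expected)
    if diff > tolerance_pct:
        return "overachiever"
    if diff < -tolerance_pct:
        return "underachiever"
    return "as expected"


# Made-up points from the commenter's second example.
expected = {"PSV": 79, "Ajax": 78, "Twente": 76}
observed = {"PSV": 80, "Ajax": 82, "Twente": 81}

# Rank clubs by how far they beat (or fell short of) their expected points.
ranking = sorted(
    expected,
    key=lambda club: pct_diff(observed[club], expected[club]),
    reverse=True,
)

for club in ranking:
    diff = pct_diff(observed[club], expected[club])
    print(f"{club}: {diff:+.1f}% ({qualify(observed[club], expected[club])})")
```

The same function applied to Heerenveen's 3 observed points against 3.81 expected gives roughly the 21% shortfall quoted earlier in the thread.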


  7. I will take these excellent comments on board for the next piece I write on this. Thanks very much
