Sunday, 3 July 2016

The Biggest Expected Goals Shocks of 2015/16

Expected goals models have slowly gained acceptance into the mainstream of football analytics.

Whether they are entirely attempt-based, predicting the likely outcome of an attempt by comparing its characteristics with historical precedent, or non-shot-based, the aim is the same.

That aim is to examine the process of goal scoring in a probabilistic way, to see which teams possess solid fundamental skills that should bring success in the long term, even if they may not always reap their just rewards in the short term.

To mimic actual scorelines, expected goals match summaries are often presented as a cumulative total of the individual expected goals accrued by each team through their efforts in the match.

For example, a side may actually win the game by 2-1, while posting equivalent expected goals totals of say 1.78-1.05.

Intuitively, the actual score feels fair and proper. The first-named team outscored their opponent in terms of both actual and expected goals, and the respective totals are relatively similar.

There are some well-documented pitfalls in using cumulative expected goals, notably how a side's expected goals are distributed over their attempts, particularly in terms of so-called big chances.

Simulating each chance is the most obvious way to reacquaint expected goals conclusions with the granular nature of the original data.

In comparing expected goals conclusions to actual scorelines, we should try to separate those optimistic sides who hope for an occasional goal bonanza by trying their luck often and from distance from those who continually strive to create fewer, gilt-edged chances.

On opening weekend, Arsenal created 21 opportunities, cumulatively amassing around 1.7 expected goals compared to 8 visiting West Ham attempts totalling barely half an expected goal.

West Ham still won 2-0.

Simulating each individual attempt in the game results in Arsenal "winning" around 70% of the time and West Ham just 7%, with 23% of iterations drawn. That is an 82% success rate for the Gunners, where draws are counted as half a win.
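The attempt-by-attempt simulation can be sketched in a few lines of Python. The individual chance values for this match are not listed here, so the sketch assumes, purely for illustration, that each side's total is spread evenly over its attempts (21 for Arsenal, 8 for West Ham). The function names are hypothetical, and an even split will not reproduce the 70/23/7 figures exactly, because the real attempts varied in quality.

```python
import random

def simulate_match(home_xg, away_xg, iterations=10_000, seed=0):
    """Replay a match many times, treating each attempt as an
    independent chance to score with probability equal to its xG."""
    rng = random.Random(seed)
    home_wins = draws = away_wins = 0
    for _ in range(iterations):
        home_goals = sum(rng.random() < p for p in home_xg)
        away_goals = sum(rng.random() < p for p in away_xg)
        if home_goals > away_goals:
            home_wins += 1
        elif home_goals == away_goals:
            draws += 1
        else:
            away_wins += 1
    return home_wins / iterations, draws / iterations, away_wins / iterations

def success_rate(win, draw):
    """Success rate with a draw counted as half a win."""
    return win + 0.5 * draw

# ASSUMPTION: even split of each side's total xG over its attempts.
arsenal_shots = [1.7 / 21] * 21
west_ham_shots = [0.4 / 8] * 8

win, draw, loss = simulate_match(arsenal_shots, west_ham_shots)
print(f"Arsenal win {win:.0%}, draw {draw:.0%}, West Ham win {loss:.0%}")
print(f"Arsenal success rate {success_rate(win, draw):.0%}")
```

Plugging the quoted figures straight into success_rate(0.70, 0.23) gives 0.815, which rounds to the 82% above.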

On the day, West Ham's success rate was a perfect 100%. But even when using actual scorelines there may exist different levels of dominance.

The record margin of victory in a Premier League game was Manchester United's 9-0 thrashing of Ipswich Town in 1995. Norwich and Liverpool in 2015/16 also played out a nine-goal game, where Liverpool edged the match 5-4.

Comparing these two actual wins, with very different margins of victory, it is perhaps intuitive to think that Manchester United could only be credited with a 100% success rate, whereas Liverpool's single goal win in a goal feast is perhaps less worthy of a perfect score.

Actual goals scored and allowed can be converted into a more probabilistic final reckoning in a variety of ways and those leaving Carrow Road in late January may not have quibbled had it been suggested that Liverpool might have only deserved a success rate marginally above 50% based on the "anyone could have won" 5-4 final score.

West Ham's 2-0 win at the Emirates should perhaps lie between Liverpool's fortunate 5-4 win and United's record-breaking 9-0.

A success rate of circa 90%, in a sport that is usually low scoring, might seem a reasonable estimate for the Hammers' actual scoring and conceding achievement in overcoming Arsenal 2-0.
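As one illustration of how a scoreline might be turned into a success rate, the sketch below treats each side's actual goal tally as the mean of an independent Poisson distribution and asks how often the winner would come out on top, with draws again counted as half a win. This is only one of the many possible conversions and is an assumption of the sketch, not necessarily the method behind the figures quoted in this post; the function names are hypothetical.

```python
from math import exp, factorial

def poisson_pmf(k, mean):
    """Probability of exactly k goals when goals follow a Poisson(mean)."""
    return exp(-mean) * float(mean) ** k / factorial(k)

def scoreline_success_rate(goals_for, goals_against, max_goals=40):
    """ASSUMPTION: actual goals are used as Poisson means for a replay
    of the match. Returns P(win) + 0.5 * P(draw)."""
    win = draw = 0.0
    for a in range(max_goals + 1):
        p_a = poisson_pmf(a, goals_for)
        for b in range(max_goals + 1):
            p = p_a * poisson_pmf(b, goals_against)
            if a > b:
                win += p
            elif a == b:
                draw += p
    return win + 0.5 * draw

for score in [(9, 0), (2, 0), (5, 4)]:
    print(score, round(scoreline_success_rate(*score), 3))
```

Under this assumption the 9-0 is all but a certainty, the 2-0 comes out a little over 90%, and the 5-4 sits much closer to a coin toss, which matches the ordering argued for above.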

We now have ways to express actual and expected scores in the same currency of probabilistic success rates, so we can compare the two figures for a single match to see where the divergence is greatest.

And that occurred at the Emirates when Arsenal (1.7 expected goals) lost 2-0 to West Ham (0.4 expected goals).

The season's second biggest disconnect between scoreline and expected goals occurred on the final scheduled day of the season when Stoke (0.4 expected goals) beat West Ham (3.1 expected goals) by 2 actual goals to 1.
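For the Arsenal game, the size of the disconnect can be worked through with the figures already quoted. The sketch assumes the gap is measured as the absolute difference between the two success rates, and that a side's actual success rate is one minus its opponent's; both are assumptions of this sketch, and the function name is hypothetical.

```python
def divergence(expected_success, actual_success):
    """ASSUMPTION: the disconnect is the absolute gap between
    the expected and actual success rates for the same side."""
    return abs(expected_success - actual_success)

# Arsenal's expected success rate from the simulation figures:
# 70% wins plus half of the 23% draws.
arsenal_expected = 0.70 + 0.5 * 0.23

# West Ham's actual success rate was put at circa 90%,
# which leaves Arsenal with roughly 10%.
arsenal_actual = 1 - 0.90

gap = divergence(arsenal_expected, arsenal_actual)
print(f"Arsenal v West Ham disconnect: {gap:.1%}")
```

Repeating the calculation for every fixture and sorting by the gap would give a ranking of the season's biggest shocks.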

What goes around...


  1. I see you're quoting xG figures for matches from round 1 and Round 38 of last season's fixtures. I'll take a wild guess this means you have numbers for all 380 games?

    Have you made your xGs for the 2015-16 season public? I would love to see them!

  2. Hi Albert,
    hopefully they may see the light of day in the near future.

    cheers, Mark