Monday, 1 February 2016

How To Frame An Individual Match Outcome.

Here's a simple method for framing your own match odds using historical goal or expected goal data. We'll look at Sunderland's upcoming home game with Manchester City, where City, unsurprisingly, are strongly favoured.

Here's what you need.

1) The average number of goals or expected goals scored by the home and away teams in the competition.

You can take data from this season, last season, or a weighted average of several seasons. The choice is yours, and you can validate your model against out-of-sample games later to see what works best.

2) The average number of goals or expected goals scored and allowed by Man City and Sunderland. Again, the time frame is up to you. I don't differentiate between home and away goals at this stage; that comes later. Why would you want to chuck half your data away or risk overfitting a "home or away specialist"?

Also the team figures haven't been regressed by adding a proportion of league average. We're just looking at the basic process here.

That's it.

Here are some representative figures. Home teams are scoring 0.25 goals per game more than visitors, 1.49 compared to 1.24. The average game has 1.37 expected goals per team. (Basically just the mean of the first two figures).

Sunderland are scoring few (1.09 per game) and allowing lots (1.91). Vice versa for City, who score 1.92 and allow 1.16.

We want to find Sunderland's average expected goals at home against Man C. So these figures are more usefully expressed as rates.

Sunderland score 1.09/1.37 or 0.79 times the rate of scoring in the competition.

Man C allow 1.16/1.37 or 0.85 times the rate of conceding in the competition.

Sunderland are at home and home teams score 1.49/1.37 or 1.09 times the average rate for this competition.

Multiply these three rates together: 0.79*0.85*1.09 = 0.73

Sunderland are likely to score at 0.73 times the league average number of goals at home to City. The league average expected goals for the competition is 1.37 goals.

So in terms of expected goals Sunderland might average 0.73*1.37 = 1.00 expected goals.

Do the same for City.

City score 1.92/1.37 = 1.40 times league average.

Sunderland allow 1.91/1.37 = 1.39 times league average.

Away teams score 1.24/1.37 = 0.91 times league average.

Man C are likely to score 1.40*1.39*0.91 = 1.77 times the league average, or 1.77*1.37 = 2.43 expected goals.
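If you'd rather not do the arithmetic by hand, the steps above can be sketched in a few lines of Python. The function and variable names here are purely illustrative and the inputs are just the figures quoted above, so treat it as a sketch of the process rather than a finished model.

```python
# League-wide averages quoted in the post (expected goals per team per game).
LEAGUE_AVG = 1.37
HOME_AVG = 1.49
AWAY_AVG = 1.24


def expected_goals(team_scored, opp_allowed, venue_avg, league_avg=LEAGUE_AVG):
    """Attack rate x opponent's defence rate x venue rate x league average."""
    attack = team_scored / league_avg    # how the team scores relative to average
    defence = opp_allowed / league_avg   # how the opponent concedes relative to average
    venue = venue_avg / league_avg       # home or away scoring relative to average
    return attack * defence * venue * league_avg


# Sunderland at home: they score 1.09, City allow 1.16.
sunderland_xg = expected_goals(1.09, 1.16, HOME_AVG)
# City away: they score 1.92, Sunderland allow 1.91.
city_xg = expected_goals(1.92, 1.91, AWAY_AVG)

print(round(sunderland_xg, 2), round(city_xg, 2))
```

Run without rounding the intermediate rates, City come out nearer 2.42 than 2.43. The small gap is only down to rounding each rate to two decimal places along the way.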

So Sunderland have an expected goals average of 1.00 goals and Man C has 2.43 expected goals. We're in Poisson territory now and a plain, non-tweaked Poisson gives the following match predictions.

For comparison, the current Oddschecker percentages are 13% Sunderland, 21% the draw and 67% Man C.
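For anyone wanting to turn the two expected goals figures into match percentages, here is a minimal sketch of a plain Poisson calculation. It assumes the home and away goal counts are independent Poisson variables with means of 1.00 and 2.43, and it caps the scoreline grid at 10 goals a side, which is an arbitrary cutoff chosen for this sketch. The function names are illustrative, and the output may not match any particular published figures exactly.

```python
from math import exp, factorial


def poisson_pmf(k, mean):
    """Probability of exactly k goals when the average is `mean`."""
    return mean ** k * exp(-mean) / factorial(k)


def match_probabilities(home_xg, away_xg, max_goals=10):
    """Sum the probability of every scoreline up to max_goals a side."""
    home_win = draw = away_win = 0.0
    for h in range(max_goals + 1):
        for a in range(max_goals + 1):
            # Independence assumption: joint probability is the product.
            p = poisson_pmf(h, home_xg) * poisson_pmf(a, away_xg)
            if h > a:
                home_win += p
            elif h == a:
                draw += p
            else:
                away_win += p
    return home_win, draw, away_win


home, draw, away = match_probabilities(1.00, 2.43)
print(f"Sunderland {home:.1%}, draw {draw:.1%}, Man C {away:.1%}")
```

Each cell of the grid is the chance of one exact scoreline, so the three totals are simply the cells below, on and above the diagonal. A plain Poisson like this is known to be a little light on draws, which is one reason people tweak it.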


  1. Another excellent piece mate. Can I just ask a question? When you say "Average expected goals scored by all home teams" is 1.49 and "Average expected goals scored by all away teams" is 1.24...

    ...are those numbers derived from actual historical results? Or from an expected goals model?

  2. Hi there,

    they're derived from the historical expected goals models from previous seasons.

    cheers, Mark

  3. Thanks for getting back so quickly. This is very interesting - do you think using expected goals is superior to using actual goals? And would your answer differ depending on the sample size of games that you go back through?

  4. Mark, where can I get that expected goals data?

    1. Getting the data ranges from self collection to relying on some kind soul, such as Paul Riley aka @footballfactman posting it online.

      Here's his current link


  5. Did you use the xG to calculate the Poisson distribution?

  6. Hi Eric,

    you can use a smoothed or weighted average over a prolonged period of matches to get expected goals figures that are representative of the side's likely current abilities.

    cheers, Mark

  7. Hi Mark
    How do you calculate those % outcomes based on those xG figures?

  8. Hi Mark

    Just reading through your site, absolutely gripped. Is there a source of data on the web that can display form tables for 30+ games across seasons or do you draw it from your own database?


  9. Hi Paul,
    the best raw data can be found at Joe's excellent site Historical results in csv format for easy pasting into a spreadsheet. You can then set up a rolling average (just remember teams play home and away :-)).

    The xG rolling averages come from my work database, glad you like the blog.