Wednesday 12 February 2014

Twelve Shots Good, Two Shots Better.

The proliferation of shot-based models has led to some excellent progress towards quantifying the primary talents of both strikers and goalkeepers. Models that use shot type and shot location to determine the likelihood of success have been refined by the inclusion of important variables such as shot placement, as well as the pace and power of the attempt. Small but steady progress is also being made towards adding defensive pressure to the mix, along with keeper positioning and readiness.

Shot attempts and saves, when measured against a robust expected baseline, can increasingly be used to identify striking and keeping talent, but the use of shot models to highlight potentially fortunate team wins or unlucky losses may be more problematic. The fluid nature of the game means that a shot scored will alter the path a match takes compared to that same shot being saved. Not only do subsequent events take different courses in the two mutually exclusive scenarios, but the game state, a product of the score, the relative abilities of each side and the time remaining, will also meet a fork in the road, depending upon the success or otherwise of a single attempt.

Identifying problems is often halfway to a solution. For example, shots originating instantly from rebounds are the easiest to deal with. By allowing a side to score or concede just once from such intimately connected goal attempts, we can mitigate the problem of a second chance being dependent upon the first chance being missed.
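One way to sketch that idea is to treat a rebound sequence as a single chance whose value is the probability that at least one attempt goes in. This assumes each attempt in the sequence can be given its own scoring probability and that a later attempt only happens if the earlier ones fail. The function name and the example probabilities are purely illustrative and are not taken from any particular model.

```python
def combined_chance(shot_probs):
    """Probability of scoring at least once from a linked sequence of attempts.

    Assumption: a follow-up attempt only exists if every earlier one failed,
    so the sequence can yield at most one goal.
    """
    miss_all = 1.0
    for p in shot_probs:
        miss_all *= 1.0 - p  # chance that this attempt also fails
    return 1.0 - miss_all


# Hypothetical example: a 30% shot followed by a 50% rebound.
sequence = [0.3, 0.5]
print(f"Naive sum of attempts: {sum(sequence):.2f}")
print(f"Capped at one goal:    {combined_chance(sequence):.2f}")
```

Simply adding the two attempts would credit the side with 0.80 expected goals, whereas capping the sequence at one goal gives 0.65.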

However, even if we accept that goal attempts may quickly merge into the mundane ball recycling of the middle third of the pitch, before the desire is sufficient for another unrelated opportunity to be created, we may still be underplaying other factors, such as chance quality and quantity.

Late on Sunday afternoon, Fulham were comprehensively out-shot and out-crossed, but not outscored, on a visit to Old Trafford. United had 31 shots, nine on target, for their two goals, compared to Fulham's six attempts, three on target, in claiming a share of the spoils. Even from shot location alone, it was apparent that Fulham scored from two exceptional chances.

Indeed, Opta define their "Big Chances" as either a one-on-one situation with the keeper or an attempt from very close range, and the opportunities that fell to Sidwell and Bent probably met both of those requirements. To further quote Opta, both players could have "reasonably expected to score".

For all of United's gradual accumulation of a large number of low-grade chances and the goal expectation that went with them, Fulham created two high-quality "one-on-one" chances from relatively close range.

Therefore, does a straight comparison of cumulative goal expectation give a realistic idea of what might or should have been, or is the presence of "big chances" within the sample capable of skewing potential outcomes in a way that may not be apparent when simply looking at cumulative goal expectation?

To try to test the power of big chances, I simulated a match where two sides each accumulated a post-game expectation of 1.2 goals based on the location of their chances. None of the attempts resulted from rebounds, so we are looking at a simplified "shooting competition", with equal goal expectation.

Team A created just two scoring chances, but each was of very high quality. They were both big chances and I've nominally assigned a 60% chance of scoring to each attempt. Team B, by contrast, created 12 chances, but each of relatively poor quality. To maintain a goal expectation of 1.2 goals for each side, I gave each of these shots a 10% likelihood of success. So it is an artificial situation, but hopefully a test of the effect of the same goal expectation being spread across very different numbers of shots.
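The set-up described above can be sketched in a few lines of Python. This is a minimal illustration, not the code behind the figures in this post: it assumes every shot is an independent event with a fixed scoring probability, and the function names and the random seed are hypothetical choices.

```python
import random


def simulate_goals(shot_probs, rng):
    """Goals scored in one match when each shot succeeds independently."""
    return sum(rng.random() < p for p in shot_probs)


def simulate_matches(shots_a, shots_b, n_matches=20_000, seed=0):
    """Return the share of matches won by A, drawn, and won by B."""
    rng = random.Random(seed)  # seeded so that a run can be repeated
    wins_a = draws = wins_b = 0
    for _ in range(n_matches):
        goals_a = simulate_goals(shots_a, rng)
        goals_b = simulate_goals(shots_b, rng)
        if goals_a > goals_b:
            wins_a += 1
        elif goals_a < goals_b:
            wins_b += 1
        else:
            draws += 1
    return wins_a / n_matches, draws / n_matches, wins_b / n_matches


team_a = [0.6] * 2   # two big chances, 1.2 expected goals
team_b = [0.1] * 12  # twelve poorer chances, 1.2 expected goals

win_a, draw, win_b = simulate_matches(team_a, team_b)
print(f"Team A wins: {win_a:.1%}  Draws: {draw:.1%}  Team B wins: {win_b:.1%}")
```

The exact percentages will shift a little with the seed and the number of matches simulated.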

Team A, based on just two goal attempts, can only score a maximum of two goals in a shot-based simulation, compared to a maximum, if highly unrealistic, 12 goals should all of team B's lower probability attempts find the back of the net. The goal expectation based on the shots taken during this virtual game is the same for each side, but is there a long-term advantage to creating a couple of gilt-edged opportunities, instead of taking more numerous shots that are each less likely to be converted?

In short, do distributions matter, as well as raw goal expectation?
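A quick back-of-the-envelope check hints at why the shape of the distribution might matter. Assuming every shot is an independent event, as the simulation does, we can compare each side's chance of failing to score at all:

```latex
\begin{align*}
P(\text{Team A scores } 0) &= (1 - 0.6)^{2} = 0.16 \\
P(\text{Team B scores } 0) &= (1 - 0.1)^{12} \approx 0.28
\end{align*}
```

So although both sides average 1.2 goals, the twelve-shot side draws a blank noticeably more often under these assumptions.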

Running the simulation appears to show a distinct advantage for the team that creates isolated "big chances" compared to a side that steadily accumulates regular, good but not great opportunities. If a side generates more attempts in reaching the same post game goal expectation, their range of likely winning scorelines increases, but over the long stretch, in this artificially extreme example, a couple of big chances appear to get you more wins.

Simulating 20,000 Matches Between Two Sides That Generated Identical Goal Expectations Over Widely Differing Shot Counts.

Team                                    Av. Goals Scored   Av. Goals Allowed   % Wins   % Draws   % Losses
1.2 Exp. Goals Spread Over Two Shots    1.20               1.20                37.4     30.4      32.1
1.2 Exp. Goals Spread Over 12 Shots     1.20               1.20                32.1     30.4      37.4

This result may simply be an artifact of my method, although a less extreme example, where a goal expectation of 1.2 for each side was spread over 10 and 12 shot attempts respectively, gave a similar advantage to the side creating one clearer-cut opportunity. Over 100,000 iterations, a side having nine chances each with a goal expectation of 10% and one of 30% won the "game" 36% of the time, compared to 35% for the team that created twelve 10% chances.
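One way to check whether the gap is simulation noise is to skip the random draws and work out the scoreline probabilities exactly. The sketch below does that for both examples, again assuming independent shots with fixed scoring probabilities; the function names are hypothetical and this is not the method used to produce the figures above.

```python
def goal_distribution(shot_probs):
    """Exact probability of scoring 0, 1, 2, ... goals from independent shots."""
    dist = [1.0]  # before any shot, zero goals is certain
    for p in shot_probs:
        new = [0.0] * (len(dist) + 1)
        for goals, prob in enumerate(dist):
            new[goals] += prob * (1.0 - p)  # this shot is missed
            new[goals + 1] += prob * p      # this shot is scored
        dist = new
    return dist


def match_outcomes(shots_a, shots_b):
    """Exact probabilities of an A win, a draw, and a B win."""
    win_a = draw = win_b = 0.0
    for goals_a, prob_a in enumerate(goal_distribution(shots_a)):
        for goals_b, prob_b in enumerate(goal_distribution(shots_b)):
            joint = prob_a * prob_b
            if goals_a > goals_b:
                win_a += joint
            elif goals_a < goals_b:
                win_b += joint
            else:
                draw += joint
    return win_a, draw, win_b


twelve_shots = [0.1] * 12
examples = {
    "two 60% chances": [0.6] * 2,
    "nine 10% chances plus one 30% chance": [0.1] * 9 + [0.3],
}
for label, shots in examples.items():
    win, draw, loss = match_outcomes(shots, twelve_shots)
    print(f"{label}: win {win:.1%}, draw {draw:.1%}, loss {loss:.1%}")
```

Under these assumptions the two-shot side comes out at roughly 37.3% wins, 30.9% draws and 31.8% losses, and the ten-shot side at roughly 36.0% wins against 35.5% for the twelve-shot side, which is in line with the simulated figures.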

Accrued goal expectation may be the prime driver in evaluating whether an actual result was deserved, but the ability to create a couple of outstanding chances (which may or may not be repeatable at team or player level) may also play a minor, yet important role. In certain circumstances, the benefit may be greater than simply the sum of the parts.

The whole concept of shot-based models is developing at a rapid rate, so for further reading I strongly recommend you track these guys down on Twitter: @colinttrainor, whose excellent work appears on StatsBomb, @footballfactman, who writes as Paul Riley about keeper evaluation, @11tegen11, who is pushing the use of expected goals models in an exciting predictive direction, and @The_Woolster, who has must-read views on Opta's big chance metric. Apologies to anyone I've omitted; feel free to post in the comments section.


  1. Excellent article. You may want to look at the effect of game state on accuracy. In the last 67 games in the top 5 leagues in Europe that were 0-1 at HT after an early away goal (0-20 mins), there were only 5/67 home wins, so a low expectation of a home win. The is doing some good work in this area as well.

    1. Hi Jonny,
      I'm sure we both agree that game state (not just the current scoreline) is an extremely important factor in determining the balance of risk/reward strategies employed by each side. And that in turn may alter how they accumulate less commonly recorded data, such as shots at goal.

      You make a very good point.


  2. Hey, I’m a chemical engineer, so while my knowledge of statistics is limited to its application in my field, I would really like to get into football analytics, and I’d love to contribute, if possible, to your expected goals model. Could you tell me where the best publicly available data is, and, more importantly, the best way to access it? Thank you, Chris. The best place to contact me is

    1. Hi Chris, I'll get back to you over the weekend. Mark

  3. Interesting.

    However, your simulation was about two evenly matched teams. Would the results be the same if you put two uneven teams together? For example, compare a team with 4 x 50% shots against the two teams above.

  4. The simulation may be too abstract, in that when a goal is scored it will change a team's style and the number of opportunities they create. A side capable of having ten shots in a match can more easily have eleven than a side with two chances can have a third.

    I think Opta's 'big chances' measure is badly put together. One-on-ones are generally harder chances than people realise, and e.g. unchallenged shots from a player running onto his best foot 20-25 yards out and pullbacks to the edge of the penalty area are easier chances.

  5. There might be a problem with the simulation method... or it is something I misunderstood: in the table showing the results of the simulation, an "Av. Goals Allowed" of 1.2 is stated in both cases. That doesn't look like guaranteeing an expected value of 1.2 goals.

  6. Hi Karolos,
    it's a typo, should read "expected goals allowed".

    cheers, Mark

  7. Hey, I just checked your numbers with statistics only and came to the conclusion that the numbers are correct. It's nothing to do with your method. If you have 2 shots you have a way higher chance of scoring one or two goals, but a lower chance of scoring more. The 12-shot team has a higher (but still low) chance of scoring more than 3 goals, but that doesn't really help because they would win with 3 goals anyway. And that's where they lose the winning effect of their expected goals. I can give you the spreadsheet if you want (just reply).