Sunday, 30 December 2018

Game State xG Models

Mark Thompson and Ashwin Raman's recent glossary of football analytics reminded me that I wrote one of the first posts about the effect of game state: how it shapes the way each side approaches goal attempts, and how well they are able to convert those chances.

You can read my 2012 post here & the aforementioned duo's glossary here

Six years down the line and we don't seem to have, publicly at least, moved on much from the less than startling conclusion that trailing teams tend to make a greater effort to score by shooting more often, if a little more desperately.

I feel much of the problem is down to a lack of framework to adequately describe game state.

I've posted occasional alternatives to simply using basic score differential over time. The approach I've found most useful involves quantifying the way each team's pregame expectation decays as time elapses and the score alters.

This is most pertinent in a tied game. If you're the pregame favourite, a level score line isn't that surprising after 10 minutes, but might be positively disappointing with just ten minutes remaining.

In the former match position, the expected number of league points you might win hasn't moved that far from the initial estimate at kick-off, whereas it has (for the worse) in the latter.

So that's likely to impact on the balance of risk/reward for both teams.

Building a variable that quantifies this change in pregame expectation, as an alternative to using just the raw score line to define game state, is a tedious but fairly basic chore (which I've finally gotten around to doing).

Change in expectation is, imo, much superior to score line. It mimics the nuance of a match situation far better than goals alone, because it incorporates many factors that are missing when you focus entirely on who's beating whom (or not).

Notably, the time remaining is accounted for.
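To make the idea concrete, here is a minimal sketch of one way such a variable could be built. It is not necessarily the calculation used for this post. It assumes each side scores as an independent Poisson process at a constant rate, so the goals still to come simply scale with the time remaining. The function names and the pregame goal expectations fed into them are hypothetical, purely for illustration.

```python
import math


def poisson_pmf(k, mu):
    """Probability of exactly k goals when mu are expected."""
    return math.exp(-mu) * mu ** k / math.factorial(k)


def expected_points(mu_for, mu_against, goal_diff=0, minutes_played=0.0,
                    match_minutes=90.0, max_goals=12):
    """Expected league points (3 for a win, 1 for a draw) from here on.

    mu_for / mu_against are full-match pregame goal expectations.
    Assumption: the remaining goals are independent Poisson, scaled by
    the fraction of the match still to play.
    """
    remaining = max(0.0, 1.0 - minutes_played / match_minutes)
    lam_for = mu_for * remaining
    lam_against = mu_against * remaining

    win = draw = 0.0
    for scored in range(max_goals + 1):
        p_scored = poisson_pmf(scored, lam_for)
        for conceded in range(max_goals + 1):
            p = p_scored * poisson_pmf(conceded, lam_against)
            final_diff = goal_diff + scored - conceded
            if final_diff > 0:
                win += p
            elif final_diff == 0:
                draw += p
    return 3.0 * win + draw


def change_in_expectation(mu_for, mu_against, goal_diff, minutes_played):
    """Game state as current expected points minus the kick-off figure."""
    at_kickoff = expected_points(mu_for, mu_against)
    now = expected_points(mu_for, mu_against, goal_diff, minutes_played)
    return now - at_kickoff


# Hypothetical favourite: expected to score 1.8 and concede 0.9.
early = change_in_expectation(1.8, 0.9, goal_diff=0, minutes_played=10)
late = change_in_expectation(1.8, 0.9, goal_diff=0, minutes_played=80)
```

Under these assumptions `late` is considerably more negative than `early` for the favourite, while the same tied score line produces a positive change for the underdog. That is the asymmetry a raw score differential of zero can't express.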

When I first wrote about game state there was, but for the generosity of Opta, very little granular data around. Now shot location etc. is widely available and is the mainstay of every xG model on the market.

I therefore built two different xG models that use all the usual features and also a term for game state.

One simply used score differential to designate game state (although I'd much prefer that this metric was referred to as score differential.....because....that's what it is), and a second incorporated the change in pregame expectation in its place.

I then repeatedly tested these two models on out-of-sample, regular play goal attempts to see which model performed better.

There are numerous ways to test out of sample prediction, but I chose to group the predictions and outcomes into ascending groups and then quantify how likely it was that the averaged predicted and actual outcome from each group originated from the same distribution.
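A grouped comparison of this kind can be sketched along the lines of a Hosmer-Lemeshow style statistic. This is an assumption about the general shape of such a test rather than the exact procedure used here. Attempts are sorted by predicted xG, split into ascending groups, and the gap between predicted and actual goals in each group is scaled by its binomial variance. The function name and the choice of ten groups are illustrative.

```python
def grouped_calibration_statistic(predicted, outcomes, n_groups=10):
    """Hosmer-Lemeshow style statistic; smaller means better calibrated.

    predicted: model xG for each attempt (probabilities between 0 and 1)
    outcomes:  1 if the attempt was scored, 0 otherwise
    """
    # Sort attempts by predicted probability only, lowest first.
    pairs = sorted(zip(predicted, outcomes), key=lambda pair: pair[0])
    n = len(pairs)

    statistic = 0.0
    for g in range(n_groups):
        group = pairs[g * n // n_groups:(g + 1) * n // n_groups]
        if not group:
            continue
        size = len(group)
        expected_goals = sum(p for p, _ in group)
        actual_goals = sum(y for _, y in group)
        mean_p = expected_goals / size
        variance = size * mean_p * (1.0 - mean_p)
        if variance > 0:
            statistic += (actual_goals - expected_goals) ** 2 / variance
    return statistic
```

To turn the statistic into a probability that predictions and outcomes share a distribution, it is usually referred to a chi-squared distribution (for instance via `scipy.stats.chi2.sf`). The appropriate degrees of freedom for out-of-sample data are a judgment call and are left out of the sketch. Running the function on the same held-out attempts for both models, over repeated splits, gives a like-for-like comparison.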

In repeated trials, around 70% of the time the "change in pregame expectation" approach to game state performed better than just using the score.

One of the main reasons for the improved performance is how tied games are described.

In a typical Premier League season, 75% of the shots taken in a tied game were taken by the side whose current situation, in terms of expected league points won at the end of the match, had declined compared to kickoff.
