Sunday 30 December 2018

Game State xG Models

Mark Thompson and Ashwin Raman's recent glossary of football analytics reminded me that I wrote one of the first posts about the effect of game state: how it shapes the way each side approaches goal attempts, and how well they are able to convert those chances.

You can read my 2012 post here & the aforementioned duo's glossary here

Six years down the line and we don't seem to have, publicly at least, moved on much from the less than startling conclusion that trailing teams tend to make a greater effort to score by shooting more often, if a little more desperately.

I feel much of the problem is down to a lack of framework to adequately describe game state.

I've posted occasional alternatives to simply using basic score differential over time. The approach I've found most useful involves quantifying the way each team's pre game expectation has decayed as time elapses and the score alters.

This is most pertinent in a tied game. If you're the pregame favourite, a level score line isn't that surprising after 10 minutes, but might be positively disappointing with just ten minutes remaining.

In the former match position, the expected number of league points you might win hasn't moved that far from the initial estimate at kick-off, whereas it has (for the worse) in the latter.

So that's likely to impact on the balance of risk/reward for both teams.

Building such a variable that quantifies this change in pre game expectation as an alternative to using just the raw score line to define game state is a tedious, but fairly basic chore. (Which I've finally gotten around to doing).
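For anyone wanting to try this at home, here's a minimal sketch of one way such a variable could be built. It is not necessarily the method behind the numbers in this post. It assumes independent Poisson scoring for each side and, as a simplification, a flat scoring rate across the 90 minutes. The function names and the 1.8 v 0.9 goal expectancies used in the usage note are made up for illustration.

```python
from math import exp, factorial


def poisson_pmf(k, mean):
    """Probability of exactly k goals when the expected number is `mean`."""
    return exp(-mean) * mean ** k / factorial(k)


def expected_points(team_rate, opp_rate, minute=0, team_goals=0, opp_goals=0,
                    max_goals=10):
    """Expected league points (3/1/0) for `team` from the current match position.

    team_rate and opp_rate are full-match goal expectancies at kick-off.
    Assumption: goals arrive at a flat rate, so the expectancy still to come
    is simply scaled by the share of the 90 minutes remaining.
    """
    remaining = max(0.0, (90 - minute) / 90)
    team_mean = team_rate * remaining
    opp_mean = opp_rate * remaining
    points = 0.0
    # Sum over every plausible number of further goals for each side.
    for i in range(max_goals + 1):
        for j in range(max_goals + 1):
            prob = poisson_pmf(i, team_mean) * poisson_pmf(j, opp_mean)
            final_margin = (team_goals + i) - (opp_goals + j)
            if final_margin > 0:
                points += 3 * prob
            elif final_margin == 0:
                points += prob
    return points


def change_in_expectation(team_rate, opp_rate, minute, team_goals, opp_goals):
    """Game state as current expected points minus the kick-off expectation."""
    now = expected_points(team_rate, opp_rate, minute, team_goals, opp_goals)
    at_kickoff = expected_points(team_rate, opp_rate)
    return now - at_kickoff
```

Calling `change_in_expectation(1.8, 0.9, 10, 0, 0)` and `change_in_expectation(1.8, 0.9, 80, 0, 0)` for a made-up favourite gives a small negative number for the first and a much larger negative number for the second, which is the tied game effect described above. Swap the two rates around and the underdog's change is positive.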

Change in expectation is, imo, much superior to score line. It mimics the nuance of a match situation far better than goals alone, because it incorporates many factors that are missing when you focus entirely on who's beating whom (or not).

Notably, the time remaining is accounted for.

When I first wrote about game state there was, but for the generosity of Opta, very little granular data around. Now shot location and the like are widely available and are the mainstay of every xG model on the market.

I therefore built two different xG models that use all the usual features plus a term for game state.

One simply used score differential to designate game state (although I'd much prefer that this metric was referred to as score differential.....because....that's what it is), and a second incorporated the change in pregame expectation in its place.

I then repeatedly tested these two models on out of sample, regular play goal attempts to see which model performed best.

There are numerous ways to test out of sample prediction, but I chose to group the predictions and outcomes into ascending groups and then quantify how likely it was that the averaged predicted and actual outcome from each group originated from the same distribution.
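As a rough sketch of this kind of grouped test, the code below uses a Hosmer-Lemeshow style statistic. That's my assumption about a reasonable stand-in, not a claim about the exact test used for the results in this post. Predictions are sorted, cut into equal sized ascending groups, and the goals scored in each group are compared with the goals predicted. The function names are hypothetical, and the p-value uses a closed form for the chi-square tail that only holds for an even number of degrees of freedom, hence the default of ten groups.

```python
from math import exp


def chi2_sf_even(x, df):
    """Upper tail of a chi-square distribution (exact closed form, even df only)."""
    if df % 2 != 0:
        raise ValueError("this closed form needs an even number of degrees of freedom")
    half = x / 2.0
    term, total = 1.0, 1.0
    for k in range(1, df // 2):
        term *= half / k
        total += term
    return exp(-half) * total


def grouped_calibration(predictions, outcomes, groups=10):
    """Compare predicted and actual goals across ascending groups of xG.

    predictions: xG per attempt (0 to 1), outcomes: 1 for a goal, 0 otherwise.
    Returns (statistic, p_value). A smaller statistic and larger p-value mean
    the grouped predictions and outcomes are more consistent with each other.
    Assumption: for out of sample attempts the statistic is referred to a
    chi-square distribution with `groups` degrees of freedom.
    """
    # Sort on the prediction only, so ties are not ordered by outcome.
    pairs = sorted(zip(predictions, outcomes), key=lambda pair: pair[0])
    n = len(pairs)
    stat = 0.0
    for g in range(groups):
        chunk = pairs[g * n // groups:(g + 1) * n // groups]
        size = len(chunk)
        if size == 0:
            continue
        expected = sum(p for p, _ in chunk)
        observed = sum(y for _, y in chunk)
        variance = expected * (1 - expected / size)
        if variance <= 0:
            continue  # every prediction in the group was 0 or 1
        stat += (observed - expected) ** 2 / variance
    return stat, chi2_sf_even(stat, groups)
```

Run both xG models over the same out of sample attempts and the one with the larger p-value has the better agreement between grouped predictions and outcomes for that trial.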

In repeated trials, around 70% of the time the "change in pregame expectation" approach to game state performed better than just using the score.

One of the main reasons for the improved performance is how tied games are described.

In a typical Premier League season, 75% of the shots taken in a tied game were taken by the side whose current situation, in terms of expected league points won at the end of the match, had declined compared to kickoff.

Friday 28 December 2018

The Benefits of Being Subbed On.

I first posted about the advantages enjoyed by a sub 20+ years ago on something called usenet, before it got swamped in an avalanche of spam and became unusable.

I've added a few blogs here as well. Here's one centred around Edin Dzeko.

The basic concept is very straightforward.

Goal scoring rates gradually increase as the match progresses, so if you're always getting subbed on in the 70th minute, you're playing in a very different scoring environment compared to someone who starts the match.

You've also been lounging around for an hour or so, while everyone else has been running their socks off.

That's not entirely the whole story. Game state and team talent differential also have a say.

A much better team won't be cranking up the attacking process quite as much in the first half of a tied match, compared to the final twenty minutes, if the score line status quo has been maintained.

Changing game states, and tied games with a large talent differential between the teams, are biggish deals, both of which we'll ignore in this post.

Working out a rough and ready goal environment for players based on when they were on the field is a fairly trivial task.

All you need is a decay factor, an initial expected scoring rate and a spreadsheet and you can easily calculate the goal expectancy (not to be confused with expected goals) for any minute in a match.

For an average team, the goal expectancy during the 80th minute is around 33% bigger than the equivalent during the 10th minute. So the frantic last ten minutes is very different to the languid first ten.
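Here's a minimal sketch of that spreadsheet in code form. It assumes a constant per-minute growth in scoring rate, chosen so the 80th minute comes out around 33% above the 10th, and it ignores added time. The baseline of 1.23 goals per game for an average side is my own illustrative figure, and the function name is hypothetical.

```python
def minute_expectancies(goals_per_game=1.23, late_ratio=1.33, minutes=90):
    """Goal expectancy for each minute of a match, 1st minute at index 0.

    late_ratio is how much bigger the 80th minute expectancy is than the
    10th, which pins down a constant per-minute growth factor over those
    70 minutes. The values are scaled to sum to goals_per_game.
    """
    growth = late_ratio ** (1 / 70)
    weights = [growth ** m for m in range(minutes)]
    scale = goals_per_game / sum(weights)
    return [w * scale for w in weights]


per_minute = minute_expectancies()
tenth, eightieth = per_minute[9], per_minute[79]
print(f"10th minute: {tenth:.4f}, 80th minute: {eightieth:.4f}")
```

The list sums to the full game goal expectancy, so adding up any slice of it gives the goal environment for that spell of the match.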

Shaq.....Once a Red & White, always a Red......

I finally got around to working out a way to quantify the "subs premium" when Shaq's offensive production for Liverpool started popping up on Twitter over Christmas.

I thought (wrongly, as it mostly turned out) that his stats had been padded by playing most of his minutes late on in games as a sub.

His numbers have definitely benefitted from small sample extremes, (which may or may not be maintained), but that's another issue entirely.

He's made 14 appearances for Liverpool in the Premier League, six from the bench. Depending upon how you treat added time, that's around 760 minutes of playing time.

To see by how much Shaq's benefitted from playing when the goal environment has been cranked up, we just need to take a baseline figure for a side's goal expectancy over a full game.

Then work out the goal expectancy for each individual minute.

Add up the relevant goal expectancies tied to each minute Shaq has played.

Compare this to the theoretical goal expectancy he would have if every minute he played was equally spread across the 90+ minutes of a game.
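The four steps above can be sketched as below. This is a rough illustration under stated assumptions, not the exact workings behind the figures in this post: the per-minute curve uses a constant growth factor and a 1.23 goals per game baseline (both my assumptions), added time is ignored, and the stints in the usage note are invented, not Shaq's real minutes. All the names are hypothetical.

```python
def minute_expectancies(goals_per_game=1.23, late_ratio=1.33, minutes=90):
    """Per-minute goal expectancy, 1st minute at index 0 (assumed growth curve)."""
    growth = late_ratio ** (1 / 70)
    weights = [growth ** m for m in range(minutes)]
    scale = goals_per_game / sum(weights)
    return [w * scale for w in weights]


def sub_premium(stints, per_minute):
    """Compare a player's actual goal environment with par.

    stints: list of (first_minute, last_minute) pairs, 1-based and inclusive,
    one pair per appearance.
    Returns (actual, par, premium), where par spreads the same number of
    minutes evenly over the match and premium is actual / par - 1.
    """
    actual = sum(per_minute[m - 1]
                 for first, last in stints
                 for m in range(first, last + 1))
    minutes_played = sum(last - first + 1 for first, last in stints)
    par = minutes_played * sum(per_minute) / len(per_minute)
    return actual, par, actual / par - 1


per_minute = minute_expectancies()
# Invented example: three full games and two twenty minute cameos.
stints = [(1, 90), (1, 90), (1, 90), (71, 90), (71, 90)]
actual, par, premium = sub_premium(stints, per_minute)
print(f"actual {actual:.2f}, par {par:.2f}, premium {premium:.1%}")
```

A player who only ever plays full games gets a premium of zero, a serial late sub gets a positive one, and someone regularly hooked at half time gets a negative one.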

Bottom line, Shaq's 760 minutes, equally spread over a match, based on him playing for the average Premier League team he used to grace, equates to a goal expectancy of 10.39 goals.

This compares to 10.59 goals based on the actual minutes he's been on the field.

His split of sub and starting appearances has benefitted him by around a 2% increase in goal scoring environment compared to par.

His numbers aren't particularly boosted by him having disproportionately large opportunities to feast on the weary, late in games.

Good buy.