Monday 22 January 2018

After the Shot xG2

Expected goals has been variously described, by advocates and opponents respectively, as a more accurate summary of what "should" have happened on the pitch or as an appendage to the final scoreline that is neither useful nor enlightening.

The first description is perhaps too overtly optimistic for a "work in progress" that is evolving into a useful tool for player projection and team prediction.

The second, less flattering description may also stand up to some scrutiny, particularly if the supporters of the stat ignore the uncertainty intrinsic in its calculation, while the detractors may be blithely ignorant of such limitations.

Both camps are genuinely attempting to quantify the true talent levels of players and teams in a format that allows for more insightful debate and, in the case of the nerds, one that is less prone to cognitive bias.

The strength of model based opinion is that it can examine processes that are necessary for success (or failure), drawing from a huge array of similar scenarios from past competitions.

It does so without straying too far down the route from chance creation to chance conversion (or not), so that the model avoids becoming too anchored in the specifics of the past, which would render any projections about the future flawed.

Overfitting past events is a model's version of eye test biases, but that shouldn't mean we throw out everything that happens post chance creation for fear of producing an overconfident model that sticks immutably to past events and fails to flexibly project the future.

It's no great stretch to model the various stages from final pass to the ball crossing the goal line (or not).

Invariably, the process of chance creation alone has been prioritised as a better predictor of future output and post shot modeling has remained either a neglected sidetrack or merely the niche basis for xG2 keeper shot stopping.

But if used in a less dogmatic way, mindful of the dangers of over fitting, the "full set" of hurdles that a decisive pass must overcome to create a goal (or not) may become a useful component in an integrated approach that utilises both numeric and visual clues to deciphering the beautiful game.

Let's look at chances and goals created from set pieces and corners.

Here's the output from two expected goals models for chances and on target attempts conceded by the current Premier League teams in the top flight since early 2014.

The xG column is a pre shot model, typically used to project a side's attacking or defensive process, that uses accumulated information, but is ignorant of what happened once contact with the ball was made.

The xG2 column is based entirely upon shots or headers that require a save and uses a variety of post shot information, such as placement, power, trajectory and deflections. Typically this model would be the basis for measuring a keeper's shot stopping abilities.

A superficial overview of the difference between the xG allowed from set pieces and actual goals allowed leads to the by now familiar "over or under performing" tag.

Stoke had been transformed into a spineless travesty of their former defensive core at set plays, conceding both chunks of xG and under performing wantonly by allowing 42 actual goals against 37 expected.

There's little disconnect in the Potters' xG2, which examines those attempts that needed a save, but the cases of Spurs and Manchester United perhaps show that deeper descriptive digging may provide more insight or at least add nuance.

Tottenham allowed a cumulative 29.6 xG, conceding just 23 goals.

We know from keeper models that Lloris is generally an excellent shot stopper and the xG2 model confirms that, along with the ever present randomness, the keeper's reactions are likely to have played a significant role in defending set play chances.

In allowing 23 goals, Lloris faced on target attempts that were worth just over 31 goals to an average keeper.

Spurs conceded chances worth 29.6 xG. Looked at in terms of xG2, this value rises to 31.3, so, still mindful of randomness, Spurs' defenders might have been a little below par in suppressing the xG2 attempts that came about from the xG chances they allowed, but Lloris performed outstandingly to reduce the level of actual goals to just 23.

Superficially, Manchester United appears identical.

As a side they allowed 37.6 xG, but just 32 actual goals. We know that De Gea is an excellent shot stopper, so in the absence of xG2 figures we might assume he performed a similar service for his defence as Lloris did for his.

However, United's xG2 is just 33.1 and the difference between this and the actual 32 goals allowed is positive, but relatively small compared to Lloris at Spurs.
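The arithmetic behind the Spurs and United comparison can be laid out as a small sketch. The figures are the ones quoted in this post; the class and method names are illustrative assumptions rather than part of any model described here, and reading the xG to xG2 shift as "what happened between chance and shot" is only one interpretation, since xG covers all chances while xG2 covers only attempts that needed a save.

```python
from dataclasses import dataclass


@dataclass
class SetPieceRecord:
    """Set piece figures quoted in the post (names here are hypothetical)."""
    team: str
    xg: float    # pre shot expected goals conceded
    xg2: float   # post shot expected goals from on target attempts
    goals: int   # actual goals conceded

    def shot_quality_shift(self) -> float:
        # Positive: on target attempts were worth more than the chances implied.
        return round(self.xg2 - self.xg, 1)

    def keeper_delta(self) -> float:
        # Positive: fewer goals conceded than an average keeper would allow.
        return round(self.xg2 - self.goals, 1)

    def overall_delta(self) -> float:
        # The familiar "over or under performing" number: xG minus goals.
        return round(self.xg - self.goals, 1)


records = [
    SetPieceRecord("Tottenham", xg=29.6, xg2=31.3, goals=23),
    SetPieceRecord("Manchester United", xg=37.6, xg2=33.1, goals=32),
]

for r in records:
    print(
        f"{r.team}: xG to xG2 shift {r.shot_quality_shift():+.1f}, "
        f"xG2 minus goals {r.keeper_delta():+.1f}, "
        f"xG minus goals {r.overall_delta():+.1f}"
    )
```

Both sides beat their xG by a similar margin (6.6 and 5.6 goals), but the split differs: for Spurs almost all of it sits between xG2 and goals (8.3), while for United most of it sits between xG and xG2 (a drop of 4.5), with only 1.1 left between xG2 and goals.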

By extending the range of modeling away from a simple over/under xG performance we can begin to examine credible explanations for the outputs we've arrived at.

Are United's defenders exerting so much pressure, even when allowing attempts consistent with an xG of 37.6, that the power, placement etc. of those on target efforts are diluted by the time they reach De Gea?

Are the attackers themselves under performing despite decent xG locations? (Every xG model is always a two way interaction between attackers and defenders).

Is it just randomness or is it a combination of all three?
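One rough way to put a number on the randomness question is to ask how often an average keeper would concede no more than the actual total, given the xG2 faced. The sketch below assumes goals conceded follow a Poisson distribution with mean equal to the cumulative xG2. That is a simplifying assumption of this example, not the method behind the models in this post; with shot level data, simulating each attempt individually would be the better approach.

```python
import math


def poisson_cdf(k: int, mean: float) -> float:
    """P(X <= k) for X ~ Poisson(mean). The Poisson model is an assumption."""
    return sum(
        math.exp(-mean) * mean**i / math.factorial(i) for i in range(k + 1)
    )


# Cumulative xG2 faced and actual goals conceded, as quoted in the post.
spurs_p = poisson_cdf(23, 31.3)    # Lloris: 23 goals from 31.3 xG2
united_p = poisson_cdf(32, 33.1)   # De Gea: 32 goals from 33.1 xG2

print(f"P(Spurs concede 23 or fewer | xG2 31.3)  = {spurs_p:.3f}")
print(f"P(United concede 32 or fewer | xG2 33.1) = {united_p:.3f}")
```

Under this assumption the Spurs probability comes out much smaller than United's, which fits the reading that Lloris's record is harder to put down to chance alone, while United's gap between xG2 and goals is well within ordinary variation.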

Using under and over performing shorthand is fine. But we do have the data to delve more into the why, and taking this xG and xG2 data driven reasoning over to the video analysis side is the logical, integrated next step.
