Friday 21 September 2018

A Brief History of Non-Shot xG Models.

There are lots of new metrics turning up from non-shot models.

Normal xG is relatively straightforward.
The variables used may differ between models, but there is a core similarity based around shot type and location.

But as more and more “NSxG” models appear it is becoming apparent that one person’s NSxG model can be a completely different beast to someone else’s.

Here are my broad definitions of what I mean when I use these terms, based around the models we have developed at Infogol.

1)    Non-Shot xG

As the name suggests, shots, or more generally attempts at goal, do not hold a position of importance in a NSxG model.

They are simply another data point.

Possession, rather than goal attempts, is central to this approach, and the outcome variable is whether a goal was scored.

Possession of the ball deep in your own territory will have a relatively small NSxG value because many more such possessions will end with possession being turned over than a goal being scored.

Possession closer to the opponent’s goal is more likely to result in a goal and therefore will have a higher generic NSxG.

The pitch will be defined by a NSxG framework whereby every position on the field will have a NSxG value for the team in possession and the team attempting to take possession.

This is partly analogous to a normal xG probability map, but it is unlikely that the NSxG value will be the same as the xG value for the same position on the pitch.
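As a rough illustration of the idea, here is a minimal sketch of how a zone-by-zone NSxG value could be estimated, assuming each possession is recorded as a (zone, ended in goal) pair. The zone names, the data format and the sample numbers are all made up for the example; this is not the Infogol model, which will use far richer inputs.

```python
from collections import defaultdict


def zone_values(possessions):
    """Estimate a crude NSxG per zone as goals / possessions in that zone.

    `possessions` is an iterable of (zone, ended_in_goal) pairs.
    This input format is hypothetical, chosen only for illustration.
    """
    goals = defaultdict(int)
    total = defaultdict(int)
    for zone, ended_in_goal in possessions:
        total[zone] += 1
        goals[zone] += int(ended_in_goal)
    return {zone: goals[zone] / total[zone] for zone in total}


# Made-up sample: deep possessions rarely end in a goal,
# possessions near the opponent's goal do so more often.
sample = (
    [("own_third", False)] * 99
    + [("own_third", True)]
    + [("final_third", False)] * 18
    + [("final_third", True)] * 2
)
values = zone_values(sample)
print(values)
```

On this invented sample the deep zone comes out with a much smaller value than the advanced zone, which is the pattern described above.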

2)    Change in NSxG

Hopefully self-explanatory. The difference (positive or negative) in NSxG terms between one position on the field and another.

3)    A team’s NSxG value for a match.

Both NSxG and xG attempt to describe the process by which a side tries to produce a favourable outcome, namely scoring more goals than it concedes.

Both are expressed in expected goals, although one method (xG) looks at a limited subset of events that occurred in the match (goal attempts) and the other (NSxG) looks at every event that occurred, accumulated into separate possession chains.
They are entirely different models, albeit with the same ultimate aim of describing the events of a football match.

NSxG and (shot based) xG values should be broadly similar when summed together for a single game, although the NSxG contains much more granular information than an xG model and so small variations should be expected (and even hoped for).

The measured unit in xG is the expected goals value at the point of the goal attempt.

The measured unit in NSxG is the expected goals value at the initiation of each possession.
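A minimal sketch of the match total, assuming it is simply the sum of the NSxG values assigned at the start of each of a team's possessions. The function name and the sample values are hypothetical and only there to show the arithmetic.

```python
import math


def match_nsxg(possession_start_values):
    """Sum the NSxG value assigned at the initiation of each possession."""
    return math.fsum(possession_start_values)


# Made-up start-of-possession values for one team in one match.
example_possessions = [0.004, 0.010, 0.035, 0.002, 0.120]
team_total = match_nsxg(example_possessions)
print(round(team_total, 3))
```

The shot based equivalent would sum one value per goal attempt instead of one value per possession.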

4)    NSxG risk / reward.

When a player attempts to move the ball from one field position to another, the outcome combines two things: whether possession is kept, and whether the NSxG value of the possession improves or falls at that point in the individual possession chain.

If we include the likelihood that the action will be successful based on either an average passing or ball progression model, we can determine if the action will have a positive or negative expectation from the view point of an average team.

We can further see if certain teams are taking riskier, negative expectation passes or actions but, because they repeatably over-perform in completing them, are turning negative expectation moves into positive expectation ones.
This ultimately adds context to possession data.
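One way to make the risk / reward idea concrete is the sketch below. It assumes the expectation of an action is the completion probability times the NSxG value if it succeeds, minus the failure probability times the NSxG value handed to the opponent, all measured against the NSxG value of simply staying put. That formula, the function name and the numbers are my assumptions for illustration, not a description of any particular model.

```python
def action_expectation(p_success, value_now, value_if_success, opp_value_if_fail):
    """Expected change in NSxG from attempting an action.

    Assumed form: success moves the possession to `value_if_success`,
    failure hands the opponent a possession worth `opp_value_if_fail`.
    """
    expected_after = p_success * value_if_success - (1 - p_success) * opp_value_if_fail
    return expected_after - value_now


# Made-up values for one ambitious forward pass.
value_now = 0.02          # NSxG of the possession before the pass
value_if_success = 0.06   # NSxG if the pass is completed
opp_value_if_fail = 0.03  # opponent's NSxG if the pass is intercepted

# Completion rate from an average passing model (hypothetical).
average_team = action_expectation(0.30, value_now, value_if_success, opp_value_if_fail)
# A side that repeatably completes this pass far more often (hypothetical).
skilled_team = action_expectation(0.80, value_now, value_if_success, opp_value_if_fail)

print(average_team < 0, skilled_team > 0)
```

With these invented inputs the same pass is a negative expectation move for the average side and a positive expectation one for the side with the higher completion rate.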

5)    NSxG Timelines.

Using the cumulative shot based xG for each side as the match progresses has its uses, but also its critics.

Shots at goal account for less than 2% of game events, whereas many dangerous moves may stall just before an attempt is made.

Therefore, a NSxG approach that incorporates every possession may reveal more about how the match played out.

Simulations, while not immune to score effects, add another layer of information, indicating how likely it is that the match is either currently level or being led by one of the teams.

If we use goal attempts and their xG to simulate these likely states, we often only have around 30 simulation points.

By using NSxG we can increase not only the wealth of match data that is included, but also the number of simulation points, by looking at every possession rather than just every goal attempt.
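A minimal simulation sketch, assuming each possession can be treated as an independent trial that produces a goal with probability equal to its NSxG value. Independence is a simplifying assumption, and the function name and possession values are invented for the example.

```python
import random


def simulate_outcomes(home_values, away_values, n_sims=10000, seed=0):
    """Estimate match-state probabilities from per-possession NSxG values.

    Each possession is simulated as an independent goal / no-goal trial,
    which is an assumption made for this sketch.
    """
    rng = random.Random(seed)
    counts = {"home_win": 0, "draw": 0, "away_win": 0}
    for _ in range(n_sims):
        home_goals = sum(rng.random() < p for p in home_values)
        away_goals = sum(rng.random() < p for p in away_values)
        if home_goals > away_goals:
            counts["home_win"] += 1
        elif home_goals < away_goals:
            counts["away_win"] += 1
        else:
            counts["draw"] += 1
    return {state: n / n_sims for state, n in counts.items()}


# Made-up match: 100 possessions each, the home side's worth more on average.
home = [0.020] * 100
away = [0.008] * 100
probabilities = simulate_outcomes(home, away)
print(probabilities)
```

Passing in only the possessions up to a given minute gives the likely state of the match at that point, which is how a timeline could be built up.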

6)    Player Ratings

Shot based xG majors on attacking players and playmakers.

NSxG incorporates the small but frequent gains made by players further down the supply chain and can also be used to show how a side's effectiveness changes if an efficient ball circulator (who may not accrue much positive NSxG) is absent.

This provides a gateway to isolating the on-ball contribution made by all players to creating goals or preventing them from being scored.
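As a hedged sketch of how such ratings might be accumulated, the example below credits each player with the change in NSxG across their on-ball actions. The crediting rule, the record format and the player labels are assumptions for illustration; a negative "after" value stands for a turnover that hands the opponent a possession of that worth.

```python
from collections import defaultdict


def player_contributions(actions):
    """Sum the change in NSxG over each player's on-ball actions.

    `actions` is an iterable of (player, nsxg_before, nsxg_after) records,
    a hypothetical format used only for this sketch.
    """
    totals = defaultdict(float)
    for player, before, after in actions:
        totals[player] += after - before
    return dict(totals)


# Made-up possession chains.
actions = [
    ("Midfielder", 0.01, 0.02),   # small gain
    ("Midfielder", 0.02, 0.03),   # another small gain
    ("Forward", 0.03, 0.10),      # large gain near goal
    ("Defender", 0.01, -0.02),    # turnover: opponent now has 0.02
]
ratings = player_contributions(actions)
print(ratings)
```

The midfielder's repeated small gains show up here even though none of the actions is a goal attempt.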

7)    Example
12th August 2017 
xG Brighton 0.67 Manchester City 2.24

NSxG for all possessions, including ones leading to own goals.
NSxG Brighton 0.79 Manchester City 1.97

A dominant performance from Manchester City to open their title-winning 2017/18 season. There was only a 13% chance that they lose the game based on possession chains.

Kevin De Bruyne was the most influential player in the match.