Friday, 28 May 2021

What is Goal Expectation?

Let's say you want to make an informed estimation about the upcoming England vs Scotland game at Wembley Stadium in Euro 2020 (2021).

One route would involve estimating the average number of goals England are likely to score against Scotland at Wembley and the average number of goals Scotland would score against England at the same venue.

You could then take a mathematical route to calculate the probability that two sides with these average goal expectation estimates would produce a home win, an away win or a draw.

Typically a Poisson approach.

The average number of goals expected to be scored or allowed by a side in a future game has, for over 30 years, been referred to as their goal expectation.

Unfortunately, a more recent and widely discussed metric, based on the chance quality of a scoring opportunity, has arrived on the scene and taken the very similar name of expected goals.

They are not the same.

The former, GOAL EXPECTATION, is a measure of the likelihood of success for a side prior to kick off, based on historical data that is used to quantify the difference in quality between the sides. (It may even use historical expected goals data).

The latter, EXPECTED GOALS, is a value ascribed to the quality of attempts on goal, after the fact, based on the characteristics (shot type, location, etc.) of each attempt.

The goal expectation of England and Scotland in the upcoming game is around 2.12 goals and 0.48 goals, respectively.
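
As a rough sketch of the Poisson route mentioned earlier, the two goal expectations can be turned into home win, draw and away win probabilities. This assumes each side's goals follow an independent Poisson distribution, which is a common simplification rather than anything specific to this post; the function names and the 15 goal cut off are illustrative choices.

```python
from math import exp, factorial


def poisson_pmf(k, mean):
    """Probability of exactly k goals for a side averaging `mean` goals."""
    return exp(-mean) * mean ** k / factorial(k)


def match_probabilities(home_mean, away_mean, max_goals=15):
    """Sum scoreline probabilities into home win, draw and away win.

    Assumes the two sides score independently (a simplification).
    Scorelines above max_goals are ignored; their probability is tiny.
    """
    home_win = draw = away_win = 0.0
    for h in range(max_goals + 1):
        for a in range(max_goals + 1):
            p = poisson_pmf(h, home_mean) * poisson_pmf(a, away_mean)
            if h > a:
                home_win += p
            elif h == a:
                draw += p
            else:
                away_win += p
    return home_win, draw, away_win


# Goal expectations quoted in the post for England and Scotland.
england, drawn, scotland = match_probabilities(2.12, 0.48)
print(f"England {england:.3f}, draw {drawn:.3f}, Scotland {scotland:.3f}")
```

The same function works for any pair of goal expectations, so the quality of the output rests entirely on the quality of the two estimates fed in.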

The expected goals for the game haven't yet materialised.

Friday, 12 March 2021

XG as Easy as 1,2,3

One of the more interesting variants in the expected goals evolutionary backwater broke the scoring process down into stages. Most models go directly from shot location to goal/no goal output, but it is possible to include each of the possible outcomes.

A goal needs to jump through a variety of hoops to register (VAR excluded).

Shots can be blocked, they can miss the target, they can hit the woodwork or they can be saved before they enter the record books, and each of these possibilities can be modelled separately.

This route isn’t inherently better than a single stage model, but it does help to throw a more descriptive, if not necessarily predictive light onto why and how a player is excelling or failing to convert location based chance quality into outcome based success.

It has been useful in trying to unpick the Brighton conundrum.

Part of Brighton’s underperformance is that shots taken by their players have been blocked more often than an “expected blocks” model predicts. This is reinforced by the distance between blocker and Brighton shooter being the lowest in the league: they are getting closed down more tightly than any other team.

This may suggest that a slow and labored build up is degrading Brighton’s xG chances beyond what may be picked up by a one stop, rather than multi-layered, xG model. Attacking tweaks, rather than patiently waiting for regression to kick in, may be needed.

The next stage in the progression from shot to potential goal involves getting the ball on target.

One of the first xG think pieces I wrote for the now defunct OptaPro blog suggested that getting the ball on target wasn’t quite as straightforward a metric as it first appeared. In short, getting lots of shots on target wasn’t always the sign of an above average striker.

Robin van Persie, then of Manchester United, was the guinea pig, and his rather less than impressive rate of working the keeper with on target attempts didn’t seem to hurt his scoring performance.

The solution I suggested was that some players who aimed for more difficult to save areas of the goal, top corner, for example, might miss more frequently than players who prioritized target hitting at the expense of save difficulty.

In short, strikers shouldn’t be afraid to miss the goal.

So, we’ve run through two of the three xG stages.

Don’t get your shot blocked (that seems a universal aim; there seems limited benefit in taking the ball so close to a blocking defender that the chances of having the shot blocked increase greatly).

Hit the target. A more ambiguous ambition. Most strikers could hit the target most of the time, but doing so might compromise how difficult their goal bound attempt is to save.

The final stage is more akin to the traditional, one step model, but instead attempts that successfully negotiate the initial two stages are modelled against out of sample goal/no goal outcomes.

We’ve now got a multi-step xG model (that didn’t catch on from 2014), that adds tons of missing context that can be used to explain the “how” of why a player is returning the outcome from a location based process, even if it still falls to good old random variation to explain away much of the future performance levels.
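
A minimal sketch of how the three stages might be chained together is shown here. It is not the original model; it assumes each stage returns a conditional probability (with woodwork strikes folded into "off target" for simplicity), and every number in it is hypothetical, made up purely for illustration.

```python
def staged_xg(p_unblocked, p_on_target, p_goal):
    """Chain the three conditional stages into one goal probability.

    p_unblocked: chance the shot is not blocked
    p_on_target: chance it is on target, given it was not blocked
    p_goal:      chance it scores, given it was on target
    """
    return p_unblocked * p_on_target * p_goal


def stage_ratios(actual, expected):
    """Actual / expected count per stage; above 1.0 is over performance."""
    return {stage: actual[stage] / expected[stage] for stage in expected}


# Hypothetical stage probabilities for a single shot.
xg = staged_xg(0.75, 0.40, 0.30)
print(f"multi-stage xG for this shot: {xg:.3f}")

# Hypothetical season totals for one player: model expectation vs reality.
expected = {"unblocked": 60.0, "on_target": 25.0, "goals": 8.0}
actual = {"unblocked": 60, "on_target": 19, "goals": 5}
ratios = stage_ratios(actual, expected)
for stage, ratio in ratios.items():
    print(f"{stage}: {ratio:.2f}")
```

Reading the ratios stage by stage is what gives the descriptive "how": this invented player avoids blocks at the expected rate but loses ground at the on target stage.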

Some factors affecting xG output may be systematic to teams or players (randomness is still the major player?) and by breaking the process down stage by stage, you can perhaps shine a light onto these additional factors.

Finally, here’s how over and under performers, with at least 10 regular play goals from shots only, have maneuvered their way through the three stages of xG since 2016/17.

The table above includes diverse shooting profiles, which may be useful as a descriptor or potentially as a coaching aid if the multi-stage xG model can pick up systematic flaws or talents that persist.

Jimenez avoids blocks at a league average, but then misses the target wantonly and his overall scoring from regular play with his boot falls way below the average expectation.

Grealish has more shots blocked than expected, misses the target more frequently, but runs a large over performance for goals scored. Placement is the likely culprit, here.

Whereas, Wood avoids blocks, hits the target, but tamely refuses to accumulate above average goal tallies.

It’s time to take data to the video booth.

Thursday, 24 December 2020

Stoke and the Art of Crossing

Stoke Highlight the Art of Crossing.

Two Stoke City games, two headers, two goals and a duo of 1-0 wins not only demonstrate the fine lines that can separate six points from two in a low scoring sport, such as football, but also the important role still played by crosses in the modern game.

Lavishly assembled squads may partly spurn crossing as a primary route to goal in favour of more intricate, possession based passing sequences to create space before the final delivery, but even the likes of Arsenal when faced with the need for a goal do fall back on the traditional cross.

33 crosses yielded a single goal in a recent 2-1 home defeat for Arteta’s side against Wolves and infamously, Manchester United attempted over 80 crosses in a drawn game with Fulham in the last days of David Moyes’ reign.

Crossing, as a primary strategy, reached a low point with Liverpool’s 2011/12 team, consisting of a big target man, Andy Carroll, and a host of players ready to deliver a cross, led by Stewart Downing.

Unfortunately, such a predictable game plan, and a tendency to cross the ball early from less advanced field positions, resulted in a failed experiment. An average of 21 Liverpool crosses per game was rewarded with just four Premier League goals.

Present day Liverpool lead the analytics revolution, but their failed, decade old legacy helped to kick start that revolution, as data was used to explain why their cross heavy approach failed and where the lesson lay for teams to maximize the returns from a wide player’s staple delivery.

Crosses in general are inefficient.

Leagues vary, but as a baseline number, it takes upwards of 90 crosses to score a goal directly from the delivery. Secondary chances created after the initial header or shot, but during the same phase of play, improve the strike rate to around one goal every 50 crossed balls.

However, not all crosses are equal. The danger is more apparent if a side works a delivery from the byline compared to a last-minute desperation hoof from deep into the mixer.

Fortunately, data can differentiate between types of crosses. Whether the ball was chipped or driven on the ground, for example. But where crosses originate and where they are aimed provides the biggest insight into how to turn a cross into a winning formula.

You can divide the origin and intended destination of a cross into two broad categories depending on how effective they are at producing goals.

In the graphic below, prime areas are shown in red and the least effective in blue.

Blue wasteful target areas are intuitive.

If the ball is aimed too close to the goal line, it becomes prey to a dominant keeper. But place the cross too close to the edge of the box and any shot or header will be taken from distance, and for every yard a striker moves away from the goal, the likelihood of a goal falls by ten percent.
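
To put the "ten percent per yard" rule of thumb into numbers, here is a small sketch. It assumes the drop is relative and compounds with each extra yard, which is one reading of the figure above; the function name is illustrative.

```python
def relative_likelihood(extra_yards, drop_per_yard=0.10):
    """Scoring likelihood relative to the starting position.

    Assumes a compounding 10% relative drop for each extra yard
    from goal, which is an interpretation of the rule of thumb.
    """
    return (1.0 - drop_per_yard) ** extra_yards


for yards in (0, 2, 4, 6):
    print(f"{yards} extra yards: {relative_likelihood(yards):.2f} of the original chance")
```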

The red sweet spot is between these two areas.

The touchline hugging, wasteful blue delivery areas give both the keeper and defenders time to defend the box, whereas moving infield to deliver the cross reduces defensive reaction time and greatly improves conversion rates.

Hitting a ball from a wide and deep wing position to the wasteful area of the six-yard box, going from one blue zone to another, only produces a goal every 500 attempts. Whereas a delivery from a red, prime infield area to a red, prime area of the box increases conversion rates to around one goal every 20 crosses.
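
The gap between these conversion rates is easier to appreciate over a season. The sketch below applies the rounded rates quoted in this post to Liverpool's 2011/12 crossing volume, assuming 21 crosses in each of the 38 league games; the variable and function names are illustrative.

```python
# Crosses needed per goal, as quoted in the post (rounded baseline figures).
CROSSES_PER_GOAL = {
    "direct from delivery": 90,
    "including secondary chances": 50,
    "blue zone to blue zone": 500,
    "red zone to red zone": 20,
}


def goals_from_crosses(n_crosses, crosses_per_goal):
    """Goals implied if every cross converted at the quoted rate."""
    return n_crosses / crosses_per_goal


# Liverpool 2011/12: about 21 crosses per game over a 38 game league season.
season_crosses = 21 * 38
actual_crosses_per_goal = season_crosses / 4  # four goals, per the post

for label, rate in CROSSES_PER_GOAL.items():
    goals = goals_from_crosses(season_crosses, rate)
    print(f"{label}: {goals:.1f} goals from {season_crosses} crosses")
print(f"Liverpool's actual rate: one goal every {actual_crosses_per_goal:.0f} crosses")
```

On these figures Liverpool's return sits well below the general baseline, though still above the worst case blue to blue rate.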

Stoke City’s two winning goals against Wycombe and Middlesbrough have been added to the graphic and hit the sweet spot for both Fox & McClean’s delivery and Collins & Powell’s headed goals. They were assists that were drawn from the most productive area of the crossing playbook.

Of course, there’s much more than “crossing by the numbers” to a successful outcome.

Powell is an accomplished header of the ball. During his Championship career over 20% of his goal attempts have been from headers and he is adept at getting on the end of higher quality attempts than the league average. Whilst Collins’ physical attributes are obvious.

Campbell then crossed from one prime area to another for Cardiff to obligingly smack the ball into their own net, before he departed on a season long, injury induced hiatus. Fox hit the prime red zone with a pacy cross to defeat Blackburn, and Brown repeated the prime to prime connection to set up Thompson to briefly draw level with Spurs in the Carabao Cup 1/4 final.

Clever off the ball running also contributes, as seen by Vokes drawing away Wycombe defenders with his near post run & Stoke creating an overload of far post attackers for the goal against Middlesbrough.

Over recent games, Stoke City had the crossing basics in place and good things followed.

On the weekend when Stoke climbed into the playoff spots on the back of two smartly executed crosses, Arsenal in the North London derby were again trusting more to luck by throwing in another 44 crosses in the vain pursuit of a goal.