Liverpool’s
bilingual mastermind behind the team’s meteoric rise to dominate club, domestic,
European and now world football is gradually gaining a higher media profile.
Not Jurgen Klopp, although he has played a part in the Reds’ success, but
Dr Ian Graham, their current director of research.
Ian’s recent appearances in both the spoken and written media have not only
highlighted the importance of an integrated approach to squad building, one
that uses data-driven analysis alongside more traditional methods, but have
also given a small glimpse into the analytical methods employed.
The latest
profile landed courtesy of Liverpool.com and described some fundamentals of
Liverpool’s analytical philosophy.
One in particular resonated with Infogol’s approach of quantifying every
footballing action in the same currency of goals or, more specifically,
expected goals (xG).
The idea that every action, be it a pass, tackle or long throw, changes the
likelihood that a side will ultimately score isn’t a new concept.
It was probably first introduced into the public analytical domain by Dan
Altman in his whistle-stop OptaPro presentation in 2015, and hints of such
models have recently been emerging from Opta itself and Twelve football.
Such a
non-shot xG model also powers Infogol’s “Team of the Week”.
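The underlying idea, that each action shifts a side’s chance of eventually scoring, can be sketched in a few lines. This is only a toy illustration and not Infogol’s or Liverpool’s model: the zone names, the probabilities in `ZONE_VALUE` and the function `action_value` are all made up for the example.

```python
# Hypothetical chance that a possession in each zone ultimately ends in a goal.
# These numbers are invented for illustration, not taken from any real model.
ZONE_VALUE = {
    "own third": 0.005,
    "middle third": 0.015,
    "final third": 0.040,
    "penalty area": 0.120,
}


def action_value(start_zone, end_zone):
    """Value an action as the change in scoring likelihood it produces."""
    return ZONE_VALUE[end_zone] - ZONE_VALUE[start_zone]


# A pass from midfield into the final third adds value...
forward_pass = action_value("middle third", "final third")
# ...while a pass back towards a side's own goal gives some of it up.
back_pass = action_value("middle third", "own third")
```

Every pass, carry or tackle gets a value in the same units, whether or not the move ends in a shot.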
The gradual migration, at least inside the industry, from a purely
chance-based evaluation to a more holistic one somewhat mirrors the earlier
transition from merely counting shots, as exemplified by total shot ratio (TSR)
from 2008, to a more informative, location-based xG model.
However, creating such non-shot models that quantify every on-field action is
not a simple task. The granular data required to build non-shot models dwarfs
what was needed to create TSR, which was itself rudimentary compared to the
data required to create a proficient xG model.
These leaps in data-driven evaluation present a dilemma for the aspirations of
public and hobbyist analysts, an area that provided much of the driving force
behind the early explosion in football analytics.
Latterly, monetization of ideas and a larger appetite for quantitative metrics
to supplement opinion-driven insight in the media and clubs have swept many of
those same hobbyists behind a non-disclosure paywall.
Less co-operation, dwindling numbers, limited availability of adequate data and
the need for diverse technical skills to process that raw data appear to have
stifled the growth of football metrics in the purely public arena.
At the risk
of falling victim to one of Twitter’s sloganized insults, “back in the day,
metrics didn’t last long before they were improved upon or supplanted
altogether”.
Liverpool.com suggested that Ian’s weapons-grade model might be broadly
replicated by current, readily available and much-quoted metrics, such as xG
Chain (I’ll let you Google the definition).
Succinctly,
the metric rewards every participant in a move that ends in a goal attempt with
that chance’s entire xG.
The distribution of goodies can seem churlish. For example, it gives far less
individual credit to the three Middlesbrough players who swept nearly the
length of Stoke’s defensive transition to score a low-probability winner on
Friday night than it would to a marginally involved square ball en route to a
multiple passing move that ends with a tap-in from six yards.
More crucially, it completely omits actions that aren’t concluded by a created
chance.
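A minimal sketch of that allocation rule shows both quirks. It assumes each possession has already been reduced to a list of the players involved and the xG of the shot that ended it (`None` if no shot was taken). The function name `xg_chain`, the data layout and the figures are hypothetical, chosen only for illustration.

```python
from collections import defaultdict


def xg_chain(possessions):
    """Credit every player involved in a shot-ending possession with the
    full xG of that shot. Possessions without a shot earn nothing."""
    totals = defaultdict(float)
    for players, shot_xg in possessions:
        if shot_xg is None:
            continue  # no chance created, so every action here is ignored
        for player in set(players):  # one credit per player per possession
            totals[player] += shot_xg
    return dict(totals)


# Made-up possessions: (players involved, xG of the final shot or None).
possessions = [
    (["A", "B", "C"], 0.05),  # long counter, low-probability shot
    (["A", "D", "A"], 0.60),  # square ball, then a tap-in
    (["B", "C", "D"], None),  # good progression, but no shot at the end
]
chain = xg_chain(possessions)
```

Here player D collects 0.60 for a single touch before the tap-in, B and C get 0.05 each for the counter, and nobody receives anything for the third move.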
To test
Liverpool.com’s optimism, I compared Infogol’s non-shot ball progression via
passes and carries to the much-touted gold standard of xG Chain.
To avoid
confusion over units, I’ve simply ranked the xG Chain and the non-shot ball
progression for each player in the recent Merseyside derby and then compared a
player’s rank in one metric with his rank in the other.
Shaqiri
ranked an impressive 2nd overall in ball progression, but a lowly 16th
in xG Chain, whereas Origi rates highly by the latter, but much less so in the
former.
Overall, a third of the players have double-digit differences between their
rankings in the two metrics. There are some agreements, but the relationship
between the two metrics is generally weak.
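For anyone wanting to repeat this kind of rank comparison on their own numbers, here is a rough sketch. It assumes two dictionaries mapping each player to a metric value, with no tied values, and summarises agreement with Spearman’s rank correlation, which is my choice of summary here rather than something taken from the comparison above. The player labels and values are placeholders, not the derby figures.

```python
def rank_descending(values):
    """Map each player to a rank, 1 = highest value (assumes no ties)."""
    ordered = sorted(values, key=values.get, reverse=True)
    return {player: pos for pos, player in enumerate(ordered, start=1)}


def compare_ranks(metric_a, metric_b):
    """Return per-player rank gaps and Spearman's rho for two metrics."""
    ranks_a = rank_descending(metric_a)
    ranks_b = rank_descending(metric_b)
    gaps = {p: ranks_a[p] - ranks_b[p] for p in ranks_a}
    n = len(gaps)
    # Spearman's formula for untied ranks: 1 - 6 * sum(d^2) / (n * (n^2 - 1))
    rho = 1 - 6 * sum(d * d for d in gaps.values()) / (n * (n * n - 1))
    return gaps, rho


# Placeholder values for five fictional players.
progression = {"P1": 0.9, "P2": 0.7, "P3": 0.5, "P4": 0.3, "P5": 0.1}
chain = {"P1": 0.2, "P2": 0.8, "P3": 0.6, "P4": 0.1, "P5": 0.4}
gaps, rho = compare_ranks(progression, chain)
```

A rho near 1 would mean the two metrics order players almost identically, while a value near 0 points to the kind of weak relationship described above.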
Extend the
study to every game played last season and this tenuous correlation between the
two metrics remains.
One of the strengths of the early analytics movement was the ability to sift
mere statistical trivia (team Y has recorded X when player Z plays immediately
springs to mind) from useful, if imperfect, evaluations that convey insight and
can be used to both evaluate and project future performance.
A great example of the latter is Dan Kennett’s recent Alisson tweet, which used
big chances to highlight the keeper’s importance to Liverpool, both in the past
and possibly in the future.
Save rates when faced with Opta’s Big Chances can be framed to be a very good
proxy for a more exhaustive and granular, post-shot xG2 modelling of a keeper’s
saves and goals allowed.
Dan’s tweet
was selective, but also carefully constructed enough to capture the keeper’s
core attributes. Current retweets are approaching around 10 billion!
That should be the benchmark for widely used metrics, and player contribution
figures such as xG Chain fail that test on numerous counts.
It fails to differentiate individual contribution, omits large swaths of
creditable actions and thus fails to correlate well with more exhaustive
modelling of a similar player process.
The challenge for the public arena as we enter the roaring 20s is to come up
with constant improvements to substandard and potentially misleading
measures... and be more like Dan.