Monday, 12 August 2013

Towards A Better With or Without You.

In the previous post, I looked at the growing trend of attempting to evaluate the impact of a single player by comparing game results when he takes part in a match with those when he is wholly absent, whether through injury, non-selection or suspension. Superficially the methodology appears sound, if rather crude. However, on closer inspection the pitfalls are both numerous and largely insurmountable.

By taking such an approach we are trying to demonstrate how good a single player is by comparing team performance in games where he plays with games where he is absent but replaced by another (not necessarily the same) player. Those games may be played against widely diverse opponents and alongside a similar, but often varied, collection of colleagues. And the difference is measured in that rarest of footballing commodities, namely goals. Often the exercise runs over a single season, a period that is frequently too short to demonstrate accurately even such universals as the home field advantage enjoyed by a side.

The twin terrors of opponent strength and small sample size can be illustrated with a contrived, extreme example. Universal problems that occur to varying degrees in reality can often be highlighted by an unlikely, but possible, scenario.

A side wins the league in a canter, winning every match bar one, where an early red card led to a narrow defeat. For the final match, the ever-present star is rested and watches from the stands as his team becomes the early beneficiary of a red card decision and runs out easy 6-0 winners.

A raw with/without percentage comparison will give his side a better win rate, success rate or goal difference in that final game than in the previous 37 matches when he was a confirmed starter. A yardstick derived from one match is (hopefully) obviously inadequate, but similar problems arise when drawing conclusions even from 10 or 15 game samples.
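
To put rough numbers on the contrived example, here is a minimal sketch. It assumes a 38 game season, so 36 wins from 37 starts with the star and one win from one game without him. The Wilson score interval is my own choice of yardstick for how much a win rate can be trusted at a given sample size, and the function name is purely illustrative.

```python
import math

def wilson_interval(wins, games, z=1.96):
    """Approximate 95% Wilson score interval for a win rate."""
    p = wins / games
    denom = 1 + z**2 / games
    centre = (p + z**2 / (2 * games)) / denom
    half = z * math.sqrt(p * (1 - p) / games + z**2 / (4 * games**2)) / denom
    # Clamp to [0, 1] to guard against floating point overshoot.
    return max(0.0, centre - half), min(1.0, centre + half)

# Assumed records from the contrived season: (wins, games).
with_star = (36, 37)     # every match won bar one, star in the side
without_star = (1, 1)    # the single 6-0 win with the star rested

with_rate = with_star[0] / with_star[1]
without_rate = without_star[0] / without_star[1]
with_low, with_high = wilson_interval(*with_star)
without_low, without_high = wilson_interval(*without_star)

print(f"With star:    {with_rate:.1%} (interval {with_low:.1%} to {with_high:.1%})")
print(f"Without star: {without_rate:.1%} (interval {without_low:.1%} to {without_high:.1%})")
```

The raw rates favour the side without its best player, but the interval around the single game sample is so wide that it comfortably overlaps the 37 game one. The same calculation can be rerun with 10 or 15 games to see how slowly that interval narrows.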

The fundamental idea is fine, but the current application of the method is awash with noise, leading to few worthwhile conclusions.

Spurious, title related, pop image.
To demonstrate how this approach may be improved, it may pay to look to the NFL, where parity is slightly more keenly experienced than in the EPL and player contribution to useful, game-defining events is more readily apparent because of the play-by-play nature of the contest.

An NFL offense takes part in around 60 individual plays per game over the 16 match regular season. None of the 11 players on the field during each play is a passive spectator, as can sometimes be the case in football. Every NFL player has a designated contribution to make on a single play, even if it is merely diversionary and takes place far from the area targeted by the ball. Furthermore, the success of each play can be quantified, most readily, if slightly unsatisfactorily, in terms of raw yardage gained or lost or, alternatively, in terms of first downs achieved.

So by looking at the NFL, the concerns around quality of opposition and small sample size are partly addressed. Further, by looking at unique lineups, rather than plays that took place with or without certain players, we can avoid the problem whereby the remaining 10 players may also change identity. And finally, if we compare the least used lineups against the most used lineup, we can begin to see the likely range of the difference in performance at a team level. This may give a ballpark figure, when scaled down to individual player levels, of the kind of differences we are (naively) attempting to quantify in a sport with similar levels of professionalism, such as soccer.
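
The lineup-based comparison described above can be sketched roughly as follows. The play records are invented toy data, with three-man "lineups" used only to keep them readable, and the function and field names are hypothetical rather than taken from any real play-by-play feed.

```python
from collections import defaultdict

# Hypothetical play-by-play records: (players on the field, yards gained, first down?).
plays = [
    (("A", "B", "C"), 7, True),
    (("C", "A", "B"), 3, False),   # same unit as above, listed in a different order
    (("A", "B", "C"), 12, True),
    (("A", "B", "D"), 2, False),
    (("A", "E", "C"), -1, False),
    (("A", "B", "C"), 4, False),
]

def lineup_shares(plays):
    """Group plays by unique lineup and return each lineup's share of team totals."""
    totals = defaultdict(lambda: {"plays": 0, "yards": 0, "first_downs": 0})
    for players, yards, first_down in plays:
        key = frozenset(players)   # order of listing does not matter
        totals[key]["plays"] += 1
        totals[key]["yards"] += yards
        totals[key]["first_downs"] += int(first_down)

    n_plays = len(plays)
    team_yards = sum(yards for _, yards, _ in plays)
    team_first_downs = sum(int(fd) for _, _, fd in plays)
    # Assumes the team totals are non-zero, which holds for the toy data.
    return {
        key: {
            "plays": t["plays"],
            "plays_pct": 100 * t["plays"] / n_plays,
            "yards_pct": 100 * t["yards"] / team_yards,
            "first_downs_pct": 100 * t["first_downs"] / team_first_downs,
        }
        for key, t in totals.items()
    }

shares = lineup_shares(plays)
most_used = max(shares, key=lambda k: shares[k]["plays"])
print(sorted(most_used), shares[most_used])
```

Keying on the full set of players means a play only counts towards a lineup when all of its members are on the field together, which is what keeps the other ten identities fixed.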

More frequent use of the same 11 players on offensive plays appears to be quality driven. The fewer man games a side lost to injury, the more it used the same players on a single play and, on average, the more successful it was measured against its recent achievements. So it is not too big a leap to suggest that the unique lineups that occurred most often are likely to represent the near cream of a side's playing staff that year, and that much less used combinations are more likely to contain lesser players. Visually this also appears to be the case.

Average Performances of Most and Least Used Lineups in the NFL.

Offensive Lineup    % of Total Plays Involving Lineup    % of Total Yards Gained    % of First Downs Gained
Most Used           6.3                                  7.0                        6.9
Least Used          10.8                                 9.7                        10.0

(The most used lineups took part in 2,100 plays and the least used in 3,800 plays over the 2012 season.)

Above I've averaged the outcomes of plays made by the least and most used lineups for every NFL team last season. Success has been quantified both in yardage and first down terms and the percentage of plays each composite unit was involved in is included to give context to the opportunities given to each group.

Fuller context is missing, but the broad picture hopefully matches reality. The most used, presumably more star-studded, offensive lineups produce slightly more of a side's overall yardage and first downs than their share of the plays would suggest. The least used lineups took 3,800 snaps, or 10.8% of the total experienced by NFL offenses in 2012, and produced slightly below that percentage of first downs and raw yardage.
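
One simple way to read the table is to divide each group's share of output by its share of plays, so that 1.0 means a group produced exactly what its workload would predict. The sketch below uses only the percentages from the table above; the ratio itself is my own illustrative yardstick.

```python
# Percentages copied from the table above (2012 season).
lineups = {
    "Most Used": {"plays": 6.3, "yards": 7.0, "first_downs": 6.9},
    "Least Used": {"plays": 10.8, "yards": 9.7, "first_downs": 10.0},
}

def output_per_play_share(row):
    """Share of output divided by share of plays; above 1.0 beats the workload."""
    return {
        "yards": row["yards"] / row["plays"],
        "first_downs": row["first_downs"] / row["plays"],
    }

ratios = {name: output_per_play_share(row) for name, row in lineups.items()}

for name, r in ratios.items():
    print(f"{name}: yards {r['yards']:.2f}, first downs {r['first_downs']:.2f}")
```

The most used lineups come out a little above 1.0 on both measures and the least used a little below, which is the small gap the paragraph above describes.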

Arguably, a single season of matches for every NFL side, using nearly 6,000 individually quantified on-field plays, has managed to show a (small) difference in performance levels between what may generally be described as the generic "best" 11-man lineup and the less favoured ones.

In hindsight, attempting the same for individual soccer players, from individual teams, using just 38 total trials, decided by rare events, should possibly be considered a tad optimistic. The quality of players in the EPL is undoubtedly high, but the difference in quality between interchangeable colleagues in the same side is likely to be very small. Expecting this difference to show itself in match results over a relatively limited timescale, especially with the attendant, unaddressed baggage, is unrealistic.
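
A quick simulation gives a feel for how easily a small, genuine difference can vanish inside 38 matches. Everything here is assumed for illustration: goals are drawn from Poisson distributions, the side scores 1.6 goals a game with the star and 1.5 without, concedes 1.2 either way, and the star starts 28 games and misses 10. None of these figures comes from real data.

```python
import math
import random

def poisson(lam, rng):
    """Draw one Poisson-distributed goal count (Knuth's multiplication method)."""
    limit = math.exp(-lam)
    k, p = 0, 1.0
    while True:
        p *= rng.random()
        if p <= limit:
            return k
        k += 1

def mean_goal_difference(n_games, scoring_rate, conceding_rate, rng):
    """Average goal difference per game over n_games simulated matches."""
    total = sum(
        poisson(scoring_rate, rng) - poisson(conceding_rate, rng)
        for _ in range(n_games)
    )
    return total / n_games

def share_of_misleading_seasons(n_seasons=2000, seed=1):
    """Fraction of seasons in which the side looks better WITHOUT its star."""
    rng = random.Random(seed)
    misleading = 0
    for _ in range(n_seasons):
        with_star = mean_goal_difference(28, 1.6, 1.2, rng)      # assumed rates
        without_star = mean_goal_difference(10, 1.5, 1.2, rng)   # assumed rates
        if without_star > with_star:
            misleading += 1
    return misleading / n_seasons

share = share_of_misleading_seasons()
print(f"Seasons where the side looked better without the star: {share:.0%}")
```

By a rough normal approximation, these assumed rates should leave the side looking better without its star in something like four seasons out of ten, despite him being genuinely worth a tenth of a goal a game.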

At the very least we should be looking at more frequently seen individual or team events, such as successful passes or chances created, rather than placing so much faith in match outcomes, and we should accept that we are trying to measure differences that, in individuals at least, are very likely to be swamped by noise.
