Tuesday, 6 August 2013

Predicting The Rare From The Commonplace In Football.

Each sport has an event or series of events that has a disproportionate effect upon the outcome of a match. In soccer, the ability to create chances is an obviously vital factor in determining the match result. Chances are the precursors to goals, and goals are the ultimate arbiter of who ends up successful, defeated or stalemated.

The ability to score goals, as Blackpool most recently demonstrated, fulfills only half of the requirement to be successful. Defence is also important: the Seasiders' 55 goals equaled the tally set by Tottenham, but they finished 15 places below Spurs because of the 78 goals they conceded. The general case is still fairly strong, the more goals a team scores, the higher up the league it tends to finish, but for the complete picture we also have to look at defensive performance.

Past performance is often an indicator of future achievement. Managers and players invariably come and go, bringing changing skills and tactical development, but a sizable rump of the previous team often remains and previous performance levels still explain at least part of what we may see in the future.

In the previous post I looked at the balancing act between directly comparing significant events, with the limitations that brings, and gathering more copious amounts of data by moving a stage or two back in the process or by incorporating more numerous actions that require similar skills to the key acts we wish to project. Relatively rare goal scoring may be better predicted by examining the more frequent assists from which goals originate, and assists themselves may also have a more accurate predictive ancestor.

The ball-controlling nature of the NFL makes it a much better testing ground for the use of more numerous, secondary events as predictors of rare but important game-changing events. The number of possessions is almost always shared equally between sides in the NFL, so how effectively a side uses its possessions is a decisive factor in determining the result.

Turnovers, whereby one offense hands the ball over to the other side's defense (and hence on to the opposing offense) without scoring, are extremely difficult to overcome in a single match. Drives are time consuming, and teams can ill afford to pass up scoring opportunities through their own carelessness and hand an "extra" one to an opponent.

Interceptions are the most straightforward of turnovers. The quarterback throws to an intended receiver, but a safety or cornerback, most usually, intervenes and catches the ball instead. Possession lost, often along with the game if the process is repeated and the side ends the match with a negative turnover differential.

Interceptions in the NFL are rarer than goals are in soccer. A typical side picks off about 16 errant throws a season, an average of one a game, split between game-chasing desperation throws and game-changing miscues. But their effect on the result is often so profound that it is extremely useful to have as accurate an estimate of future performance as possible.

The tried and trusted route of relying on previous performance to predict future outcomes doesn't provide a strong relationship. We've already noted that teams change from year to year and, coupled with the small sample size, this means that interception numbers provide scant clues to future intercepting potential. So can we improve the strength of the predictive relationship if we instead look to a precursor to interceptions that requires similar skills and, crucially, occurs in much greater numbers?

A thrown pass can end in one of four ways: the receiver makes the play, the ball falls incomplete, the ball is intercepted, or a defender gets close enough to the intended receiver to knock the ball away. The last of these, a so-called defensed pass, is a close relative of the full interception, and a side records around 90 such plays a season compared to just 16 interceptions.

Do Defensed Passes Better Predict Future Interceptions?

Defensive interceptions by team:

NFL Team         Actual 2011   Predicted 2011   Actual 2012
Arizona          10            18               22
Atlanta          19            17               20
Baltimore        15            22               13
Buffalo          20            15               12
Carolina         14            15               11
Chicago          20            14               24
Cincinnati       10            16               14
Cleveland        9             15               17
Dallas           15            13               7
Denver           9             13               16
Detroit          21            16               11
Green Bay        31            20               18
Houston          17            20               15
Indianapolis     8             11               12
Jacksonville     17            14               12
Kansas City      20            18               7
Miami            16            14               10
Minnesota        8             11               10
New England      23            14               20
New Orleans      9             19               15
NY Giants        20            18               21
NY Jets          19            15               11
Oakland          18            18               11
Philadelphia     15            14               8
Pittsburgh       11            16               10
San Diego        17            14               14
San Francisco    23            21               14
Seattle          22            17               18
St Louis         12            14               17
Tampa Bay        14            13               18
Tennessee        11            16               19
Washington       13            16               21

What we lose by no longer comparing like with like may be compensated for by a substantially larger pool of data that describes an associated skill. The relationship between defensed passes and interceptions is relatively strong and in the expected direction, so the two are likely to be the product of similar player talent. Therefore, using data from 2006 to 2010, I calculated the number of interceptions each side would have been expected to make in 2011 based on its number of defensed passes.

For example, Seattle's defense claimed 22 interceptions in the 2011 regular season, but based on the number of passes they managed to physically touch and disrupt during that season, the league average regression from 2006-2010 suggests that they probably overachieved by around 5 interceptions. A league average team managing Seattle's 105 defensed passes would more likely have grabbed just 17 picks. The full list for each team in 2011 is in the table above.
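The prediction step described here can be sketched as an ordinary least-squares line fitted to team seasons from 2006 to 2010 and then applied to a 2011 defensed-pass count. This is only a minimal sketch, not the regression actually used for the table: the function names `fit_line` and `predict_interceptions` are hypothetical, a straight line with an intercept is an assumption, and the 2006-2010 team totals are not reproduced in this post, so they would have to be supplied separately.

```python
from statistics import mean


def fit_line(defensed, interceptions):
    """Least-squares slope and intercept for interceptions ~ defensed passes."""
    x_bar, y_bar = mean(defensed), mean(interceptions)
    sxx = sum((x - x_bar) ** 2 for x in defensed)
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(defensed, interceptions))
    slope = sxy / sxx
    return slope, y_bar - slope * x_bar


def predict_interceptions(defensed_passes, slope, intercept):
    """Interceptions a league-average defense would expect from this many defensed passes."""
    return intercept + slope * defensed_passes


# Crude cross-check using only the league averages quoted in the post
# (about 16 interceptions per 90 defensed passes), i.e. a line through the origin.
naive_rate = 16 / 90
seattle_naive = naive_rate * 105  # roughly 18.7, a little above the 17 in the table
```

The simple rate lands somewhat above the regression figure of 17 for Seattle, which hints that the fitted line probably does not pass through the origin, although that is an inference from rounded averages rather than something stated in the post.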

The discrepancy between the actual figures and the figures predicted from defensed passes for 2011 can be explained in two ways. Either the sides that differed greatly from the prediction got lucky (in a good or bad way), or they were in reality better or worse than the league average. Much more likely, it was a combination of the two.

As with all recorded stats, the skill component tends to persist across time, but the random luck does not. Sides that persistently under- or over-perform may be exhibiting their true deviation from league average skill levels, but the temptation is to see any variation from the average as resulting entirely from different levels of real talent and to disregard random variation as even a partial cause.

For the purpose of this rudimentary trial, we can use a single season to see if the underlying talent needed to intercept a pass is better represented in the more numerous defensed passes than in relatively rare real interceptions. If there is more signal than noise in an average of 90 defensed passes than there is in 16 interceptions, this should show up in how well each set of 2011 figures predicts the 2012 figures for each team.
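One rough way to see why the larger count might carry more signal: if a season total behaved like a Poisson count (an assumption made purely for illustration, not something tested in this post), its chance variation relative to its mean would shrink as one over the square root of that mean. The helper name `relative_noise` is hypothetical.

```python
import math


def relative_noise(expected_count):
    """Standard deviation divided by mean for a Poisson count with this mean."""
    return math.sqrt(expected_count) / expected_count


interceptions_noise = relative_noise(16)  # 0.25, i.e. about 25% of the season total
defensed_noise = relative_noise(90)       # about 0.105, i.e. roughly 10.5%
```

Under that assumption, chance alone would move a 16-interception season proportionally more than twice as much as a 90-defensed-pass season.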

Formally, neither actual nor predicted 2011 interceptions are strongly related to defensive interception performance in 2012. Team churn and the rarity of the event almost guarantee that. However, the number of interceptions predicted from passes defensed in 2011 is closer to the actual 2012 figure in 22 of the 32 cases. One honourable tie leaves previous actual performance as the better indicator of year N+1 interception performance on just nine occasions.
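The head-to-head count can be reproduced directly from the table. The sketch below stores the table's figures as (actual 2011, predicted 2011, actual 2012) triples; the names `TEAMS` and `tally` are hypothetical, and absolute difference is assumed as the measure of "closer".

```python
# (actual 2011, predicted 2011, actual 2012), copied from the table in this post.
TEAMS = {
    "Arizona": (10, 18, 22), "Atlanta": (19, 17, 20),
    "Baltimore": (15, 22, 13), "Buffalo": (20, 15, 12),
    "Carolina": (14, 15, 11), "Chicago": (20, 14, 24),
    "Cincinnati": (10, 16, 14), "Cleveland": (9, 15, 17),
    "Dallas": (15, 13, 7), "Denver": (9, 13, 16),
    "Detroit": (21, 16, 11), "Green Bay": (31, 20, 18),
    "Houston": (17, 20, 15), "Indianapolis": (8, 11, 12),
    "Jacksonville": (17, 14, 12), "Kansas City": (20, 18, 7),
    "Miami": (16, 14, 10), "Minnesota": (8, 11, 10),
    "New England": (23, 14, 20), "New Orleans": (9, 19, 15),
    "NY Giants": (20, 18, 21), "NY Jets": (19, 15, 11),
    "Oakland": (18, 18, 11), "Philadelphia": (15, 14, 8),
    "Pittsburgh": (11, 16, 10), "San Diego": (17, 14, 14),
    "San Francisco": (23, 21, 14), "Seattle": (22, 17, 18),
    "St Louis": (12, 14, 17), "Tampa Bay": (14, 13, 18),
    "Tennessee": (11, 16, 19), "Washington": (13, 16, 21),
}


def tally(teams):
    """Count how often each 2011 figure lands closer to the 2012 total."""
    predicted_closer = actual_closer = ties = 0
    for actual_2011, predicted_2011, actual_2012 in teams.values():
        err_actual = abs(actual_2011 - actual_2012)
        err_predicted = abs(predicted_2011 - actual_2012)
        if err_predicted < err_actual:
            predicted_closer += 1
        elif err_actual < err_predicted:
            actual_closer += 1
        else:
            ties += 1
    return predicted_closer, actual_closer, ties


result = tally(TEAMS)  # (22, 9, 1)
```

The single tie is Oakland, where the actual and predicted 2011 figures were both 18.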

If you wanted to predict interceptions in 2012 by looking at interceptions in 2011, you were not looking in the best place.

A couple of new signings can partially change a side's likely stats.

The important events in a sport are often just special cases of a more general talent. A player who can pass accurately in the final third in the Eredivisie will likely also be able to provide a key pass that creates a chance for a colleague. Therefore final third passing ability, because of its relative frequency, may well be a much better indicator of a player's ability to create chances than the limited number of occasions on which he has actually done so over previous seasons.

Correlations will never approach the levels of certainty that visual evidence often tricks us into believing are possible, but exposing noise among the signal, and vice versa, is always a satisfying advance on the raw, often deceptive figures.
