Tuesday, 3 April 2012

Time of Possession in the EPL.

Time of Possession is fast becoming one of the hot topics in the world of football stats, and such is its current pre-eminence that it has become almost the number by which a team is judged. Barca of course are the darlings of ToP, and teams such as Arsenal and Swansea are praised for an accurate passing style that also boosts their time in charge of the ball. Mystification, along with a faint hint of disgruntlement, accompanies any example of a side that wins the ToP battle yet fails to win the game. Hopefully this post will make a convincing argument as to why you should expect a significant proportion of ToP-winning teams to actually lose the individual game, and why game-long ToP shouldn't even be considered a "real" stat.

Teams use possession for a variety of reasons, some defensive and some attacking in nature. The temptation to stereotype teams on the basis of one particular trait should be resisted, but it's hard not to see Barca and Swansea as polar opposites in possession terms. The former use possession and a host of world class attacking players to score goals, whereas The Jacks use it to prevent their opponents from scoring. Swansea held scoreless the first four EPL teams to visit The Liberty, picking up 8 points in the process, and almost 40% of their games have seen their goal remain intact. Tellingly, they have also failed to score themselves in a similar percentage of matches.

On average, most other teams use possession in a way that doesn't emphasise extremes of attacking or defensive philosophy, and in addition the current score can influence how teams accrue or cede the ball. A side holding a comfortable lead may add to its possession statistics by passing the ball around in less congested areas such as defence, while another may happily defend in depth and in numbers but in doing so concede possession. In short, the ways in which a team can try to increase or reduce its time on the ball are numerous, depend upon the game situation and often upon team tactics, and therein lies the fatal flaw of trying to interpret ToP.

It is a secondary statistic derived from a composite of many other primary stats that are fundamental to a team's success, often opposing in nature and quite often related to the talent pool of individual sides. Teams get possession by being good at making tackles or interceptions, they keep it by passing well, they use possession to create chances, they score goals by converting chances and they win games by scoring more goals than they allow. So we have a chain with match result at one end and time of possession at the other, but in between are statistics that are ability based, context dependent and often better indicators of match success.

ToP is an amalgam of other, more informative numbers, so why look at ToP when you can look at the components that go towards defining it?

If we use shots per game attempted and allowed as an example of a fundamental football statistic, we can show why this is a much more powerful measurement of a team's ability than time of possession, even though we can use both to come to very similar conclusions about a team's ability. It's reasonable to assume that for the vast majority of teams increased possession will lead to more goal attempts by that team and fewer from their opponents, and this appears to hold true over a 38 game season. As we can see, the differential between attempted shots and shots allowed becomes positive and increases in size as possession passes 50%.


























If we now chart a team's seasonal success rate (its number of wins plus half its number of draws, all divided by its number of games) against average shot differential over the season, we can see that the correlation is both strong and in the expected direction. Each data point represents one team's season, consisting of 38 games, or 31 for the present year.
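As a small sketch of the success rate used here, the calculation might look like this in Python. The function name and the sample record are made up purely for illustration and aren't taken from any real team.

```python
def success_rate(wins: int, draws: int, losses: int) -> float:
    """Wins plus half the draws, divided by games played."""
    games = wins + draws + losses
    if games == 0:
        raise ValueError("no games played")
    return (wins + 0.5 * draws) / games


# Hypothetical 38 game record, not a real team's season.
rate = success_rate(wins=12, draws=10, losses=16)
print(round(rate, 3))
```

A side that wins every game scores 1.0, one that draws every game scores 0.5 and one that loses every game scores 0.0.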



























Team possession plotted against success rate produces a similar graph, although the correlation is not quite as strong. The similarity isn't a surprise, as we have speculated that one of the aims of a team while in possession is likely to be shot creation for itself and shot prevention for the opposition.
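For anyone wanting to put a number on "not quite as strong", one way is to compare the r-squared of each stat against success rate. The sketch below does that with plain Python. The three short lists are invented placeholder values, not the data behind the graphs, and the function name is my own.

```python
def r_squared(xs, ys):
    """Square of the Pearson correlation between two equal-length lists."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need two lists of the same length, at least 2 points")
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    if sxx == 0 or syy == 0:
        raise ValueError("one of the lists has no variation")
    return (sxy * sxy) / (sxx * syy)


# Placeholder numbers only, one entry per hypothetical team season.
shot_differential = [-4.0, -2.0, 0.0, 1.5, 5.0]
possession_pct = [44.0, 49.0, 48.0, 53.0, 58.0]
season_success = [0.30, 0.42, 0.50, 0.55, 0.72]

print("shots vs success:", round(r_squared(shot_differential, season_success), 3))
print("ToP vs success:  ", round(r_squared(possession_pct, season_success), 3))
```

Because r-squared has no units, it gives a like for like comparison even though the two graphs have different x-axes.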


So if we now use both of these graphs to work through a real life example, we can demonstrate the different levels of information we can deduce from a derived stat such as ToP and a talent based one such as shots per game. In 2010-11 Newcastle's average ToP was virtually 50% and their shots per game differential was zero. So on average they kept the ball for as long as their opponents and on average they attempted as many shots as they allowed. From each of the above graphs we can therefore predict that their success rate for the season would have been 0.5. However, it was only 0.46.

Now, having concluded from the shots per game model and its historical results that Newcastle's actual results underestimated their probable ability, we can look back at their 2010-11 matches to see if they were short changed by short term luck during the season. We find that their success rate in the 20 games where they were outshot by their opponents was 0.37. They were outshot in those games by an average of 14 shots to 9, and based on the season long graph we can possibly say that they very slightly overperformed in those 20 matches. In the remaining 18 games Newcastle outshot their rivals by 14 shots to 8 and their success rate was 0.55. That is an improvement compared to when they lost the shots per game duel, but probably well below what an average team would expect with such a record. Indeed, if we look at individual games, Newcastle managed to lose five games where they outshot the other team, roughly a quarter of such contests.
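As a quick sanity check, the two halves of Newcastle's season should recombine to roughly the 0.46 quoted for the full 38 games. A minimal sketch, using only the figures given above (the function name is my own):

```python
def combined_success_rate(splits):
    """Weight each split's success rate by its number of games.

    splits is a list of (games, success_rate) pairs.
    """
    total_games = sum(games for games, _ in splits)
    if total_games == 0:
        raise ValueError("no games in any split")
    weighted = sum(games * rate for games, rate in splits)
    return weighted / total_games


# 20 games where Newcastle were outshot, 18 where they outshot the opposition.
newcastle_2010_11 = [(20, 0.37), (18, 0.55)]
overall = combined_success_rate(newcastle_2010_11)
print(round(overall, 3))
```

The small gap between this figure and 0.46 comes from the two split rates already being rounded to two decimal places.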

We can explain situations where a team has more shots but loses the game as follows. Teams will have a historical average rate at which they turn shots into shots on target and ultimately goals. On a particular matchday these rates will fluctuate around their historical means because one game represents a very small sample size, especially in a low scoring sport such as football. It's therefore to be expected that the vagaries of short term luck will, over a season's worth of games, see some teams being overly unfortunate in the number of games where they have more shots but end up scoring fewer goals. That may have contributed to Newcastle's lower than expected success rate based on their shots per game stats.
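To get a feel for how often the side with more shots can still lose, here is a rough sketch that treats every shot as an independent attempt with the same chance of becoming a goal. Both the 10% conversion rate and the independence of shots are simplifying assumptions of mine, not figures from the data, and the function names are hypothetical.

```python
from math import comb


def goal_distribution(shots, conversion):
    """Binomial probability of scoring 0..shots goals from a number of shots."""
    return [
        comb(shots, k) * conversion ** k * (1 - conversion) ** (shots - k)
        for k in range(shots + 1)
    ]


def outcome_probabilities(shots_for, shots_against, conversion):
    """Return (win, draw, loss) probabilities for the team taking shots_for."""
    ours = goal_distribution(shots_for, conversion)
    theirs = goal_distribution(shots_against, conversion)
    win = draw = loss = 0.0
    for goals_for, p_for in enumerate(ours):
        for goals_against, p_against in enumerate(theirs):
            p = p_for * p_against
            if goals_for > goals_against:
                win += p
            elif goals_for == goals_against:
                draw += p
            else:
                loss += p
    return win, draw, loss


# 14 shots against 8, with an assumed 10% of shots scored by both sides.
win, draw, loss = outcome_probabilities(14, 8, 0.10)
print(f"win {win:.2f}, draw {draw:.2f}, loss {loss:.2f}")
```

Even under these generous assumptions the loss probability for the team with more shots stays well above zero, which is the point about small samples in a low scoring sport.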

By using ToP we can also deduce that the Toon should have had more wins or fewer losses in 2010-11, but we have no way of knowing which factors may have contributed to the shortfall. Even if we isolate games where they dominated possession and lost, we still don't know if they shot inefficiently, played Swansea style keep ball and succumbed to a late goal, or just made individual errors. ToP tells us nothing that other fundamental game stats won't tell us with more illumination.

If we can put ToP into context it may become a valuable addition, but in its present catch-all, game-long form it struggles to compete.

For more great thoughts on time of possession, check out 11tegen11 and 5 AddedMinutes.

8 comments:

  1. Nice one mate! Great analysis! :)

  2. Interesting analysis but I have a couple of points.

    How can you say that shot differential vs success rate is better correlated than possession vs success rate? By eye it is (arguably) so, but the x-axis is different for each. A simple r-squared value would give an easy comparison of correlation.

    Also, saying Newcastle underperformed in 2010-2011 because their success rate was "only 0.46" rather than 0.5 misses the fact that, when analysing scatter plots, one should consider uncertainty in an assumed best fit plot. I would wager that Newcastle's success rate of 0.46 was within errors.

    Finally, and most pedantically, "it's" means "it is". I think you meant "its", meaning "belonging to it".

    Overall, I think this is very interesting and it would be interesting to see a plot of "shots on target differential" vs success rate. In the end, though, there are so many variables (such as tendency to shoot from distance, quality of shots, etc) that a simple analysis such as this can only be used properly in conjunction with other information.

  3. Interesting analysis but I have a couple of points.

    How can you say that shot differential vs success rate is better correlated than possession vs success rate? By eye it is (arguably) so, but the x-axis is different for each. A simple r-squared value would give an easy comparison of correlation.

    Also, saying Newcastle underperformed in 2010-2011 because their success rate was "only 0.46" rather than 0.5 misses the fact that, when analysing scatter plots, one should consider uncertainty in an assumed best fit plot. I would wager that Newcastle's success rate of 0.46 was within errors.

    Finally, and most pedantically, "it's" means "it is". I think you meant "its", meaning "belonging to it".

    Overall, I think this is very interesting and it would be interesting to see a plot of "shots on target differential" vs success rate. In the end, though, there are so many variables (such as tendency to shoot from distance, quality of shots, etc) that a simple analysis such as this can only be used properly in conjunction with other information.

  4. Hi ollie,
    r2 is greater for the former than the latter, but too much maths tends to put off the casual reader. Thanks for the grammar lesson, I'll sack the proof reader :-)

  5. Thanks for the reply Mark (my PhD supervisor was a pedant so it was drummed into me!).

    Sorry for posting twice - I'm in Egypt and I can't read the Arabic notices that come up!

    Great read though.

  6. I love these analytics but you guys never reference the source data. Where can we find it?

    Replies
    1. Hi Tom,
      a lot of the stats are scattered all over the net. Soccerbase, for example, has team yellow cards, goal times and substitutions, but not in readily usable formats. EPLindex is an Opta reseller, again with some very detailed figures for a small monthly charge, but the data needs to be put into a usable format. Google usually finds the more obscure stuff if it's out there.
      m

  7. Like the content of your article!

    I have been doing some research on calculating fair odds by expected goals, poisson distribution and placing value bets lately and would like to recommend the following articles (for beginners like myself):

    1. http://footytradingposts.blogspot.co.uk/2012/07/calculating-goal-expectancy.html
    2. http://footytradingposts.blogspot.co.uk/2012/01/poisson-for-dummies.html
    3. http://www.soccerwidow.com/betting-maths/tutorial/calculation-of-odds-probability-and-deviation/
