Friday, 8 June 2012

Playing The Percentages.

Chris Anderson pens one of the top three US-based soccer blogs at Soccer by the Numbers, and on his Twitter feed this week he links to a Prozone analysis of penalty shootouts from The Leaders in Performance website. No one claims authorship of the article Chris mentions, but it leans heavily on Prozone data and looks at the statistical side of penalty shootouts, concentrating on conversion rates for types of shots and the order in which the penalties were taken. I'd already read the post before Chris flagged it up on his Twitter feed; I was mildly unimpressed at first sight and even more so having re-read the article. The link has also drawn skeptical queries from a few of Chris' followers.

The article is actually a very interesting read. It is jam-packed with insight into the favoured technique of right and left footed players and the distribution of shots over the area of the goal, and it even delves into a shooter's run-up. Great descriptive stuff. But it finishes with a fair number of headline-grabbing conclusions that lean almost exclusively on percentages.

Any sporting stats piece is going to contain figures expressed as a percentage, because it is a useful way to give a scaled estimate of something that you are measuring. However, when quoted without the raw figures, a percentage can be extremely misleading. Imagine a team that has taken 75% of its available points. On the face of it the 75% figure appears extremely impressive, but unless we also know the number of games involved we cannot tell how much confidence to place in it.

If we are talking about a team taking 75% of their available points over a whole season, then we know that we are almost certainly looking at a top two side. By contrast, if the range is just 4 games then we could be describing any one of the following teams from the 2011/12 season: Arsenal, Manchester City or United, Spurs, Chelsea, Fulham, Norwich, Newcastle, Swansea, WBA, Sunderland or Wigan. So by adding, for example, the trivial information that we are actually describing teams taking 9 points from 12 rather than 85 from 114, we add immensely to the value and context of our statement.

The Leaders in Performance post mentions that they have taken data from penalty shootouts at every Euro and World Cup since 1998, but from then on they use percentages all the way, starting with the reveal that 75% of teams who take the first kick go on to win the shootout. As in the previous example concerning the percentage of league points gained by a team, the 75% figure initially impresses, but it is all but meaningless unless we also know how many shootouts we are talking about. The article does include enough detail for the reader to discover the answer. Since 1998 there have been 16 Euro and World Cup shootouts, so the 75% figure comes from a 12 from 16 success rate for teams taking the first penalty. If those figures had been presented along with the percentages, we would have been in a much stronger position to judge the strength of the conclusion.
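To put a rough number on how much 12 from 16 can tell us, here is a minimal sketch in Python. It is not anything the original article did. It assumes each shootout is an independent trial, uses the standard Wilson score interval for a proportion, and the function names are my own.

```python
from math import comb, sqrt

def wilson_interval(successes, trials, z=1.96):
    """Approximate 95% Wilson score interval for a proportion."""
    p_hat = successes / trials
    denom = 1 + z**2 / trials
    centre = (p_hat + z**2 / (2 * trials)) / denom
    half_width = z * sqrt(p_hat * (1 - p_hat) / trials
                          + z**2 / (4 * trials**2)) / denom
    return centre - half_width, centre + half_width

def prob_at_least(successes, trials, p=0.5):
    """Chance of at least `successes` wins if every trial is won with probability p."""
    return sum(comb(trials, k) * p**k * (1 - p)**(trials - k)
               for k in range(successes, trials + 1))

# 12 first-kick wins from 16 shootouts, the raw figures behind the 75%.
low, high = wilson_interval(12, 16)
print(f"12 from 16: plausible long-run rate {low:.0%} to {high:.0%}")
print(f"Chance of 12 or more from 16 on a pure coin flip: {prob_at_least(12, 16):.1%}")
```

Under those assumptions the plausible range for the long-run rate stretches from roughly a coin flip to around 90%, which is the kind of context the bare 75% hides.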

Better still, a table listing the sixteen shootout-decided games in the study would have enabled a reader to see if any of the teams kicking second may have been affected by factors other than a possible pressure to keep up. Indeed, in the case of two teams kicking second in the sample, or more accurately the same country twice, a recognised penalty taker from that country was red carded during the match. That made him unavailable for the shootout, possibly compromised the fitness levels of his teammates and may have altered substitution policy. Beckham and Rooney, versus Argentina and Portugal respectively, for those who haven't guessed the identities. So 12.5%, or 2 from 16, of the sample events were probably atypical of normal penalty kick contests.

The post goes on in similar percentage-based terms. The goal is divided into 9 sectors and 18.7% of successful kicks are apparently shot low and left to the corner. Even if you've taken the time to find out that the 16 shootouts contained 147 kicks, how do you even begin to make sense of that figure?
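As an illustration of the problem, the sketch below tries to work backwards from a bare percentage to the counts behind it. It assumes only that the number of successful kicks cannot exceed the 147 kicks taken and that the quoted figure is accurate to about one decimal place. The function name and the tolerance are my own choices.

```python
def matching_fractions(percent, max_total, tolerance=0.05):
    """List (hits, total) pairs whose percentage lies within
    `tolerance` percentage points of `percent`."""
    matches = []
    for total in range(1, max_total + 1):
        for hits in range(total + 1):
            if abs(100 * hits / total - percent) < tolerance:
                matches.append((hits, total))
    return matches

# Which "hits from total" pairs could sit behind a quoted 18.7%?
candidates = matching_fractions(18.7, 147)
for hits, total in candidates:
    print(f"{hits} from {total} = {100 * hits / total:.2f}%")
```

Several different pairs fit the same percentage, so the reader cannot recover the sample size from the headline figure alone.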

Possibly the most flawed conclusion in the piece is the one that seems to most impress the writer. A "paltry" 14.2% of penalties are scored when a miss means immediate elimination: crumbling under pressure writ large. Again we're left to guess the sample size. 14.2% reeks heavily of one score from seven, and if those are indeed the actual numbers used for the post, we can start to judge whether 14% is likely to be an accurate long-term estimate of the success rate of such kicks under those specific circumstances, or whether we are looking at small sample size variation. Put plainly, how likely are penalty conversion rates to fall from 70%+ for regular takers, to 60%+ in shootouts, to a paltry 14% for "miss and your team is eliminated" attempts in much larger sample runs?
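A quick way to frame that question is to ask how often one score or fewer from seven would turn up under various assumed true conversion rates. The sketch below does that with a plain binomial calculation. The one from seven is only my guess at the sample, the candidate rates are picked for illustration, and treating each kick as independent with a fixed success chance is a simplifying assumption.

```python
from math import comb

def prob_at_most(successes, trials, p):
    """Chance of `successes` or fewer scores in `trials` kicks,
    if each kick is scored independently with probability p."""
    return sum(comb(trials, k) * p**k * (1 - p)**(trials - k)
               for k in range(successes + 1))

# Guessed sample: 1 score from 7 "miss and you're out" kicks.
for true_rate in (0.14, 0.30, 0.50, 0.60):
    chance = prob_at_most(1, 7, true_rate)
    print(f"True rate {true_rate:.0%}: chance of 1 or fewer from 7 = {chance:.1%}")
```

On these assumptions, one from seven is an unremarkable result for a taker who really scores 30% of the time (about a one in three chance) and becomes rare only near 60% (roughly 2%). Seven kicks might hint at a drop under pressure, but they cannot pin the long-term rate down to 14%.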

Shootouts themselves are rare events; they only happen in a knockout phase of a competition. Even rarer within shootouts are contests that extend well beyond the initial ten penalty kicks. The first shootout I witnessed live was at Stoke's Victoria Ground on a Wednesday evening, in what was probably still called the League Cup. Manchester City beat the Potters after a shootout that had involved all ten outfielders from both teams. Hardly anyone missed, and so the conversion rate for players taking the sudden death kick was 80%. Four successes and one failure, in this post's spirit of full disclosure. The last penalty contest I saw, on TV, went two kicks longer than the Stoke/Man City one from way back. Sheffield United and Huddersfield missed six of the first ten kicks, but then both teams were perfect until the luckless former Stoke keeper, Steve Simonsen, missed with the game's 22nd spot kick. The paltry conversion rates of the 14% chokers would soon be transformed into those of clutch superheroes if a few of these less likely events started to appear in small sample batches.

I've every sympathy with one-man bloggers struggling to collect data to crunch, but a post on a site so dressed to impress as to call itself "Leaders in Performance" should be collecting beyond a sample size of 16. If they'd gone back another thirty years they could have doubled the Euro and World Cup sample size. They could have included other national team competitions that are settled from the spot, and top class club knockout games. Or they could have presented the piece as descriptive rather than as a tool to predict future performance.

But most of all they should have added the raw numbers to the percentages.


  1. Just read the link you's a shocker isn't it. The kind of thing that gives soccer analytics a bad name. Whenever I see a post with nothing but percentages I'm always suspicious. Good work!