Monday 31 March 2014

Kings and Kingmakers.

If Sunday was a fine day for Merseyside as a whole, Saturday was also almost as good for both halves of the city, despite neither side setting foot on a football pitch.

Crystal Palace against Chelsea pitted a master of tactical planning, prepared to sacrifice every ounce of entertainment from a match to achieve his desired result, against... Jose Mourinho. Ultimately, Pulis' continuing love affair with a binary scoreline could prove decisive in turning Palace's survival from merely possible to probable. However, the loss of three points in a 1-0 defeat stalled Chelsea's pursuit of another Premiership title, just when even Jose was daring to dream.

The result of the evening fixture was almost as good for Liverpool, as close rivals Manchester City found a visit to the Emirates as difficult as the respective league positions suggested it should be. Arsenal, with UCL football now in the balance, stirred enough to deprive City of two points and, for the moment, fight off a sustained challenge for 4th from Merseyside's Blue half.

Wins for both Everton and Liverpool the following afternoon then pressed home the advantage that Saturday's results had delivered. Radio 5 Live's 606 phone-in had a decidedly red hue, with outlandish comparisons being drawn between Brendan Rodgers and Shanks, and honeymoon flights being cancelled to take in the clash with Manchester City in April.

A bunched finish is more often a sign of overall mediocrity than of excellence, so the accolades could perhaps be shelved for a few years, at least. But there is no denying that Liverpool are an improved side, and an attacking style that dares opponents to keep up, rather than a pragmatic, defensively stifling approach, has captured the imagination of fans beyond their natural heartland.

Chelsea and Liverpool each have six games remaining and City have eight, but the best team doesn't always gain the most points over a short span of matches, as luck and strength of schedule effects kick in. The highest placed side after half a dozen games more often than not doesn't stay the 38 match distance, so a sprint finish from a reasonably level break, combined with a visit to Anfield for both City and Chelsea, will likely see the title odds fluctuate throughout April.

An uninterrupted run of victories would ensure the title for either City or Liverpool but, in the likely event that a few Palace-like slip-ups lie in wait, the importance of Liverpool's home matches against, first, Manchester City and then, a fortnight later, Chelsea is clear.

How Results Against Manchester City & Chelsea Could Decide the Title Race.

Outcome of H2H From Liverpool Perspective.   % Chance of Liverpool Winning EPL.   % Chance of Man C Winning EPL.   % Chance of Chelsea Winning EPL.
MC(W) & Che(W).                              71.5%                                27.0%                            1.8%
MC(W) & Che(D).                              36.0%                                48.0%                            16.0%
MC(W) & Che(L).                              24.0%                                42.6%                            33.4%
MC(D) & Che(W).                              37.1%                                59.5%                            3.4%
MC(D) & Che(D).                              17.0%                                70.5%                            12.5%
MC(D) & Che(L).                              7.0%                                 58.9%                            34.1%
MC(L) & Che(W).                              11.8%                                86.8%                            1.4%
MC(L) & Che(D).                              2.1%                                 91.1%                            6.8%
MC(L) & Che(L).                              0.6%                                 78.5%                            20.9%

One way to attempt to quantify the importance of the upcoming games is to simulate the remainder of the season and examine the frequency at which each side tops the table under the nine possible combinations of match outcomes for the two huge games at Anfield.

We should have sufficient information to be able to frame reasonably accurate match odds for the remaining 65 matches and use these probabilities to re-run the EPL table without the need to play the matches. Around one season in five, Liverpool might achieve victories over both City and Chelsea and in light of the boost given to their chances on Saturday when their rivals dropped points, such a clean sweep would see them claim the title just over 70% of the time.
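As a rough illustration of that simulation approach, here is a minimal Python sketch. The function name `simulate_title`, the match probabilities and the level starting points are all hypothetical placeholders, not the odds or league table behind the figures in this post, and sides level on points are split at random rather than by goal difference.

```python
import random
from collections import Counter

def simulate_title(points, fixtures, fixed=None, trials=20000, seed=1):
    """Share of simulated seasons in which each tracked side finishes top.

    points   -- current points for the title contenders being tracked
    fixtures -- remaining games as (home, away, p_home_win, p_draw)
    fixed    -- optional {fixture index: "H", "D" or "A"} to force a result
    """
    rng = random.Random(seed)
    fixed = fixed or {}
    titles = Counter()
    for _ in range(trials):
        table = dict(points)
        for i, (home, away, p_home, p_draw) in enumerate(fixtures):
            result = fixed.get(i)
            if result is None:
                r = rng.random()
                result = "H" if r < p_home else "D" if r < p_home + p_draw else "A"
            home_pts, away_pts = {"H": (3, 0), "D": (1, 1), "A": (0, 3)}[result]
            # Sides outside the title race are opponents only and are not tracked.
            if home in table:
                table[home] += home_pts
            if away in table:
                table[away] += away_pts
        best = max(table.values())
        # Assumption: level points are split at random, not by goal difference.
        titles[rng.choice([t for t in table if table[t] == best])] += 1
    return {team: titles[team] / trials for team in points}

# Hypothetical inputs, NOT the figures behind the table: three sides level on
# points, the two Anfield games first, then two other games each.
points = {"Liverpool": 0, "Man City": 0, "Chelsea": 0}
fixtures = [
    ("Liverpool", "Man City", 0.40, 0.27),
    ("Liverpool", "Chelsea", 0.42, 0.27),
    ("Liverpool", "Other", 0.65, 0.20), ("Other", "Liverpool", 0.20, 0.25),
    ("Man City", "Other", 0.70, 0.18), ("Other", "Man City", 0.18, 0.24),
    ("Chelsea", "Other", 0.68, 0.19), ("Other", "Chelsea", 0.19, 0.24),
]
baseline = simulate_title(points, fixtures)
clean_sweep = simulate_title(points, fixtures, fixed={0: "H", 1: "H"})  # MC(W) & Che(W)
print(baseline)
print(clean_sweep)
```

Forcing each of the nine combinations of results in the first two fixtures in turn would give a grid in the same shape as the table, although the numbers depend entirely on the match odds and points totals fed in.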

As a slight reality check, every combination of results other than a Liverpool clean sweep in these two head to heads makes a Liverpool title odds against. Manchester City still achieve more title winning simulations than Liverpool even if they lose at Anfield, provided Chelsea avoid defeat two weeks later. Chelsea also increase the size of their minority vote with a favourable Anfield experience.

So Liverpool's advantage from playing their closest rivals also has the downside of increasing their strength of schedule in the remaining games. Should City lose to Liverpool, they will have a motivated Chelsea going in to bat for them two weeks later.

A fascinating end to the season, with rivals possibly relying on each other to beat whichever side steals a march towards the title. For all the hype surrounding Manchester City's visit to Liverpool in April, it may be their return to the city in May, this time to Goodison Park, that finally clears up the 2013/14 EPL title race. 

Wednesday 19 March 2014

Judging A Side Over One Season.

For much of Alan Pardew's reign at Newcastle, his side has provided ample topics of discussion for the analytical community. Their fifth placed finish in 2011/12, with a goal difference of just 5, was widely seen as the product of a decent side getting a bit lucky, and their near relegation the following season merely served to reinforce the misconception that luck based over-achievement immediately snaps back into ledger balancing under-achievement.

This season the team has largely occupied the dead ground between Simon Gleave's Superior Seven and Threatened Thirteen. They currently hold 8th spot with very little likely up or downside, casting them as mere spectators to the title and Champions League tussles and the relegation grind. So it has been left to their off field employees to guarantee column inches.

Over the previous two and three quarter seasons, Newcastle's success rate, which counts draws as half wins, has averaged out at a shade over 50%. If the Premiership were allowed to meander on without a season's end reckoning, Newcastle since the start of 2011/12 have recorded results typical of a side finishing a 38 game season in 8th or 9th spot, with a goal difference in the region of zero. So, despite the bouts of squad strengthening and weakening during the various windows, might Newcastle have been broadly a "best of the rest" team under Pardew, and have their seemingly up and down seasons simply been down to talent and luck interacting in differing doses?

Simulating season long results for one side or a whole league can quickly produce the range of possible outcomes we might expect from teams of a certain quality. It is perhaps a sobering thought, especially for managers, that there is a non-trivial chance that the best side from the Threatened Thirteen could find itself struggling to top 40 points over a 38 game campaign.

There is almost a 10% chance that a side with a true success rate of 0.5 may record 41 or fewer points, the limit of Newcastle's achievement in 2012/13, and a 6% chance that they could rise to the highs achieved in the previous season.
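For anyone wanting to check figures of this kind, the points distribution for a side of fixed quality can be worked out exactly rather than simulated. The sketch below is only one way to do it: the function name `points_distribution` is mine, and the 26% draw rate used to split a 0.5 success rate into wins, draws and losses is an assumption, not necessarily the split used for the numbers above. The 65 point mark is simply the 41 of 2012/13 plus the 24 point gap between the two seasons.

```python
def points_distribution(p_win, p_draw, games=38):
    """Exact probability of each final points total, adding one game at a time."""
    p_loss = 1.0 - p_win - p_draw
    dist = [1.0]  # dist[k] = probability of having k points so far
    for _ in range(games):
        new = [0.0] * (len(dist) + 3)
        for pts, prob in enumerate(dist):
            new[pts] += prob * p_loss      # defeat: no points
            new[pts + 1] += prob * p_draw  # draw: one point
            new[pts + 3] += prob * p_win   # win: three points
        dist = new
    return dist

# Assumption: success rate = p_win + p_draw / 2 = 0.5, with a 26% draw rate.
P_DRAW = 0.26
P_WIN = 0.5 - P_DRAW / 2

dist = points_distribution(P_WIN, P_DRAW)
p_41_or_fewer = sum(dist[:42])  # totals of 0 to 41 points
p_65_or_more = sum(dist[65:])   # totals of 65 points and above
print(f"41 or fewer: {p_41_or_fewer:.3f}, 65 or more: {p_65_or_more:.3f}")
```

A different assumed draw rate shifts the tails a little, but both outcomes stay unusual rather than freakish for a genuinely mid-table side.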

So although a dissected breakdown of Pardew's previous two completed seasons would give an entirely factual record of the points per game performance of his side in the 2011/12 and 2012/13 seasons, we can be much less sure whether that see-sawing record accurately tracked the talent of the team.

What we see isn't a cast iron endorsement of quality or the lack of it, but a combination of randomness and talent.

It would have been easy to overreact to the raw results posted by Newcastle in either season and, while injury, Europa League involvement, squad churn and tactical change almost certainly impacted on their talent and luck based achievements, it seems unlikely that the 24 point difference between the 2012/13 team and the 2011/12 team truly expressed the quality gap between those two closely related editions of the side.

One way of trying to reconcile an apparent points based performance improvement to any real shift in a side's actual abilities is to utilize the expertise of the bookmaking industry. At the start of the season all of the spread firms publish the expected number of league points they think each side will gather by the end of the season. As the season progresses these estimates will be up or downgraded in line with each side's current rate of accumulation of points.

For example, prior to last Saturday's eventful head to head meeting with Newcastle, Hull had already posted 31 points, almost equal to the general preseason estimate of around 33 or 34 points. Their updated quote suggested they might get about 42 points. So from a performance perspective they were roughly 8 points ahead of where most bookmakers expected them to be.

We have also reached the second half of the season, where a side increasingly completes its head to head appointments, and by mid March a team has played between 9 and 11 rivals both home and away. The chances of a team winning or drawing a game depend not only on the gap in quality between the teams, but also on the venue. So the expected success rate in a return fixture will depend partly on how each team has progressed or regressed over the 190 or so days since the initial meeting, but also on where each match was played.

Above, I've plotted the pre-game quoted success rates for actual Premiership matches, paired by home and away matches. So a home side that is expected to have a 60% success rate at home would likely be quoted at around 40% when they traveled to complete the reverse fixture.

If we look at the actual differences in the quoted success rates for return matches, any deviation from this fairly strong, general relationship can be, at least partly, attributed to the direction both sides have moved in over the ensuing timescale.

The case of Crystal Palace (or more accurately, Tony Pulis) demonstrates the process. Up until March 19th, they have played 10 teams twice. Stoke, Hull, Manchester United, Norwich, WBA and Southampton each made a return visit to Palace, and Palace set off to reacquaint themselves at the homes of Arsenal, Spurs, Sunderland and Swansea.

In eight of those ten matches the quoted pregame success rate in the return was greater than predicted by the success rate in the initial game and the line of best fit from the plot above. Palace's quoted success rate posted by the bookmakers is on average 15% higher over the ten rematches, compared to a typical example from the line of best fit. In short, the recent quotes imply a level of improvement compared to Palace in the opening months of the season.

There are strength of schedule issues: Palace may have played ten teams whose true ability has dipped. But this can be addressed by a least squares approach to the overall loss or gain in success rate of each side, when measured against their actual repeat schedule. The order shuffles slightly if you apply this correction.
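One possible reading of that least squares correction is sketched below in Python. It assumes, and this is my assumption rather than a description of the exact method used here, that each rematch deviation from the line of best fit is the team's own shift minus its opponent's shift, and it solves for the shifts by repeatedly applying the least squares normal equations. The function name `corrected_shifts` and the three teams A, B and C with their deviations are made up purely for illustration.

```python
def corrected_shifts(rematches, sweeps=500):
    """Least squares estimate of each team's shift in success rate.

    rematches -- list of (team, opponent, deviation), where deviation is how
                 far the team's quoted success rate in the return game sat
                 above the line of best fit.
    Assumed model: deviation = shift[team] - shift[opponent].
    """
    games = {}
    for team, opp, dev in rematches:
        games.setdefault(team, []).append((opp, dev))
        games.setdefault(opp, []).append((team, -dev))  # same game, opponent's view
    shift = {team: 0.0 for team in games}
    for _ in range(sweeps):
        for team, played in games.items():
            # Normal equation: a team's shift is the mean of
            # (deviation + opponent's shift) over its rematches.
            shift[team] = sum(dev + shift[opp] for opp, dev in played) / len(played)
        # Only differences are identified, so pin the league average shift at zero.
        mean = sum(shift.values()) / len(shift)
        for team in shift:
            shift[team] -= mean
    return shift

# Hypothetical deviations for three made-up teams.
rematches = [("A", "B", 0.05), ("B", "C", 0.05), ("A", "C", 0.10)]
shifts = corrected_shifts(rematches)
print({team: round(value, 3) for team, value in shifts.items()})
```

The point of the correction is that a team facing a run of declining opponents gets less credit for a given raw deviation than one facing improving opponents.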

Which Teams Have Impressed Enough to be Upgraded from Preseason Estimates?

Team. Average Shift in Success Rate per Game.
Crystal Palace. 0.06
Liverpool. 0.04
Hull. 0.04
Arsenal. 0.04
Manchester City. 0.04
Chelsea. 0.03
Everton. 0.02
Newcastle. 0.01
Sunderland. 0
Southampton. 0
Manchester U. -0.01
Spurs. -0.01
WBA. -0.02
Norwich. -0.02
Stoke. -0.04
Swansea. -0.04
Cardiff. -0.04
Aston Villa. -0.04
Fulham. -0.05
WHU. -0.06

The table above shows the strength of schedule corrected, average raw success rate increases or decreases per game as seen in the repeat fixtures played so far by each Premiership team. It's partly a reflection of perceived improvement (Liverpool) or decline (Manchester United) and partly a popularity contest.

Stoke are on course to better most preseason points estimates, but their marmite factor makes them a universally unpopular team, which may account for their apparent downgrading, despite a better than expected transitional campaign. Even punters have to be offered inducements to side with Stoke! But hopefully, most sides in the table have been up or downgraded predominantly on the bookmakers' interpretation of repeatable form.

To add context to the previous table, I've finally plotted the change in pregame quoted success rate against the rise or fall in expected final league points totals compared to the estimates at the start of the season. Fulham, for instance, are on course to be 10 points adrift of preseason estimates and they've been downgraded by an average success rate of ~0.05, slightly more than you would expect from the plot.

Palace are on course to gain about 36 points compared to the 31 quoted in August, but their position, well above the average line, appears to reflect extremely well on current manager Tony Pulis and less so on Ian Holloway, who was in charge for the bulk of 2013. Pulis once again demonstrates an ability to turn relegation fodder into lower half scrappers who can take their season to the wire. When he was appointed, Palace had 4 points from a possible 33 and were rock bottom.

Steve Bruce's tenure at Hull is also endorsed by the betting movements. Not only were the Tigers underrated, but there appears cause for added optimism above and beyond their current position of relative safety.

Both Merseyside teams have gathered more points than expected, but Everton's upgrade still lies below the line of best fit, possibly indicating a reluctance to take their improvement wholly at face value. A caveat that also applies to Southampton.

Liverpool may be the most telling point on the plot. They were expected to gather around 66 league points when the season began; currently they may be able to top 80 points, an increase of around 16 points. However, if we roughly apply the average increase in success rate currently being credited to them over a 38 game season, their points total increases by only around half of those 16 points. Outstanding, but partly unexpected, performance isn't being taken wholly at face value on a match by match basis. It is being taken with a slight pinch of salt.

The Reds' uptick is as spectacular as the 38 game epic demonstrated by Newcastle, when they nearly claimed a Champions League spot, but it may be well to remember that single season performances are a product of repeatable skills and also less repeatable variation. Even with points in the bag, a team might not really be quite as good (or bad) as their record might indicate.

Monday 10 March 2014

Analytics Might Need to Get Lucky.

Let's try a thought experiment.

Analytics has taken root in the game and you've been tasked with improving the goalkeeping abilities of one of your two goalies. Both are equally talented at every aspect of the game, (you've developed a statistical method that strips away random variation to tell you this, but of course that must remain proprietary). The two keepers are identical in virtually every way, both on and off the field. They are the Fabio and Rafael of the goal keeping ranks.

But there is one facet of the game where a slight improvement may be possible. Namely, saving penalty kicks.

You load up a database with physical and locational data for every type of penalty kick. The pace of approach, the angle of run, the kicker's natural foot, even the flavour of ball being kicked. And the numbers allow you to give the keeper a slight advantage compared to his previous spot kick saving talent levels.

You pass the information onto just one of the keepers because the club has decided to sell the other (now slightly inferior) goalie for a hefty profit. Just to be sure, you trial the keepers over a million penalty kicks (it is a thought experiment, remember) and sure enough, keeper A saves kicks at a rate of 25% and keeper B is stuck at the league average of 22%, (we could have used our proprietary tool to strip out random variation, but I made that bit up).

Prior to this analytical tweaking, your robotic, talent flat lining keepers and their 22% ability to save penalties had faced 43 kicks spread over eight seasons and they had prevented 8 from entering the net. So they'd been a bit unlucky and saved 19%. Even for a true 22% keeper, you are going to save 8 or fewer about 34% of the time.
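The "8 or fewer from 43" figure is a straightforward binomial calculation, sketched below in Python. The function name `binom_cdf` is mine; the 22% save rate, the 43 kicks and the 8 saves are the numbers from the thought experiment.

```python
from math import comb

def binom_cdf(k, n, p):
    """P(X <= k) when X counts successes in n independent trials of probability p."""
    return sum(comb(n, i) * p**i * (1 - p)**(n - i) for i in range(k + 1))

# Chance that a true 22% penalty saver stops 8 or fewer of 43 kicks.
p_unlucky = binom_cdf(8, 43, 0.22)
print(f"P(8 or fewer saves from 43) = {p_unlucky:.2f}")
```

The same function answers any "at least as bad as this" question for a keeper or kicker whose true rate is assumed known.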

Keeper B leaves and keeper A, his talent constant, remains the starter for the next two seasons. Buoyed by the knowledge that their keeper can save 25% of the penalties compared to just 22% previously, the defence thinks nothing of fouling opponents in the box and they concede 17 penalties in two seasons... and all but two are scored.

Just 12% stay out of the net! It's a disaster.

Meanwhile, keeper B, by a happy coincidence, also faces 17 penalties and saves 4, nearly a 24% success rate. Inevitably, keeper B is repurchased for a lot more than he was sold and the analytics department closes to finance a couple of days' worth of his now considerable salary.

So what went wrong?

The answer is probably nothing. Inferior teams outperform superior ones all the time over limited numbers of trials. The World Series springs to mind. If you pair a keeper who has a 25% chance of saving a kick with one that has an inferior 22% chance over 17 trials, the better keeper will save more kicks on just over half of the occasions. About 17% of the time they will save the same number. But in a not insignificant 32% of the paired trials, keeper B will own the bragging rights.

17% of the time keeper A will save 2 or fewer penalties and 40% of the time your new improved keeper won't beat the raw success rate of 8 from 43 posted over the previous eight seasons by the combined individual might of two 22% penalty deniers.
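The paired comparison can be worked through exactly by multiplying the two keepers' binomial distributions together. The Python sketch below does that for 17 kicks each; the variable and function names are mine, and the results may differ by a point or two from the rounded percentages quoted above depending on how those were produced.

```python
from math import comb

def binom_pmf(n, p):
    """Probabilities of 0, 1, ..., n successes in n trials of probability p."""
    return [comb(n, k) * p**k * (1 - p)**(n - k) for k in range(n + 1)]

KICKS = 17
keeper_a = binom_pmf(KICKS, 0.25)  # the improved keeper
keeper_b = binom_pmf(KICKS, 0.22)  # the league average keeper

# Treat the two keepers' kicks as independent and compare save counts.
a_more = sum(keeper_a[i] * keeper_b[j] for i in range(KICKS + 1) for j in range(i))
level = sum(keeper_a[k] * keeper_b[k] for k in range(KICKS + 1))
b_more = 1.0 - a_more - level

# Keeper A saving 2 or fewer of his 17 kicks.
a_two_or_fewer = sum(keeper_a[:3])

# Keeper A failing to beat the old 8 from 43 (18.6%) save rate:
# 3 from 17 is 17.6%, 4 from 17 is 23.5%, so this means 3 or fewer saves.
a_no_better_than_before = sum(
    prob for saves, prob in enumerate(keeper_a) if saves / KICKS <= 8 / 43
)

print(f"A more: {a_more:.2f}, level: {level:.2f}, B more: {b_more:.2f}")
print(f"A saves 2 or fewer: {a_two_or_fewer:.2f}")
print(f"A no better than 8 from 43: {a_no_better_than_before:.2f}")
```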

You've succeeded in your brief to improve a player, but that improvement isn't guaranteed to show itself in 17 repetitions. You still need to get a bit lucky to become the hero.

Moneyball, the movie, played around with the chronology of some of the trades for dramatic effect (once again stats take a back seat to the demands of narrative), but I'm assuming the sluggish start to the 2002 Oakland A's season, when an analytical approach was in full swing and the results on the field were not, is more faithful to the actual course of events.

Chance might not always love a trier.

Sunday 9 March 2014

Jonathan Walters Penalty Watch.

Stoke's Jonathan Walters could probably do with a bit of a lift this morning. Fresh from being branded the poster boy for terrible Premiership penalty takers following his two own goals and penalty failure against Chelsea and an opening day miss at Liverpool, he has just converted two spot kicks in a week.

The first, low to the keeper's ***, defeated Arsenal and yesterday, high to the keeper's ***, he earned a point at Norwich. Particularly satisfying for a former Ipswich player. Unfortunately, he then fell into the ongoing Stoke narrative by getting sent off with a straight red card five minutes later, without even being able to plead lack of intent to a sympathetic official. Shrugs, rather than smiles all round, unlike at the Hawthorns in the early kick off.

There are bigger bogeymen in Stoke's current team for Walters to jump to the head of the hatchetman hate list, but he still merits a page one entry on Google when you search for the Premiership's worst penalty taker. The seeds were sown by Eurosport's use of small sample sized efficiency rates to compile a list of penalty taking shame. A route also trodden by the Telegraph following the Irish international's mishaps against Chelsea. The stats apparently don't lie, although they do invite writers to make a cast iron case based around numbers that are awash with randomness.

Adam & Walters. A godsend for lazy journalism.

In this post I looked at how often a player with a league average conversion rate of around 78% would score 8 or fewer goals from 13 penalty kicks, Walters' record after the Mignolet save. If you lined up entirely average Premiership kickers and keepers, around one in ten average penalty takers would post a record at least as bad as Walters had up to 3 o'clock on opening day after 13 tries.

So rather than "the stats not lying" as the unnamed Telegraph hack suggested, they can only really be used to couch opinion in terms of probabilities and likelihood. Outcomes are a product of both randomness and talent and not simply the latter. (Note to the Telegraph: Walters isn't a defender, either, although this is a more forgivable mistake to make if you were only a casual observer of Stoke under Pulis.)

Two further penalties down the line, Walters now has a 10/15 (67%) record, or 12/17 (71%) if you include one FA Cup penalty and one League Cup shootout success as a Stoke player. If we now simulate an average penalty taker attempting 15 penalties, we can see how often such a player would score with 10 or fewer of his kicks.

There is now very nearly a 25% chance that an average penalty taker will fare as well or worse than the 10/15 record currently owned by Jon Walters and now if we line up our parade of average shooters around one in four will only just make it into double figures from 15 attempts.
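A simulation of this kind takes only a few lines. The Python sketch below lines up a large number of identical, entirely average penalty takers and counts how many score no more than a given number of their kicks. The function name, the trial count and the seed are arbitrary choices of mine, the 78% conversion rate is the approximate league average quoted above, and the simulated shares will wobble slightly with the seed and the exact rate assumed.

```python
import random

def share_scoring_at_most(goals, kicks, conversion, trials=100_000, seed=2014):
    """Share of simulated takers who score `goals` or fewer from `kicks` penalties."""
    rng = random.Random(seed)
    at_most = 0
    for _ in range(trials):
        # Each kick is scored independently with probability `conversion`.
        scored = sum(rng.random() < conversion for _ in range(kicks))
        if scored <= goals:
            at_most += 1
    return at_most / trials

after_13 = share_scoring_at_most(8, 13, 0.78)   # Walters' record after the Mignolet save
after_15 = share_scoring_at_most(10, 15, 0.78)  # his record two penalties later
print(f"8 or fewer from 13: {after_13:.3f}")
print(f"10 or fewer from 15: {after_15:.3f}")
```

Whatever the precise figures, a record like Walters' is the sort of thing an average taker produces often enough that it says little about his underlying ability.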

Efficiency figures or success rates are a perfectly acceptable way of expressing what has actually happened, such as when a team is out shot, but sneaks the only goal of the game with one of their few efforts. But it becomes a step too far when these limited pieces of information are used to support an entertaining, but fundamentally flawed, narrative.

Walters is a professional footballer, playing in one of the top leagues, as well as being a full and experienced international (he has scored a spot kick for his country). So it is likely that his penalty taking talent is going to lie around the fat middle of the distribution, despite any needs of a quick fix narrative based on barely a dozen or so trials.

He's most probably neither a king nor a pauper.

Tuesday 4 March 2014

A Football Match That Lasts For Two Days. Ashbourne Shrovetide.

Firstly, a warning. There are no stats in this post. It is merely a plug for the Royal Shrovetide Football match played over two consecutive days in Ashbourne from 2pm to possibly 10 pm from today, Tuesday the 4th.

Although the event is described as football, the term is only really applicable to the aims of the contest. The execution owes much more to modern day rugby than the Beautiful Game as practiced by Barcelona or even Stoke.

The game takes place in Ashbourne, just on the wrong side of the Staffordshire/Derbyshire border, on a pitch that comprises most of the town's public areas, where the goals are situated three miles apart, each in a river. The two teams, the Up'ards and the Down'ards, are decided by which side of the Henmore Brook a player was born or lives.

The ball is thrown or "turned up" to the mass of players, by a local worthy or occasionally a member of the royal family from a plinth in the Shaw Croft car park. The ball then usually disappears from view for most of the match, consumed in a "hug" that more resembles a rugby maul and with no instruction to "use it or lose it" progress can be extremely slow as each side attempts to wrestle the ball towards their intended goal.

Despite the length of the game, goals are relatively rare, so much so that any scorer gets to keep the ball and another one may be turned up. Much of the action takes place under cover of darkness and inside the hug, while the "runners" are ready to make good any opportunity provided by the engine room of each side.

Spectators can get as close to the action as they wish, but should the ball pop out they'd be well advised to run!    

Wrong place, wrong time.

Groundsman in despair.

The goal's a mile that way.

There's possibly a live feed from 2 pm, and local internet radio sometimes covers the game, if they can find the ball.

2014 game won 2-0 by the Up'ards.

Calculate Your Own Goal Expectation For Shots or Headers.

The rise of the shot model has led to a variety of ways to calculate goal expectation for shots and headers taken from different areas of the pitch. The inputs that go into the calculations can also be many and varied, ranging from shot placement and power to whether the chance originated from open play or a set piece. But the prime drivers for the likelihood that a goal is scored remain how far and wide from the goal the attempt is taken and whether the attempt was a shot or a header.

In this guest post I've broken down the basic method used throughout this blog into a couple of relatively easy steps to allow anyone to be able to create average goal probabilities for any shot or header attempted from anywhere on the pitch.