Lies, Damn Lies, and Historical Match-Ups

Sports statistics are a hypnotic tool: they provide a quick, satisfying certainty to any fan looking for empirical guidance. But their validity is seldom questioned by pundits or supporters, and perhaps the most meaningless of them is the oft-cited historical match-up statistic. In a preview it will inevitably be recited that Team A has not beaten Team B in the previous X games, that Team B has a history of upsets in this particular match-up, or that one of them has not won the meeting since men’s hats were in high fashion. The emphasis on history has the appeal of being an analysis born of tradition and experience rather than reason. The mystique is comforting to the footballing community, not only because it ensures the continued employment of their wisdom, but because it reaffirms the bias that football is not a thinking man’s game. We are thinking men, however, and it is time to take a look at the conventional wisdom.

Imagine a giant mechanical statue of Sepp Blatter holding in its hand two dice used for determining football results. Each die represents one side of a match-up in a game of chance. The statue rolls the dice and whichever team’s die shows the higher number wins the match. In this model each team has an equal chance of winning, as the dice are identical. Played over a very large number of games, the wins and losses of the two teams will approach an even split. If we know that the dice are identical we can say with confidence that the long-run outcome of the rolls will be an even split, but we do not know what will happen in the next game or the one after that. We can only predict the trend, not the single result. Assuming that Sepp rolls consistently, of course, and he does.
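For readers who would like to see the statue at work, the model can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function names and the seed are made up for the example, and tied rolls are simply re-rolled, which the description above does not specify.

```python
import random

def play_match(rng):
    """Roll one fair die per team and return the winner.

    Assumption: a tied roll is re-rolled so every match has a winner.
    """
    while True:
        a, b = rng.randint(1, 6), rng.randint(1, 6)
        if a != b:
            return "A" if a > b else "B"

def win_share(n_games, seed=0):
    """Fraction of n_games won by Team A with identical dice."""
    rng = random.Random(seed)
    wins_a = sum(play_match(rng) == "A" for _ in range(n_games))
    return wins_a / n_games

# Short runs can wander well away from an even split; long runs settle near 0.5.
for n in (10, 100, 10_000):
    print(n, win_share(n))
```

The share of wins over a handful of games says little about the next roll, which is the point of the paragraph above: the trend is predictable, the single result is not.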

The same game of dice can be extended to simulate the much more realistic situation of teams having different probabilities of winning. Instead of being identical, each die now has a fixed but unknown probability of winning. The unknown value represents the combination of many variables that contribute to a team’s probability of winning, for example manager skill, player quality, morale and tactics. Each time the match-up is played, information about the team’s probability of winning is revealed and our ability to predict results with less error grows. Under these circumstances history is very indicative of the future, and the more of it there is to observe the better the prediction.
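A rough sketch of that claim, again in Python: if a team’s true chance of winning is fixed, the win rate observed over its past games is an estimate of that chance, and the typical error of the estimate shrinks as the history grows. The value 0.6 and the function names are hypothetical choices for illustration, not figures for any real team.

```python
import random

def estimate_win_prob(true_p, n_games, rng):
    """Estimate a fixed win probability from n_games of simulated history."""
    wins = sum(rng.random() < true_p for _ in range(n_games))
    return wins / n_games

def mean_abs_error(true_p, n_games, trials=2000, seed=0):
    """Average gap between the estimate and the truth over many histories."""
    rng = random.Random(seed)
    errors = [abs(estimate_win_prob(true_p, n_games, rng) - true_p)
              for _ in range(trials)]
    return sum(errors) / trials

# Hypothetical team with a fixed 60% chance of winning the match-up.
for n in (5, 20, 80, 320):
    print(n, round(mean_abs_error(0.6, n), 3))
```

The error falls as the number of observed games rises, but only because the underlying probability never moves.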

But if the teams are modified between plays, the odds change and the results change with them. A shake-up of the odds happens every summer when teams buy or sell players, change managers, enter new competitions or take any of a number of other actions. If you change to a new set of dice, what is the point of using information that only applies to the old set?
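The same point can be put in code. In the sketch below a long record is built with the old dice, and the resulting win rate is then compared with the team’s true odds afterwards. The probabilities 0.7 and 0.3 and the function names are invented for the example.

```python
import random

def historical_win_rate(p, n_games, rng):
    """Share of games won by a team whose true win probability is p."""
    return sum(rng.random() < p for _ in range(n_games)) / n_games

def history_error(old_p, new_p, n_games=500, seed=0):
    """Gap between the record built with the old dice and the new true odds."""
    rng = random.Random(seed)
    return abs(historical_win_rate(old_p, n_games, rng) - new_p)

# Same dice as before: the long record is a good guide.
print(history_error(0.7, 0.7))
# New dice after a summer of upheaval: the same record now misleads.
print(history_error(0.7, 0.3))
```

However long the record, it converges on the old probability, so more history does nothing to close the gap once the dice have been swapped.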

If this seems abstract, consider the usefulness of the historical match-up statistic for Leeds United going into the 2003-2004 season. Historically the club had a decent record against all the sides it was to face, but that summer the team was decimated by a fire sale of talent which left it unrecognizable from the side of a few years earlier. Estimating Leeds’ chances of winning on historical performance would be foolish at best, and any bookmaker basing his odds on such grounds would probably be quietly dumped into the local river. Leeds is a unique example only in scale; other clubs do the same thing every summer (and sometimes within a season), in less dramatic fashion. History is indicative of the future only if the key variables have not changed, and as teams are in constant flux, match-up history means little.

Explaining the absurdity of a statistic is much easier than removing it from football commentary. Usage of match-up history is understandable to a degree, as it is easier to say “they won a lot before, they will win more now” than to explain some arcane method of calculating win chances. Pundits would run screaming if they had to launch into a discussion of probability in the middle of a broadcast, and frankly some audiences would as well. The issue is not simply that statistics generated with a funny understanding of maths are used as a placeholder for a more complete explanation, but that a more complete explanation does not exist at all because the placeholder is unquestioned. Football discussion with a basis in rational thought is possible, and it is possible without damaging any of the animal spirits which feed the game. After all, the pursuit of knowledge turns up its fair share of beauty.
