Lies, Damned Lies and Statistics

As cricketers we are obsessed with statistics, both personal and team related. However, the one statistic which dominates conversations the length and breadth of the country is the batting average.

I have witnessed first hand the skullduggery that goes hand in hand with the club player’s batting average. The lengths that usually honest and sane people will go to in order to bump up “the average” are at times mind-boggling.

There is the classic “round up” method: adding on a run or two to reach the nearest whole number. This method is usually justified with the “it makes the total runs easier to add together” excuse.

Then there is the “run out” guy. The guilty party will, when calculating his average, often omit the innings in which he was run out, with the excuse “I wasn’t out in a proper way”. No, no, no, my friend, you were out. Admittedly in an unfortunate way, but you were still out.

The most blatant and infuriating method is one which plays out during a “set” batsman's innings. I played in a game a number of years ago in which I was facing the second-last ball of the innings. We were batting first and were looking to finish with an imposing total. The non-striker was the “set” batsman, who had recently reached fifty.

Wanting to get him back on strike for the last ball, I pushed the ball just forward of point and called for the comfortable single. To my amazement, I was sent back with a loud, confident bellow of “NO!”. The last ball passed me in a bit of a daze; I started to swing for it as it entered the keeper’s gloves. As I caught up with my partner (he was halfway to the pavilion and seemed to be prepping himself for the applause of the fans), I enquired politely, “What was that about?” “Just looking after the average,” he replied, with a shrug of his shoulders and a confident stride towards the pavilion.

When cricketers analyze a set of scores, we often talk about finding our average, when in reality we are only looking at one expression of the average: the MEAN, the arithmetic average of a distribution. However, the mode and median can also be used to describe the data we have. Unfortunately, we often ignore those other measures in our quest to find an appropriate measure of central tendency. Is the mean appropriate for cricket averages?

The mean has one main disadvantage: it is particularly susceptible to the influence of outliers. Outliers distort the mean because they are extreme scores. Look at the table below (taken from the Laerd Statistics website).

Staff    1    2    3    4    5    6    7    8    9    10
Salary   15k  18k  16k  14k  15k  15k  12k  17k  90k  95k

The mean salary for these ten staff is 30.7k. However, inspecting the raw data suggests that this mean value might not be the best way to reflect the typical salary of a worker, as most workers have salaries in the 12k to 18k range. The mean is being skewed by the two large salaries.
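As a quick check of that arithmetic, here is a minimal Python sketch that works out the mean of the ten salaries, and then works it out again with the two largest salaries left out. The variable names are my own and are only there for illustration.

```python
# Salaries in thousands, copied from the table above.
salaries = [15, 18, 16, 14, 15, 15, 12, 17, 90, 95]

# Arithmetic mean: the sum of the values divided by how many there are.
mean_all = sum(salaries) / len(salaries)

# The same calculation with the two largest salaries (90k and 95k) left out.
typical = sorted(salaries)[:-2]
mean_typical = sum(typical) / len(typical)

print(f"Mean of all ten salaries:     {mean_all:.1f}k")       # 30.7k
print(f"Mean without the two largest: {mean_typical:.2f}k")   # 15.25k
```

Two salaries out of ten are enough to double the mean, which is the skew described above.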

Now take the same set of numbers, but this time apply the median, which is the middle score for a set of data that has been arranged in order of magnitude.

12 14 15 15 15 16 17 18 90 95

In this case we have 10 scores, so there is no single middle score. We simply take the middle two scores and average them. The median is the average of 15 and 16, so we have a median score of 15.5.

Now let’s consider using the mode. The mode is the most frequent score in our data set, so for this example the mode is 15. Now cast your eye back to the table and replace the “k” with “runs”. Look familiar?
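If you would rather let a computer do the sorting and counting, Python's standard statistics module has all three measures built in. The sketch below treats the ten values from the table as runs scored in ten innings; the list name is just my own label.

```python
from statistics import mean, median, mode

# The ten values from the table, read as runs scored in ten innings.
scores = [15, 18, 16, 14, 15, 15, 12, 17, 90, 95]

print("Mean:  ", mean(scores))    # arithmetic average: 30.7
print("Median:", median(scores))  # average of the two middle scores: 15.5
print("Mode:  ", mode(scores))    # most frequent score: 15
```

The median and the mode both land on the sort of score this batsman usually makes, while the mean is dragged upwards by the 90 and the 95.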

Of course, cricketers are not mathematicians. Working out our average using the mean is simple, and maybe that’s why it has stuck. The argument is also that the number of runs a player scores and how often they get out are two important elements of batting, and therefore a good measure of a batsman’s skill.
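For anyone who wants to see the arithmetic, the conventional batting average is total runs divided by the number of dismissals, so a not-out innings adds runs without adding a dismissal. The sketch below uses made-up scores and a hypothetical helper, batting_average, to show how far a single not out, or a conveniently forgotten run out, can move the figure.

```python
def batting_average(innings):
    """Runs scored divided by times dismissed.

    `innings` is a list of (runs, was_out) pairs. This is a hypothetical
    helper written only for this illustration.
    """
    runs = sum(r for r, _ in innings)
    dismissals = sum(1 for _, was_out in innings if was_out)
    if dismissals == 0:
        return None  # the average is undefined if the batsman was never out
    return runs / dismissals

# Made-up season: five innings, every one ending in a dismissal.
season = [(50, True), (10, True), (22, True), (4, True), (34, True)]
honest = batting_average(season)

# Same scores, but the fifty finishes not out.
with_not_out = [(50, False)] + season[1:]
protected = batting_average(with_not_out)

# The "run out" guy: quietly drops the innings of 4 where he was run out.
without_run_out = [inn for inn in season if inn != (4, True)]
massaged = batting_average(without_run_out)

print(f"Honest average:        {honest:.1f}")     # 24.0
print(f"With the fifty not out: {protected:.1f}")  # 30.0
print(f"Run out left off:      {massaged:.1f}")   # 29.0
```

In this invented season, refusing one single to stay not out is worth six runs on the average.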

The flaw in using the mean is that one score (good or bad) can skew the results and give a false impression of a batsman's overall season.

Over the winter I have been perusing the Statszone sections on CricketEurope. The thought occurred to me: how could we get a more accurate measure of a player’s ability? I particularly liked the concept of combining the batting average and strike rate for a batsman (the higher the better) and the bowling average and strike rate for a bowler (the lower the better).

I thought for a second that perhaps I had revolutionized how we look at cricket players’ statistics. I could see my name in lights: a book, no, a movie... “McGinley’s Moneyball” (with Tom Cruise playing my role, naturally).

However, on closer inspection it appeared that I had been beaten to it. Mike Hussey of the Sydney Thunder Big Bash team has spoken about how they selected players on that very basis. For a batsman, if you added their average and strike rate together and got 150, that would be exceptional for the T20 game. Similarly, for a bowler, if you added their strike rate and average and got under 50, that again would be pretty exceptional.
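That rule of thumb is easy to try on your own numbers. Here is a minimal sketch, assuming the usual definitions: batting strike rate is runs per 100 balls faced, bowling average is runs conceded per wicket, and bowling strike rate is balls bowled per wicket. The function names and the sample figures are invented for illustration and are not taken from Statszone or from the Sydney Thunder.

```python
def batting_index(runs, dismissals, balls_faced):
    """Batting average plus batting strike rate (higher is better)."""
    average = runs / dismissals
    strike_rate = 100 * runs / balls_faced
    return average + strike_rate

def bowling_index(runs_conceded, wickets, balls_bowled):
    """Bowling average plus bowling strike rate (lower is better)."""
    average = runs_conceded / wickets
    strike_rate = balls_bowled / wickets
    return average + strike_rate

# Invented season figures, purely for illustration.
bat = batting_index(runs=300, dismissals=10, balls_faced=240)
bowl = bowling_index(runs_conceded=360, wickets=18, balls_bowled=324)

print(f"Batting index: {bat:.1f}")   # 30.0 + 125.0 = 155.0
print(f"Bowling index: {bowl:.1f}")  # 20.0 + 18.0 = 38.0
```

On these made-up figures the batsman clears the 150 mark and the bowler comes in under 50. Note that the sketch does not guard against a player with no dismissals or no wickets, where the average is undefined.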

I guess that no matter what statistical method you use to evaluate or pick players, success is not guaranteed. Even the Oakland A’s, the baseball team who inspired the movie “Moneyball”, did not win the World Series.

Perhaps the wit of former Nottingham Forest manager Brian Clough illustrates the point best. “They look stronger on paper” was a remark once put to Clough. “Aye, they do. Fortunately we are not playing them on paper.”