Wednesday, 9 March 2011

Lies, damn lies and the Cold List

Trainers who have endured a losing sequence are sometimes affronted by their inclusion on the so-called Cold List. As men and women running a business, they are understandably keen to avoid any damage to their reputation.

Clearly, the analyst rarely means anything personal by the observation - even if he or she plainly doesn't have a clue about the proper use of statistics.

Heck, sometimes the critic is entirely justified in saying that the yard is not firing at present.

It happens to the best of all handlers - except Mark Johnston, of course.

But statistics can be dangerous weapons in the hands of those who lack sufficient mathematical skills.

The problem often starts with the metric used to judge "form". And this brings us to the dreaded Cold List.

Let's consider a trainer with a yard of mostly moderate stock. The majority of his hapless beasts are protagonists in the many low-grade handicaps in this great game of ours.

Every week, these noble steeds do battle with factors like randomness, the considerable intellect of certain BHA handicappers and the invidious effects of the draw.

All things considered, it is by no means to a trainer's discredit if the yard is ticking along at a strike-rate of 8%.

During the 2010 Flat season, trainers such as James Given, Jim Goldie, John Best, John Jenkins and Alan McCabe achieved this feat. There are some really capable names on this list, not least the excellent McCabe.

Now, let's take a sample of 20 runners which could have been sent out by one of these yards. In this hypothetical example, all of the horses are in identical physical condition.

In other words, the yard is 'form-proof', so its results are purely the product of random variation. (For the initiated, each runner is an independent 'trial' with the same 8% chance of winning.)

We can use the binomial expansion to calculate the probability - and corresponding odds - of the sample containing a certain number of winners. Here are the results:

Winners      Probability      Odds
0            18.9%            4.3 - 1
1            32.8%            2 - 1
2            27.1%            2.7 - 1
3            14.1%            6.1 - 1
4            5.2%             18.1 - 1
5            1.5%             67.8 - 1
6+           0.4%             262 - 1
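
For anyone who wants to check the arithmetic, here is a minimal Python sketch that rebuilds the table from the binomial formula. It assumes exactly what the example assumes: 20 independent runners, each with an 8% chance of winning. The function names are illustrative only, and the last digit of the odds can shift slightly depending on whether you round the probability first.

```python
from math import comb

STRIKE_RATE = 0.08  # the 8% strike-rate from the example
RUNNERS = 20        # size of the hypothetical sample


def prob_winners(k, n=RUNNERS, p=STRIKE_RATE):
    """Binomial probability of exactly k winners from n independent runners."""
    return comb(n, k) * p**k * (1 - p) ** (n - k)


def odds_against(prob):
    """Convert a probability into fractional odds against, expressed as 'x - 1'."""
    return (1 - prob) / prob


# Exact probabilities for 0 to 5 winners, then lump everything else into "6+".
rows = [(str(k), prob_winners(k)) for k in range(6)]
rows.append(("6+", 1 - sum(prob for _, prob in rows)))

for label, prob in rows:
    print(f"{label:<3} {prob:6.1%}   {odds_against(prob):6.1f} - 1")
```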

If we combine the probability of 0 winners and 1 winner in the 20-runner sample, we get the following interesting result:

P (0 winners) + P (1 winner)  =  0.189 + 0.328
                              =  0.517   or   51.7%   or   0.93 - 1

In other words, in any 20-runner sequence of a trainer with an 8% strike-rate, it is odds on - more likely than not - that he or she will have a maximum of one winner.
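
If you prefer brute force to formulas, a quick simulation should land close to the same 51.7%. This is only a sketch: the function name, the number of trials and the seed are arbitrary choices for illustration, and it assumes the form-proof yard described above, with every runner an independent 8% chance.

```python
import random


def simulate_at_most_one_winner(strike_rate=0.08, runners=20,
                                trials=100_000, seed=2011):
    """Estimate how often a 20-runner sample yields 0 or 1 winners."""
    rng = random.Random(seed)  # fixed seed so the run is repeatable
    hits = 0
    for _ in range(trials):
        # Each runner wins independently with probability strike_rate.
        winners = sum(rng.random() < strike_rate for _ in range(runners))
        if winners <= 1:
            hits += 1
    return hits / trials


estimate = simulate_at_most_one_winner()
print(f"Simulated chance of at most one winner: {estimate:.1%}")
```

The simulated figure will wobble a little from seed to seed, which is rather the point of the whole exercise.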

Think about this for a second. If you were about to have a serious bet on one of James Given's runners, for instance, how might you react if someone said: "Stop! While I am a huge fan of that particularly talented trainer, I have noticed he has had only 1 winner from his last 20 runners."?

Would you move on regardless? Brush off the comment because you know of its insignificance?

Then, good for you. But there are plenty - including some who earn a living from analysing racing - who would be concerned, perhaps even save their money for another day. 

More specific to the infamous Cold List, it is less than 9-2 (18.9% probability) that our trainer will have 20 losers on the trot. So, to draw the inference that he or she is out of form - that future runners from the stable are likely to perform below expectations - is not just unfair to the trainer but may cost you money as a punter.
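
To put a number on how long a losing run has to be before chance alone starts to look like a poor explanation, here is a rough sketch. The 5% cut-off is a conventional assumption of mine, not something taken from any Cold List, and the function names are hypothetical. It also only looks at one fixed sequence of runners; hunting through a whole season for the worst run would make long losing streaks even more likely to turn up.

```python
def losing_run_probability(run_length, strike_rate=0.08):
    """Chance that run_length consecutive independent runners all lose."""
    return (1 - strike_rate) ** run_length


def shortest_unusual_run(threshold=0.05, strike_rate=0.08):
    """Smallest losing run whose probability drops below the threshold."""
    run_length = 0
    while losing_run_probability(run_length, strike_rate) >= threshold:
        run_length += 1
    return run_length


print(f"20 losers on the trot: {losing_run_probability(20):.1%}")
print(f"Losers needed to dip under 5%: {shortest_unusual_run()}")
```

Under these assumptions, an 8% trainer needs 36 straight losers before the run becomes a less than 1-in-20 event.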

At this point, you might be saying: "Come on. I would never say that a trainer is out of form after just 20 losers."

Good for you, then. But I guarantee there are similar, if not worse, claims made with racing statistics every day. And the most disingenuous presentation is when the operator attaches "...for what it is worth" as a disclaimer, as if he or she has any idea of statistical significance.

Trainers who get annoyed at being labelled "out of form" may well have justification on their side.

And, with regard to racing statistics in general, ask yourself these questions next time someone attempts to blind you with stat-geek pseudoscience:

Have they taken sample-size into consideration?
Do they understand how to test for statistical significance?
Are there biases to their sample in the first place?
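
As an illustration of the first two questions, here is a minimal sketch of a one-sided exact binomial test: given a trainer's long-run strike-rate, how likely is a record this bad or worse from pure chance? The 5% significance level, the function name and the sample records are assumptions for the sake of the example, and the test still takes for granted that runners are independent and equally likely to win.

```python
from math import comb

ALPHA = 0.05  # conventional significance level, assumed for illustration


def lower_tail_p_value(winners, runners, strike_rate):
    """P(at most `winners` winners) for Binomial(runners, strike_rate)."""
    return sum(
        comb(runners, k) * strike_rate**k * (1 - strike_rate) ** (runners - k)
        for k in range(winners + 1)
    )


# Hypothetical records for an 8% trainer: (winners, runners).
for winners, runners in [(1, 20), (0, 20), (2, 100)]:
    p_value = lower_tail_p_value(winners, runners, 0.08)
    verdict = "unusual" if p_value < ALPHA else "consistent with chance"
    print(f"{winners} from {runners}: p = {p_value:.3f} ({verdict})")
```

Note how sample size changes the verdict: neither 20-runner record comes close to being significant, while 2 winners from 100 runners does.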

Yes, there are some good operators who use statistics in the media. Timeform's Simon Rowlands writes superb articles on their site, while Hugh Taylor of At The Races deserves particular praise for his flexible and intuitive use of the numbers. And I have mentioned advanced websites like and in other postings.

But don't let me come across as an intellectual snob. Even if, strictly speaking, they make the odd mistake in interpreting a forest of statistics, some operators can easily be forgiven because their overall approach is robust.

Racing statistics do provide entertainment to a huge number of people. You only have to look at the deserved popularity of products like those of Paul Jones and Mark Howard.

Such is the enthusiasm with which these guides provide statistics - and indeed highlight good winners - that they increase interest in the sport and provide their followers with the feeling they are employing a sophisticated approach. That can only be good for the game.

But, back to the Cold List. If you read this article and your reaction was: "Wait a minute, Willoughby, you muppet. Winners-to-runners - nobody seriously uses that anymore, do they?

"Surely it went out with people who think front running is 'doing it the hard way'. We, the enlightened, use other statistical measures - like the average distance a stable's runners were beaten, their run-to-form percentage, or the impact value of a good-sized sample of their recent runners.

"Surely everyone knows there is far more inherent significance in these metrics?"

Then, give yourself a pat on the back. Because you, sir, know your stuff.