Saturday, 23 March 2013

Nonlinearity and racehorse ratings

Last time, I discussed some of the confounding issues associated with nonlinearity in sports modelling. As I realise that many readers have an exclusively racing interest, I thought it suitable to present an example of nonlinearity in the domain of handicap ratings.

When we talk about a 'rating' we need to be sure what it actually represents. Is it the best performance a horse has achieved? The best performance within reasonable expectations of its other ones? Or the figure we expect it to achieve next time it runs? Or some other measure, predictive or retrodictive? A projection or an observation?

Some horserace handicappers (the term I will use here to refer to those involved with the pseudoscientific manipulation of performance-based numbers towards the appearance of objective truth) tend to assume their numbers can serve both purposes, as observation and as projection alike.

A great example of this confusion was evinced by the discussion about the chances of the six-year-old gelding My Tent Or Yours before the Supreme Novices' Hurdle at the recent Cheltenham Festival. Referring to the talented hedgehopper having earned a lofty rating with an easy victory in the Betfair Hurdle at Newbury, it was observed in at least one place that "My Tent Or Yours had earned a rating good enough to win every Supreme Novices since Golden Cygnet in 1978."

The premise that My Tent Or Yours had achieved a rating well in excess of the par for the Supreme Novices' is valid, but assuming that this can be used as a projection in any context is a serious misapprehension. I don't say this because the horse was beaten (that would be post hoc analysis) but because the rating was no justification for any kind of probabilistic assessment of his chance. His starting price of 15-8 favourite was probably about right, in my view, but, were he actually a horse already good enough to win the last 35 renewals of a race (rather than simply being rated as such), he should have started a whole lot shorter.
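To put a number on "about right", fractional odds can be turned into an implied chance. This is only a rough sketch: the function name is my own, bookmaker margin is ignored, and the 1-4 price is a hypothetical stand-in for "a whole lot shorter", not a figure anyone quoted.

```python
from fractions import Fraction


def implied_probability(against, stake):
    """Chance implied by fractional odds of `against`-`stake` (e.g. 15-8).

    Ignores the bookmaker's margin, so it slightly overstates the true view.
    """
    return Fraction(stake, against + stake)


sp = implied_probability(15, 8)      # the actual starting price
short = implied_probability(1, 4)    # hypothetical odds-on price for comparison

print(f"15-8 implies {float(sp):.1%}")
print(f"1-4 would imply {float(short):.1%}")
```

At 15-8 the market was saying the horse would lose roughly two times in three, which is hard to square with a rating "good enough to win" every renewal for 35 years.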

I am sure you are aware why the use of the Betfair Hurdle rating of My Tent Or Yours in this way constitutes a logical fallacy. In technical terms, it is because opportunity cost has not been discounted from it. In other words, had every horse who lined up for the Supreme Novices' since 1978 had the chance to run off 149 in a handicap, the distribution of prior ratings would have looked completely different. We might then have reasonably observed something like: "This My Tent Or Yours has achieved a rating which should make him highly competitive in the vast majority of likely outcomes of this race, and there is a very good chance that he should win about one in four renewals within one standard deviation of the mean in terms of quality."

The example of My Tent Or Yours illustrates that ratings and their application present a nonlinear problem - albeit one which can easily be solved by Monte Carlo simulation. By running in a race against lots of horses who themselves had been given the chance to expose something closer to the full extent of their ability, My Tent Or Yours had, in effect, increased his rating but lowered his upside. You should recognise this type of expression as belonging to the same family as the paradox of offensive rebounding-v-transition defense in basketball to which I referred yesterday; it represents a nonlinear trade-off between negatively correlated measures. In this case, the more chances a horse has to improve his rating, the less potential he has for improvement.

The trouble with most horserace handicappers (and I strongly exclude Simon Rowlands from this bracket) is that they make reasonably sound assessments of the past but lousy projections about the future. And one reason for this is that they tend not to discount opportunity cost in their estimations; they give far too much weight to achievement in the absence of the context provided by opportunity/potential. Many times, the edge that they perceive a horse possesses according to their generalised linear observation about the past is swallowed up by the nonlinear randomness of the future.

To provide a working example of the importance of upside in making projections from ratings, I am now going to provide 50,000 simulations of a five-horse race. The ratings I am using go beyond the usual framework of 'best recent effort' or 'best of last six runs' or 'best of last year' or whatever. Instead, they represent the centre of a typical older Flat horse's distribution of Beyer Speed Figures, which I have on a massive database for non-commercial use. (I first used a statistical technique called Maximum Likelihood Estimation to calculate a mean and standard deviation from the data, then employed further analysis to produce the shape of the expected distribution of its ratings next time out. Not surprisingly, it is heavily left-skewed, as I will show graphically later in the series.) In other words, I looked at how past and future performance are related for older horses in the US within a nonlinear framework.

First, a look at our five horses: Anchovy, Butler, Coconut, Donkey and Einstein.

Anchovy 90
Butler 89
Coconut 88
Donkey 87
Einstein 86

The figures represent the expected performance level at the centre of their left-skewed distributions (more of that in a second). In other words, the rating is the median figure of all those they can be expected to run to next time out (given ideal conditions, a vacuum and Mickael Barzalona not within 3,000 miles etc etc). Every horse is the same robot and a point is worth half a length per mile, similar to the scale with which most handicappers are familiar.

In simulating a race between them, I am making the assumption for now that their performances are independent events - which they certainly are not (as we will come on to).
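For anyone who wants to try this at home, here is a minimal sketch of such a simulation. It is not my actual model: the shape used is a two-piece normal, picked only because it is easy to sample and puts the median exactly on the rating, and the spreads `SD_UP` and `SD_DOWN` are assumed values tuned to give a mean and standard deviation in the region of the AVG and STD figures reported for these horses. Performances are drawn independently, as stated above.

```python
import random

# Median ratings of the five runners.
RATINGS = {"Anchovy": 90, "Butler": 89, "Coconut": 88, "Donkey": 87, "Einstein": 86}

# Assumed spreads (illustrative, not fitted values): a narrow upside and a
# wide downside give a left-skewed shape whose median is the rating.
SD_UP, SD_DOWN = 4.4, 10.9
N_RACES = 50_000


def performance(median, rng):
    """Draw one left-skewed figure whose median is `median`."""
    z = abs(rng.gauss(0.0, 1.0))
    if rng.random() < 0.5:
        return median + SD_UP * z    # better than the median: limited upside
    return median - SD_DOWN * z      # worse than the median: long left tail


def simulate(ratings, n_races=N_RACES, seed=1):
    """Run independent races; return win counts and finishing-position counts."""
    rng = random.Random(seed)
    names = list(ratings)
    placings = {name: [0] * len(names) for name in names}
    for _ in range(n_races):
        figures = {name: performance(ratings[name], rng) for name in names}
        order = sorted(names, key=figures.get, reverse=True)  # highest figure wins
        for position, name in enumerate(order):
            placings[name][position] += 1
    wins = {name: placings[name][0] for name in names}
    return wins, placings


wins, placings = simulate(RATINGS)
for name, rating in RATINGS.items():
    share = 100 * wins[name] / N_RACES
    print(f"{name:<9}{rating:>4}{wins[name]:>7}{share:>6.1f}%")
```

The exact counts depend on the seed and on the assumed shape, so do not expect them to match mine to the digit; the ordering and the rough size of the top-rated horse's edge are the point.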

Anyway, 50,000 races between them produces the following results:

Horse     Rating      W      L     %
Anchovy       90  16307  33698  32.6
Butler        89  12536  37469  25.1
Coconut       88   9553  40452  19.1
Donkey        87   6740  43265  13.5
Einstein      86   4869  45136   9.7

with the following placings:

Horse       1st    2nd    3rd    4th    5th
Anchovy   16422  10436   8082   7814   7251
Butler    12616  11213   9231   8751   8194
Coconut    9333  10643  10347   9946   9736
Donkey     6888   9582  10996  11218  11321
Einstein   4746   8131  11349  12276  13503

producing the following statistical parameters:
 
Horse      AVG   STD  MAX  MIN  MED
Anchovy   87.4  7.92  108   45   90
Butler    86.4  7.93  106   44   89
Coconut   85.4  7.87  106   42   88
Donkey    84.4  7.98  105   39   87
Einstein  83.3  8.00  106   42   86
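Notice that every average sits about two and a half points below its median. That gap is the signature of a left-skewed distribution. The sketch below shows the effect for a single 90-rated horse, again using an assumed two-piece normal with made-up spreads (4.4 above the median, 10.9 below) rather than my fitted shape.

```python
import random
import statistics

# Assumed spreads for illustration only, not fitted values.
SD_UP, SD_DOWN = 4.4, 10.9


def draw(median, rng):
    """One left-skewed performance figure with the given median."""
    z = abs(rng.gauss(0.0, 1.0))
    return median + SD_UP * z if rng.random() < 0.5 else median - SD_DOWN * z


rng = random.Random(7)
sample = [draw(90, rng) for _ in range(50_000)]

summary = {
    "AVG": statistics.fmean(sample),
    "STD": statistics.pstdev(sample),
    "MAX": max(sample),
    "MIN": min(sample),
    "MED": statistics.median(sample),
}
for label, value in summary.items():
    print(f"{label} {value:6.1f}")
```

The long left tail drags the mean down while leaving the median where it was, which is why a rating defined as the median is not the same thing as the average figure a horse will run to.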
 
Lovely job! The top-rated wins 32.6% of the time, a 2-1 shot even with such a seemingly narrow edge. But life is seldom so clear-cut, and races don't take place between horses who have the same upside. So, what happens in these cases? What if Anchovy is like My Tent Or Yours and has already had more opportunities to achieve his best figure, so that his upside is limited? And what if Einstein is instead a lightly-raced horse? Is the top-rated still the favourite? Find out next time.