Thursday, 6 September 2012

Statistical inference and mathematical modelling (Part 6)

6.1 Staking and confidence

In this final part of the series - I can hear distant cheering! - I will be answering the question of how bets should be optimally staked.

In the last part, I produced the results of a mathematical model of an NFL team's expected total interceptions for the 2012 season (xINT). I applied these to pricing from existing betting markets to derive the expected value (EV) I might notionally achieve - if my projections were accurate on average - on each proposition.

The values ranged from a healthy 30.5% profit on betting New Orleans OVER 12.5 to an expected loss of 6.5% on either San Diego UNDER 12.5 or San Diego OVER 12.5. (On the last point, it doesn't matter which bet is struck because my projection coincided with that of the bookmakers.)
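For anyone who wants to check those two figures, here is a minimal sketch of the EV arithmetic. The Saints' win chance (69.8%) and odds-to-one (0.87) are the ones listed for them in section 6.3; the San Diego inputs (a 50% chance and odds of 0.87) are my assumption, chosen because they are consistent with the 6.5% loss quoted. The function name is made up for illustration.

```python
def expected_value(win_prob: float, odds_to_one: float) -> float:
    """Expected profit per unit staked.

    We win `odds_to_one` units with probability `win_prob`
    and lose our 1-unit stake otherwise.
    """
    return win_prob * odds_to_one - (1.0 - win_prob)


# Saints inputs come from the table in section 6.3.
saints = expected_value(0.698, 0.87)
# Assumed inputs: a 50/50 projection at odds-to-one of 0.87.
chargers = expected_value(0.5, 0.87)

print(f"New Orleans OVER 12.5: {saints:+.1%}")
print(f"San Diego, either side: {chargers:+.1%}")
```

The second line shows why a projection that merely agrees with the bookmaker still loses: the bookmaker's margin is baked into the price.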

It might help if you checked back on the table to get a feel for the range of profit margins I am expecting. If you are a sports bettor with any concept of what to expect from your investment - i.e. all of you who are taking it seriously - then you might be shocked by how large some of these profits are projected to be. But this is actually just routine.

If you are a punter who makes, say, a 10% profit on investment, the actual profit margin on individual bets (should you be able to know them) would be distributed either side of this 10%. Some of your wagers would have been sucker-bets that would have seen you make a crushing loss over time, while some of your other pokes would have been acts of genius. It is very difficult to know the profit margin on each bet, but much easier to discern the average from your bottom line.

It should be the same when you make projections mathematically or intuitively. Though I have a very strong conviction indeed that the total interceptions market is highly influenced by irrationality - in this case, the degree to which the important variable is a function of randomness or, at least, factors which a team finds extremely difficult to sustain - I don't know for sure that the New Orleans Saints are a lock to intercept 13 or more passes. But the point is that my model has the same degree of confidence that the total will go over as it does about all the other 31 total interception scores of NFL teams.

"Wait a minute", I hope you are thinking, "what about the fact that your projection about the Saints - and corresponding EV on betting the over - is an outlier to the market? Surely that should make you think twice about having the same confidence in the selection as you do with your San Diego projection, which agrees with Vegas?"

6.2 Efficient markets

The answer involves understanding the type of market that you are investing in. In examining the past results of the total interceptions market, I see no evidence that it qualifies as a semi-strong (or, of course, strong) efficient market. It is not an exchange market with high liquidity, distilling the wisdom of the crowd and expressing the value of all current information known about the teams' capacity to make interceptions. No, it is a novelty market with mispriced commodities; that's why I picked it.

It should be a lot easier to beat (that is different to saying it is easy to beat) than, say, a Betfair market on a Premiership football game at 2.59pm on a Saturday afternoon. (Although, I am given to understand that plenty of you actually beat those well enough, clever chaps...)

The Efficient Market Hypothesis - and the many subsequent academic reactions and contradictions to it - can wait for another blog. But at least you should have a qualitative understanding as a bettor that you have absolutely, literally zero chance of making a 30.5% profit in the long run (aha! this nebulous creature is the subject of another future blog in itself) on all your investments in sports betting. And if many of the projections your math models produce suggest as much, you need to think again. (Or, tell me on the quiet...)

6.3 The Kelly criterion

Underneath is a table of all propositions in the total interceptions market which have an Expected Value (EV) > 0. Also given is the percentage chance of success given by my model. These were used in Part Five to produce the EV calculations, but I did not have room for another column in my table. Also included is the all-important market price, expressed as odds to one.

Team                  Selection   Win %   Odds-to-1
New Orleans Saints    OVER 12.5   69.8%   0.87
New England Patriots  UNDER 22.0  69.5%   0.85
Chicago Bears         UNDER 20.0  66.8%   0.85
Baltimore Ravens      OVER 16.5   67.4%   0.83
Philadelphia Eagles   UNDER 18.5  64.7%   0.87
Dallas Cowboys        UNDER 18.0  64.1%   0.85
Arizona Cardinals     OVER 14.0   63.4%   0.87
Tennessee Titans      OVER 14.0   62.8%   0.87
Denver Broncos        OVER 13.5   61.4%   0.87
Buffalo Bills         UNDER 18.5  62.0%   0.83
Oakland Raiders       OVER 15.5   60.5%   0.87
New York Jets         UNDER 17.5  60.9%   0.85
Kansas City Chiefs    UNDER 18.5  59.9%   0.87
Indianapolis Colts    OVER 11.5   59.2%   0.87
Jacksonville Jaguars  OVER 15.0   60.1%   0.83
New York Giants       UNDER 18.5  58.6%   0.85
Carolina Panthers     UNDER 17.5  59.0%   0.83
Detroit Lions         UNDER 18.0  58.1%   0.85
Houston Texans        OVER 16.5   57.0%   0.87
Cleveland Browns      OVER 13.5   58.1%   0.83
Green Bay Packers     UNDER 24.5  58.9%   0.80
Miami Dolphins        UNDER 16.0  56.7%   0.85
Minnesota Vikings     OVER 11.5   57.3%   0.83
San Francisco 49ers   UNDER 20.5  55.8%   0.85
Atlanta Falcons       UNDER 19.0  55.0%   0.85
Washington Redskins   OVER 14.5   54.2%   0.87

So, those are my selections but, assuming I have £1000 to speculate in the market, the question is how much is optimal to place on each proposition?

In the course of our betting career - which it is easiest to assume for mathematical purposes is infinitely long - we have conflicting ambitions when making a wager. (Actually, in the grand scheme of things we may have many ambitions, but I mean from the standpoint of growing our betting bank.)

On the one hand, we want to maximise our profit. On the other, we want to protect against the ghastly prospect of going busto, referred to technically by one meaning of the term Gambler's Ruin.

Incidentally, as the economists among you will point out, we should not think of our money in terms of pounds and pence but, better, in terms of utility - its importance to us or what it can buy. A coin-flip for £25,314 between me and the excellent and very rich Noel Edmonds has the same EV for both of us in terms of cash (Expected Value = zero) but very different prospective outcomes when measured by utility (Expected utility = massively negative for me, zero for him, because the change in utility from losing is huge for me, as I do not receive an overblown retainer to do Racing UK.)

                                  Edmonds: TFNL hero owns enormous utility buffer
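To put rough numbers on the coin-flip, here is a minimal sketch. It assumes logarithmic utility of wealth (a common textbook choice, not something I have argued for here), and the two wealth figures are entirely hypothetical stand-ins for "me" and "a very rich man".

```python
import math


def expected_log_utility_change(wealth: float, stake: float,
                                win_prob: float = 0.5) -> float:
    """Expected change in log-utility from an even-money bet of `stake`.

    Assumes utility = ln(wealth); requires stake < wealth.
    """
    after_win = math.log(wealth + stake)
    after_loss = math.log(wealth - stake)
    return win_prob * after_win + (1 - win_prob) * after_loss - math.log(wealth)


STAKE = 25_314  # the coin-flip stake from the text, in pounds

# Hypothetical wealth levels, purely for illustration.
modest = expected_log_utility_change(30_000, STAKE)
rich = expected_log_utility_change(10_000_000, STAKE)

print(f"Modest bankroll: {modest:+.4f}")
print(f"Enormous bankroll: {rich:+.7f}")
```

Both bets have a cash EV of zero, but under this assumption the first number is heavily negative while the second is negative only in the sixth decimal place, which is as good as zero.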

But let's make the assumption - extremely rash, I know - that we all have £1000 on the sideboard to speculate on my total interceptions market with a sustainable impact on utility considerations. How do we do it? Can I ever get to the point?

This is where the Kelly criterion comes in. I expect you might have heard it bandied around, especially if you are in the poker community, given the ubiquitous love among card players of tossing around concepts which they probably then fail to apply to their game. (I'm sure we have all found it a massive relief to rid ourselves of the dreaded possibility of polarising our range, no?)

Take a deep breath. If p is the odds-to-one at which you are betting (listed in the right-hand column of the table) and x is your calculated chance of success, then the optimal fraction of your betting bank to stake, K for Kelly percent, in order to marry profitability with defence against ruin is given by the formula:

K = [ x (p + 1) - 1 ] / p

So, for instance, if you have a 50% chance of winning, as in a fair coin toss with Noel Edmonds, then x = 0.5. If he generously gives you 2-1 about winning - for the sake of one of his tremendous television shows - then p = 2.0, so the Kelly criterion advocates:

K = [ 0.5 * (2 + 1) - 1 ] / 2

    = 0.25 or 25% of your bankroll.
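The formula is easy to put into code. This is a minimal sketch using the same symbols, x and p, as above; the function name is mine, and so is the decision to return zero when the bet has a negative expected value (the formula itself would simply give a negative number).

```python
def kelly_fraction(win_prob: float, odds_to_one: float) -> float:
    """Kelly fraction K = [x(p + 1) - 1] / p.

    win_prob is x, the estimated chance of success.
    odds_to_one is p, the price expressed as odds to one.
    Returns 0.0 for bets with no edge (an assumption: we simply pass).
    """
    edge = win_prob * (odds_to_one + 1.0) - 1.0  # expected profit per unit staked
    return max(edge / odds_to_one, 0.0)


# The coin toss at 2-1 from the text.
coin_toss = kelly_fraction(0.5, 2.0)
# New Orleans OVER 12.5, using the table's 69.8% and 0.87.
saints = kelly_fraction(0.698, 0.87)

print(f"Coin toss at 2-1: {coin_toss:.0%} of bankroll")
print(f"New Orleans OVER 12.5: {saints:.1%} of bankroll")
```

Note that the numerator is just the EV per unit staked, so K is the edge divided by the odds.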

Not 67 times your bankroll, as contestants on Deal Or No Deal do regularly when confronted with a parallel situation, often stating the justification "you are only here once" as if there is no other form of legalised gambling involving pure chance open to them outside the studio. (Edmonds is an evil genius; he has to be. Surely the banker has pointed out the inconvenient truth in a production meeting. Indeed, the only here once caveat is a tremendously useful, yet illogical, device for all of mankind, as in: "How can you do this to me? And with the milkman as well, surely you must have realised it was extremely likely you would destroy our marriage?"...."Well, that's true love, but you are only here once...")

Back on point. Here are the Kelly percentages K% for all 26 total interception propositions which are +EV. (If you learn nothing else from this blog, just try saying: "I thought it was super +EV, I mean 'super'," every time another gambler asks you about a losing bet. You'll end up as a paid expert at best or, at the very least, revered by other pseuds. Super is a key word nowadays):

Team                  Selection   Win %   Odds-to-1  K%
New Orleans Saints    OVER 12.5   69.8%   0.87       35.1
New England Patriots  UNDER 22.0  69.5%   0.85       33.6
Chicago Bears         UNDER 20.0  66.8%   0.85       27.7
Baltimore Ravens      OVER 16.5   67.4%   0.83       28.1
Philadelphia Eagles   UNDER 18.5  64.7%   0.87       24.1
Dallas Cowboys        UNDER 18.0  64.1%   0.85       21.9
Arizona Cardinals     OVER 14.0   63.4%   0.87       21.3
Tennessee Titans      OVER 14.0   62.8%   0.87       20.0
Denver Broncos        OVER 13.5   61.4%   0.87       17.0
Buffalo Bills         UNDER 18.5  62.0%   0.83       16.2
Oakland Raiders       OVER 15.5   60.5%   0.87       15.1
New York Jets         UNDER 17.5  60.9%   0.85       14.9
Kansas City Chiefs    UNDER 18.5  59.9%   0.87       13.8
Indianapolis Colts    OVER 11.5   59.2%   0.87       12.3
Jacksonville Jaguars  OVER 15.0   60.1%   0.83       12.0
New York Giants       UNDER 18.5  58.6%   0.85       9.9
Carolina Panthers     UNDER 17.5  59.0%   0.83       9.6
Detroit Lions         UNDER 18.0  58.1%   0.85       8.8
Houston Texans        OVER 16.5   57.0%   0.87       7.6
Cleveland Browns      OVER 13.5   58.1%   0.83       7.6
Green Bay Packers     UNDER 24.5  58.9%   0.80       7.5
Miami Dolphins        UNDER 16.0  56.7%   0.85       5.8
Minnesota Vikings     OVER 11.5   57.3%   0.83       5.9
San Francisco 49ers   UNDER 20.5  55.8%   0.85       3.8
Atlanta Falcons       UNDER 19.0  55.0%   0.85       2.1
Washington Redskins   OVER 14.5   54.2%   0.87       1.6

So, K% gives you the percentage of your money to invest in each proposition in order to grow your bankroll optimally. For the academically minded among you, here are several properties of Kelly staking worth bearing in mind:

1) Kelly staking assumes that all bets are independent and sequential. In this case, total interceptions for each team are indeed independent (or extremely close to it) but they are not sequential. We have to place all our wagers on this market before the season, before the outcome of each bet is known, so K% for all propositions may - and indeed does - add up to more than 100% of our bank. In this case, the sum of all K% is 383.4%.

2) Non-independent propositions, such as the various horses in a race, give rise to a more complicated situation, but one that can also be dealt with optimally. However, that is beyond the scope of this article and is the subject of another one coming up.

3) In practice, Kelly staking feels rather risk-seeking. Would you really be comfortable betting 35.1% of all your monies available for betting on a single proposition like New Orleans OVER 12.5? It was a cracking bet, in my view, but this percentage sounds - and feels - like too big a risk.

Indeed, there are more than just psychological reasons to be wary of using a full-Kelly approach. As we are talking projections and probabilities, there is the danger of a misapprehension in the calculations. There may be other variables in play which we cannot account for, or a change in preconditions. In the case of total interceptions, for instance, we can't be sure of the effect of replacement referees in the early weeks of the NFL season, or how the ever-increasing rate at which passes are completed will affect the relationships between passes defended and total interceptions in our correlations. In other words, in any dynamic system of variables there is considerable chaos which argues for a risk-averse attitude.

So, the remedy for many investors is to use a scaled percentage of Kelly, such as two-thirds Kelly or even half-Kelly. In effect, this mitigates risk, but it also militates against profitability. But, as relatively risk-averse investors, we have to accept and live with the paradox.

4) Next, though this probably does not affect our calculations here, in semi-strong form (and strong) market situations, the price of a proposition is a variable in and of itself. This is a point I used to belabour tirelessly (and tiresomely for the viewers) on Racing UK - even though one or two of my colleagues still do not accept that horserace betting markets are approaching a semi-strong state.

So, even though all the known information about a horse remains constant, the fact that it is drifting in an exchange market should affect your view of its percentage chance of winning somewhat. This is not suspicion or superstition but can be seen as a result of the obvious paradox, recently (2008) named by the American blackjack genius and mathematician Edward Thorp as Proebsting's Paradox.

Say you backed a horse with a Kelly percent of your bankroll at 5-1 and it drifted to 10-1. Now, what do you do? Invest a higher percent again because the value has increased substantially? Do that and you will go broke very quickly.
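Going back to point 3, the trade-off in scaling down Kelly can be sketched in a few lines. This assumes the standard yardstick for Kelly staking, the expected logarithmic growth of the bankroll per bet, and uses the New Orleans figures (x = 0.698, p = 0.87) purely as an example; the function name is mine.

```python
import math


def log_growth(stake_fraction: float, win_prob: float, odds_to_one: float) -> float:
    """Expected log-growth of the bankroll from staking `stake_fraction` of it."""
    return (win_prob * math.log(1 + odds_to_one * stake_fraction)
            + (1 - win_prob) * math.log(1 - stake_fraction))


x, p = 0.698, 0.87                 # New Orleans OVER 12.5, from the table
full_kelly = (x * (p + 1) - 1) / p  # the formula from section 6.3

growth = {}
for label, scale in (("full", 1.0), ("two-thirds", 2 / 3), ("half", 0.5)):
    fraction = scale * full_kelly
    growth[label] = log_growth(fraction, x, p)
    print(f"{label:>10} Kelly: stake {fraction:.1%}, "
          f"growth {growth[label] / growth['full']:.0%} of full Kelly")
```

On these inputs, halving the stake gives up roughly a quarter of the expected growth rather than half of it, which is why scaled Kelly is such a popular compromise.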

6.31 The importance of bet-sizing

In this case, our expected profit will be the sum of the expected profits of all wagers. Having scaled our bets using Kelly, keeping the ratio between all individual wagers the same as if they were staked sequentially, so that the revised total K% = 100%, we are left with:

Team                  Selection   Win %   Odds-to-1  Scaled K%
New Orleans Saints    OVER 12.5   69.8%   0.87       9.2
New England Patriots  UNDER 22.0  69.5%   0.85       8.8
Chicago Bears         UNDER 20.0  66.8%   0.85       7.2
Baltimore Ravens      OVER 16.5   67.4%   0.83       7.3
Philadelphia Eagles   UNDER 18.5  64.7%   0.87       6.3
Dallas Cowboys        UNDER 18.0  64.1%   0.85       5.7
Arizona Cardinals     OVER 14.0   63.4%   0.87       5.6
Tennessee Titans      OVER 14.0   62.8%   0.87       5.2
Denver Broncos        OVER 13.5   61.4%   0.87       4.4
Buffalo Bills         UNDER 18.5  62.0%   0.83       4.2
Oakland Raiders       OVER 15.5   60.5%   0.87       3.9
New York Jets         UNDER 17.5  60.9%   0.85       3.9
Kansas City Chiefs    UNDER 18.5  59.9%   0.87       3.6
Indianapolis Colts    OVER 11.5   59.2%   0.87       3.2
Jacksonville Jaguars  OVER 15.0   60.1%   0.83       3.1
New York Giants       UNDER 18.5  58.6%   0.85       2.6
Carolina Panthers     UNDER 17.5  59.0%   0.83       2.5
Detroit Lions         UNDER 18.0  58.1%   0.85       2.3
Houston Texans        OVER 16.5   57.0%   0.87       2.0
Cleveland Browns      OVER 13.5   58.1%   0.83       2.0
Green Bay Packers     UNDER 24.5  58.9%   0.80       2.0
Miami Dolphins        UNDER 16.0  56.7%   0.85       1.5
Minnesota Vikings     OVER 11.5   57.3%   0.83       1.5
San Francisco 49ers   UNDER 20.5  55.8%   0.85       1.0
Atlanta Falcons       UNDER 19.0  55.0%   0.85       0.5
Washington Redskins   OVER 14.5   54.2%   0.87       0.4

So, if our intended total investment is £1000, our portfolio incorporating all propositions with a positive Expected Value looks like this:

Team                  Selection   Win %   Odds-to-1  Stake (£)
New Orleans Saints    OVER 12.5   69.8%   0.87       91.52
New England Patriots  UNDER 22.0  69.5%   0.85       87.69
Chicago Bears         UNDER 20.0  66.8%   0.85       72.36
Baltimore Ravens      OVER 16.5   67.4%   0.83       73.36
Philadelphia Eagles   UNDER 18.5  64.7%   0.87       62.93
Dallas Cowboys        UNDER 18.0  64.1%   0.85       57.03
Arizona Cardinals     OVER 14.0   63.4%   0.87       55.64
Tennessee Titans      OVER 14.0   62.8%   0.87       52.28
Denver Broncos        OVER 13.5   61.4%   0.87       44.43
Buffalo Bills         UNDER 18.5  62.0%   0.83       42.30
Oakland Raiders       OVER 15.5   60.5%   0.87       39.38
New York Jets         UNDER 17.5  60.9%   0.85       38.87
Kansas City Chiefs    UNDER 18.5  59.9%   0.87       36.02
Indianapolis Colts    OVER 11.5   59.2%   0.87       32.09
Jacksonville Jaguars  OVER 15.0   60.1%   0.83       31.37
New York Giants       UNDER 18.5  58.6%   0.85       25.81
Carolina Panthers     UNDER 17.5  59.0%   0.83       25.05
Detroit Lions         UNDER 18.0  58.1%   0.85       22.97
Houston Texans        OVER 16.5   57.0%   0.87       19.76
Cleveland Browns      OVER 13.5   58.1%   0.83       19.87
Green Bay Packers     UNDER 24.5  58.9%   0.80       19.63
Miami Dolphins        UNDER 16.0  56.7%   0.85       15.02
Minnesota Vikings     OVER 11.5   57.3%   0.83       15.27
San Francisco 49ers   UNDER 20.5  55.8%   0.85       9.91
Atlanta Falcons       UNDER 19.0  55.0%   0.85       5.37
Washington Redskins   OVER 14.5   54.2%   0.87       4.06

Multiplying the Expected Value of each proposition by the stake and summing for all wagers, the total expected return for the entire portfolio is £1174.92, a profit of 17.49%. I'll report back after the season to see how close the actual return is to this target sum.
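For anyone who wants to rebuild the portfolio, here is a minimal sketch of the whole calculation: full-Kelly fractions, scaling so they sum to 100%, stakes on a £1000 bank, and the expected return. The win percentages are the rounded ones printed in the tables, so expect the output to land close to, but not exactly on, the figures quoted above. The variable names are mine.

```python
# (team, selection, win probability x, odds-to-one p), copied from the tables.
BETS = [
    ("New Orleans Saints", "OVER 12.5", 0.698, 0.87),
    ("New England Patriots", "UNDER 22.0", 0.695, 0.85),
    ("Chicago Bears", "UNDER 20.0", 0.668, 0.85),
    ("Baltimore Ravens", "OVER 16.5", 0.674, 0.83),
    ("Philadelphia Eagles", "UNDER 18.5", 0.647, 0.87),
    ("Dallas Cowboys", "UNDER 18.0", 0.641, 0.85),
    ("Arizona Cardinals", "OVER 14.0", 0.634, 0.87),
    ("Tennessee Titans", "OVER 14.0", 0.628, 0.87),
    ("Denver Broncos", "OVER 13.5", 0.614, 0.87),
    ("Buffalo Bills", "UNDER 18.5", 0.620, 0.83),
    ("Oakland Raiders", "OVER 15.5", 0.605, 0.87),
    ("New York Jets", "UNDER 17.5", 0.609, 0.85),
    ("Kansas City Chiefs", "UNDER 18.5", 0.599, 0.87),
    ("Indianapolis Colts", "OVER 11.5", 0.592, 0.87),
    ("Jacksonville Jaguars", "OVER 15.0", 0.601, 0.83),
    ("New York Giants", "UNDER 18.5", 0.586, 0.85),
    ("Carolina Panthers", "UNDER 17.5", 0.590, 0.83),
    ("Detroit Lions", "UNDER 18.0", 0.581, 0.85),
    ("Houston Texans", "OVER 16.5", 0.570, 0.87),
    ("Cleveland Browns", "OVER 13.5", 0.581, 0.83),
    ("Green Bay Packers", "UNDER 24.5", 0.589, 0.80),
    ("Miami Dolphins", "UNDER 16.0", 0.567, 0.85),
    ("Minnesota Vikings", "OVER 11.5", 0.573, 0.83),
    ("San Francisco 49ers", "UNDER 20.5", 0.558, 0.85),
    ("Atlanta Falcons", "UNDER 19.0", 0.550, 0.85),
    ("Washington Redskins", "OVER 14.5", 0.542, 0.87),
]
BANK = 1000.0  # total intended investment, in pounds


def edge(x: float, p: float) -> float:
    """Expected profit per unit staked: x(p + 1) - 1."""
    return x * (p + 1) - 1


# Full-Kelly fraction for each bet, then scale so the fractions sum to 100%.
kelly = [edge(x, p) / p for _, _, x, p in BETS]
total_kelly = sum(kelly)
stakes = [BANK * k / total_kelly for k in kelly]

# Expected profit is each stake times that bet's EV, summed over the portfolio.
expected_profit = sum(s * edge(x, p) for s, (_, _, x, p) in zip(stakes, BETS))

print(f"Sum of full-Kelly fractions: {total_kelly:.1%}")
for (team, selection, _, _), stake in zip(BETS[:3], stakes):
    print(f"{team} {selection}: £{stake:.2f}")
print(f"Expected return on £{BANK:.0f}: £{BANK + expected_profit:.2f}")
```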

6.4 Introducing Modern Portfolio Theory considerations

One thing about that last sentence might strike you as disappointing. After all that, just 17.49%? But, in the table in Part 5, there were seven teams that yielded a higher Expected Value (EV) in the total interceptions market. So, why not just back them?

You probably know the answer to this intuitively. The consideration is diversifying risk. The more propositions with a positive EV we can add to the portfolio, the more likely that the result will reflect our underlying skill as an investor (which we have to assume or else why bother?) and the less likely it will reflect randomness.

So, there is a see-saw effect. Collect just the tastiest eggs from the henhouse and put them in one basket (I am thinking of my friend Graham Cunningham's superbly funny analogies at this stage) and we have a chance of a right feast but also a chance of no breakfast, but collect every old hairy hen's eggs and put them in loads of baskets and we definitely get breakfast. The problem is, it could be a Little Chef breakfast.
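The see-saw can be sketched numerically. This is a deliberately simplified illustration, not the real portfolio: it assumes n independent bets, all with the same win chance and price (the 60% and 0.85 below are made-up round numbers), and the bank split equally between them.

```python
import math


def return_sd_per_pound(n_bets: int, win_prob: float, odds_to_one: float) -> float:
    """Standard deviation of portfolio profit per £1 of total stake.

    Assumes n_bets independent, identical bets with equal stakes.
    A single bet pays +odds_to_one or -1, so its standard deviation is
    (odds_to_one + 1) * sqrt(x * (1 - x)); averaging n independent bets
    divides that by sqrt(n).
    """
    single_sd = (odds_to_one + 1) * math.sqrt(win_prob * (1 - win_prob))
    return single_sd / math.sqrt(n_bets)


X, P = 0.60, 0.85  # illustrative inputs only, not taken from the model

for n in (1, 7, 26):
    print(f"{n:>2} bets: sd of return = {return_sd_per_pound(n, X, P):.1%}")
```

The expected return per pound is the same however many of these identical bets we make; only the spread around it shrinks, in proportion to one over the square root of the number of bets.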

The consideration is more applicable in situations where portfolios in a financial market can be created with assets which are co-dependent or negatively correlated. In this case, we might want to select an equal number of OVERS and UNDERS to hedge against risk in case there is a dramatic change to the level of total interceptions, such as if all NFL teams had installed the no-huddle offense and games had 50% more plays and those plays were associated with a higher risk. It sounds unlikely, but it does happen. Again, this is a wide-ranging topic for another blog.

6.5 Conclusion

I really hope you have enjoyed the series. To be honest, I know it is challenging material but I have tried my very best to explain it in a straightforward fashion. I don't believe in dumbing things down, which is one of many reasons why I have found the media challenging. Perhaps the point is that I don't have the skills to express myself at an intermediate level.

However, I believe that people who are motivated to learn may be inspired by what I have written and start out on a path of their own. That is how it worked - and still works - for me. There is some tremendous material out there on the mathematics of sport and every day you can learn something is a source of joy, if you have a mind like mine.

Thanks again, September '12.