Power Ratings, A Different Approach
Following my article about working out the minimum odds you should bet at last week there were a lot of requests for a follow up about how to calculate power ratings.
The truth is, there are loads of ways to calculate power ratings ranging from overly simplistic to overly complex.
Which is the best?
Well, to be honest… none.
At least I’ve never found one approach to be significantly better than another.
But what I have found are approaches that are more suited to my personality than others and so I generally stick with those.
There’s a load of articles about how to create power ratings floating around the web and 99.99% of them look at weighting each rating differently, multiplying the rating by it’s weight and then combining all the results together.
Doing that means that some ratings count a lot towards the final figure and others only a small amount.
The reason that all articles about power ratings focus on using this approach is because… it’s pretty much the only way to do it.
The combining of the ratings together is, to be honest, the easiest part. It only takes a moment.
What’s difficult is determining the correct weight, or importance, that each rating is going to contribute to the final power rating.
Today I thought I would look at an approach that you may not have seen before. It’s an approach that can be very effective but, fair warning, it can take some time to do and get your head around.
Like everything though, if you spend the time then you’ll reap the rewards.
The first thing we’re going to start with is by converting each rating into a probability that the horse will win the race based on that rating alone.
And to do this, we’re going to use a tennis tournament.
Of course I don’t mean that we’re actually going to get up and play a tennis tournament. I mean that we’re going to use the statistical tennis tournament.
This is where each horse competes one-on-one against each of the other runners in the race.
Let’s start with a five horse race where the horses for one of our ratings look like this:
Horse A – 100
Horse B – 97
Horse C – 88
Horse D – 76
Horse E – 84
We now make a list of all the possible pairings and the difference between the ratings for each of them and mark which of the horses finished ahead in the race. The easiest way to mark this by putting a 1 if the first horse in the pair finished ahead and a 0 if it didn’t.
A/B +3 1
A/C +12 1
A/D +24 1
A/E +16 1
B/C +9 1
B/D +21 1
B/E +13 1
C/D +12 1
C/E +4 1
D/E -8 1
I have worked this out by taking the horse on the left’s rating and subtracting from it the horse on the rights rating. For example…
A/B = 100 – 97 = +3
Horse A finished ahead of horse B and so we put a 1 in the third column.
Now we want to do the reverse of this, we’ve done A/B so now we do B/A…
B/A -3 0
C/A -12 0
D/A -24 0
E/A -16 0
C/B -9 0
D/B -21 0
E/B -13 0
D/C -12 0
E/C -4 0
E/D +8 0
Once we have this data we now need to work out what a gap in the ratings means to the horse’s chance of finishing ahead in the race.
There are, of course, numerous ways of doing this. The best is probably to use a logistic regression. That’s pretty complicated but there’s a simple way to do a linear regression, which is nearly as good, in Excel.
I’m not going to worry about how a linear regression works, I don’t want to bore you. What’s important is what the result of it means.
What we’re about to do is create a formula which will allow us to calculate how much a gap in a horses ratings is going to affect it’s probability of winning the race based on this rating alone.
To do this put everything we’ve done so far into your Excel spreadsheet:
Select all the numbers in column B and column C then press the Charts tab, followed by Scatter and select Marked Scatter:
This will put a scatter graph on your spreadsheet that look similar too…
Next go to Chart Layouts, Trendline and select Linear Trendline:
Doing this will put a line through the middle of your chart like:
Next we need to get the formula for this line to display. We do this by going Chart Layouts, Trendline and selecting Trendline Options.
You will see a popup like below appear:
Select Options and put a Tick in the Display Equation On Chart option and then press OK.
You’re graph will now have an equation displayed on it:
This is what we want. I know it looks like complicated, but it isn’t and I’m going to show you exactly what to do with it.
This equation will allow you to only have to perform this analysis once, so make sure you have the best data to do it from. Because from now on you can always use this formula to determine the horses probability of winning a race for this rating.
We do this by taking our formula and adding in our horses ratings. These were our horses rating gaps…
Pairing Difference In Ratings
A/B 3
A/C 12
A/D 24
A/E 16
B/C 9
B/D 21
B/E 13
C/D 12
C/E 4
D/E -8
B/A -3
C/A -12
D/A -24
E/A -16
C/B -9
D/B -21
E/B -13
D/C -12
E/C -4
E/D 8
The small x in our equation, after the first number, is where the rating gap for our pair goes. So the pair A/B would be…
A/B Rating Gap = 3
Formula is: (0.0279 x 3) + 0.5 = 0.58
That means horse A has a 0.58 probability of beating horse B. If you prefer to work in percentages you can multiple the result by 100 which shows that horse A has a 58% chance of beating horse B based on this rating.
If we do this for each of the paired horses we get the following results.
In my next article we’re going to look at how we combine the results of our equation into a single probability of each horse winning the race based on this single rating alone.
Work through this with some real-life examples of your own. You can download the spreadsheet I used in this below:
As always, if you’ve got any questions on this approach then please leave a comment below.
whats the point of all this nonsense when the handicapper does all the work before and after the race ?
Thanks for the comment Peter. The purpose is to uncover where the handicapper has got it wrong so that we can make a profit.
Michael,
You need to make your spreadsheets dynamic and add ‘lookup’ or ‘index’ formulas to reference names to your pairings otherwise it is just too time consuming. Users should be able to paste (or type) names into the sheet and the ratings should calculate automatically.
Have you compared a decent sample size (maybe 2 or 3 thousand) against Official Ratings using this process? Are there significant differences between the two? Do certain types of races perform better than others?
Regards
Chris
Thanks for the comments Chris. If I did that then I wouldn’t leave anything for you to investigate 😉 I use a very similar process in my own betting which are built against over 10,000 runners. As always, different race conditions do perform differently depending on the factors that you’re using.
Hi Michael
Thanks again for the work that you do, you come up with more and more ways to arrive at a winning selection.I don’t think there is anyone out there that gives this kind of information freely, keep up the good work.I visit your site every day to look at your innovative methods, sometimes people just have to experiment with those idea’s to make it fit for them.Over the years that I have been following the blog, I have learned new ways of using excel, that alone is priceless.Many thanks Michael.
Thank you very much Richard 🙂
Michael,
Can you just confirm something for me please – probably a stupid question but I find there the ones that need asking.
When you say ‘…and mark which of the horses finished ahead in the race’ are you talking about the theoretical or future race as illustrated by the ratings?
Cheers
Not a stupid question at all! This is done on your historical ratings so that you can build the equation you’re going to use on future races.
Sorry Michael,
I know I’m being thick but let’s say I have some ratings from Racing Dossier for a race today – say PFP for example:
A 1527
B 1525
C 1504
D 1502
E 1502
F 1493
G 1492
H 1485
So could I simply apply your working to those ratings?
And would it look like this
A/B 2 1 where 2 denotes the difference (1527-1525) and 1 shows that A beat (is a higher value) than B.
What you should do is take a sample of historic ratings for the race conditions you are interested in and then apply this process over a large sample of data, one race isn’t enough. Follow exactly as above for as many historic races as you can. You can do all the races at the same time though, so instead of 21 rows you may have a few hundred rows which relate to different races. When you do the graph and trend line this will now be based on a much larger sample and so your formula will be far more exact. Once you have the formula based on a larger sample of historic data you can then apply it to current races just by taking the gaps between the ratings for the horse in the race and applying the formula you made to them. Let me know if you have any more questions.
And what happens where ratings are the same as with D and E? Would you apply a 1 for a win or a 0 for a loss?
It depends on which horse one. In the case of D/E if horse D won then it would be a 1 and if horse D lost it would be a 0.
hi michael, when you insert the second article will cover how to use the rating obtained ???
I shall look art some ways you can use the rating obtained Yago 🙂
Hi Michael
is there anything out there that would do the original scores for you
I have over 4000 Flat handicap races but doing all the calcs would be very time consuming
There isn’t currently anything out there to do this but it should be possible to write a macro in Excel to process this for you. You can almost certainly find someone who can do that on one of the freelancing sites for a few pounds if you’re not into programming yourself 🙂
Hi Michael,
When are you going to follow up with the second part of this article so that we can see how to calculate odds from the pairwise probabilities.
Thanks
Rob
Hi Rob,
Thank you for the message. I’ve actually started writing it and it will be published in a couple of weeks 🙂
Interesting article. Would you suggest doing this for different race distances (or close groups of distances) seperately? IE, Handcaps over 5f+6f then handicaps over 7f+8f, or is it something you can just put all your flat handicap ratings into and get a workable result for all?
I would recommend you do this for close groups of distances, e.g. 5f and 6f. You can even segment down by race type, ground conditions and class as well. Generally the tighter the conditions the better the model, but make sure you have enough data for those conditions to make it valid.
Hi Michael,
What is the minimum number of races that you would say is enough data?
I like to use 100 winning horses in the sample as an absolute minimum