Bayes Theorem in Horse Racing – A Beginner’s Guide
We see many different types of betting strategies and systems discussed, including on this blog. But we rarely see any discussion of how the big bettors build their models and how betting teams develop their odds lines.
There are a couple of reasons for this. The first is that there are only a few people who actually know, and the second is that the amount of resources required to build something similar is far beyond the means of most people.
What we can do though is take what those with large resources are doing and scale it down to work with the resources that are available to us. That is what I’m going to do today.
As I write this I am not sure if it is going to be a single article or a series. It depends on how the rest of this article goes and more importantly the response to it. So let me know if you’ve enjoyed it and want to see more by leaving me a comment or asking a question below!
What we’re going to be looking at today is Bayes Theorem, and my goal is to explain it without using any technical jargon or difficult mathematics that you see in so many explanations. I’m sure you will let me know if I’ve succeeded at the end.
To start I think it is appropriate to explain exactly what this theorem is. That gives us an underlying knowledge of what it will help us to achieve and how we can use it in horse racing.
As with most statistics, this was not invented specifically to help us make more profit from betting at the races. It was developed as a statistical method of working out probabilities, and in particular of updating a probability as new information arrives. If you want a more detailed explanation then check out Wikipedia, but we do not need to know more for our purposes.
With this knowledge we can see that the purpose of using this in horse racing is to create a probability for each horse in a race using a repeatable method that does not rely in any form on gut instinct.
The way this method of creating a probability works is that it starts off with each horse being given the same chance of winning a race. In a ten horse race this would be a 10% chance of winning or 0.10 probability. In a 7 horse race each horse starts with roughly a 14% chance of winning or a 0.14 probability (1 divided by 7 is about 0.143).
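To make the starting point concrete, here is a minimal Python sketch of giving every runner the same chance. The function name is made up for illustration and is not part of any established tool.

```python
def equal_start_probabilities(num_runners):
    """Give every runner the same starting probability (1 / field size)."""
    if num_runners < 1:
        raise ValueError("a race needs at least one runner")
    return [1.0 / num_runners] * num_runners


# Starting probabilities for a ten-runner and a seven-runner field.
ten = equal_start_probabilities(10)
seven = equal_start_probabilities(7)
print(round(ten[0], 3), round(seven[0], 3))
```

Whatever the field size, the starting probabilities always add up to 1, which is what we want from an odds line.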
As more information is added to the model these probabilities increase or decrease according to what the information has told us about a runner.
It’s all fairly simple so far. The secret ingredient is how we add the information in a way that allows us to adjust the probabilities for each runner in line with what the information has told us.
We do this by using something known as likelihood ratios. I don’t count that as technical information because it is a name and it would be wrong of me to call it anything else in case you wanted to research this further 🙂
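For anyone who likes to see the mechanics, one common way to apply a likelihood ratio is the "odds form" of Bayes Theorem: turn the probability into odds, multiply by the ratio, and turn the result back into a probability. The sketch below assumes that approach. The function names and the ratio of 2.0 are purely illustrative, and rescaling the field so it sums to 1 is an extra step assumed here rather than something the theorem itself requires.

```python
def update_probability(prior, likelihood_ratio):
    """Adjust one runner's probability using a likelihood ratio (odds form)."""
    if not 0.0 < prior < 1.0:
        raise ValueError("prior must be strictly between 0 and 1")
    if likelihood_ratio < 0:
        raise ValueError("likelihood ratio cannot be negative")
    prior_odds = prior / (1.0 - prior)               # probability -> odds
    posterior_odds = prior_odds * likelihood_ratio   # apply the new information
    return posterior_odds / (1.0 + posterior_odds)   # odds -> probability


def normalise(probabilities):
    """Rescale a field's probabilities so that they sum to 1."""
    total = sum(probabilities)
    return [p / total for p in probabilities]


# Hypothetical ten-runner race: the first runner has a ratio of 2.0
# (the information favours it), the others have a neutral ratio of 1.0.
field = [0.10] * 10
ratios = [2.0] + [1.0] * 9
updated = normalise([update_probability(p, r) for p, r in zip(field, ratios)])
print([round(p, 3) for p in updated])
```

A ratio above 1 pushes a runner's probability up, a ratio below 1 pulls it down, and a ratio of exactly 1 leaves it where it was.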
Working out these ratios is very simple. We decide what information we are going to add, for example last time out winner, and take all the horses we have in our historic races and split them into two groups:
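- Those who were historic winners (in this example, horses who won last time out)
- Those who were not historic winners (horses who did not win last time out)
We then break each of these two groups down into two more groups, according to the result of the race they went on to run next: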
- Those who won their race
- Those who lost their race
You will now have four groups of numbers:
- Last time out winners who won their race
- Last time out winners who lost their race
- Horses who didn’t win last time out but won their race
- Horses who didn’t win last time out and lost their race
This is all the information that we need to create our likelihood ratios. And if you can get this information and have a normal calculator to hand, then you can create probabilities for horses in a race without any difficulty.
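As a rough sketch of how the four groups could be turned into likelihood ratios, the Python below uses the usual definition: the share of winners that had the factor divided by the share of losers that had it. The counts are invented purely for illustration and are not real racing data, and the function and parameter names are hypothetical.

```python
def likelihood_ratios(factor_won, factor_lost, no_factor_won, no_factor_lost):
    """Return (ratio for horses with the factor, ratio for horses without it)."""
    winners = factor_won + no_factor_won
    losers = factor_lost + no_factor_lost
    if winners == 0 or losers == 0 or factor_lost == 0 or no_factor_lost == 0:
        raise ValueError("each group needs data for the ratios to be defined")
    # How common the factor is among winners compared with among losers.
    ratio_with = (factor_won / winners) / (factor_lost / losers)
    # How common the absence of the factor is among winners versus losers.
    ratio_without = (no_factor_won / winners) / (no_factor_lost / losers)
    return ratio_with, ratio_without


# Made-up counts for the four groups (illustration only):
# last time out winners who won / lost, and other horses who won / lost.
with_factor, without_factor = likelihood_ratios(300, 1200, 700, 7800)
print(round(with_factor, 2), round(without_factor, 2))
```

With counts like these, the ratio for last time out winners comes out above 1 and the ratio for the rest comes out below 1, which is the pattern you would expect if the factor is genuinely useful.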
In the next part of this article, which I think is going to be necessary if this isn’t going to turn into a huge multipage document, I will show you how to perform these calculations on a simple calculator. Doing so is going to allow you to create odds lines and probabilities in the same way that big bettors and major teams do!
But you need to be prepared. Before the next part make sure that you have…
- Chosen which factors you want to use to create your probabilities and odds lines (no more than four or five to start)
- Got some historic data and broken it down as shown above so you’re ready for the next stage
If you would like to see me do a real sample next week with real data that you can put to use, then leave a comment below with your vote on which factors you would like us to use. Okay, I’m using this as a bit of a bribe to get you to leave me a comment, but I love to hear from you and if enough people leave me a comment then we will give you some real world data broken down for you to use in betting in the next part of this series.
Good morning Michael, always enjoy your articles and use your UKR news site. I am not a great fan of systems but I look forward to your sample next week with your preferred indicators. Rgds, Ray
Thanks Ray, I’m looking forward to putting it together. Maybe a couple of weeks until the next part of this series as I have to compile the data, but it’s coming!
In the old days Combayes did this very well – any attempt to recreate this would be welcome.
Thanks Peter, I don’t remember Combayes, was it a software program using Bayes specifically for Horse Racing?
interesting, look forward to the real sample next week
what are we going to do with our 100% odds line ?
Thanks David, may be a couple of weeks before the next part as it takes a while to get the right data and compile it.
>>what are we going to do with our 100% odds line ?
Use it to bet overlays and generate long-term profits. 🙂
Michael, you’ve got it, now I am hungry to know more about Bayes theorem and how to apply it to horse racing, thank you for the educational effort you are doing for us.
Thanks Oscar, more will follow shortly.
Hi Michael, you certainly know how to whet the appetite. What do you think to building a model that has no rating influence and solely relies on the conditions of a race to create probabilities? By the way I have enjoyed all your articles, especially the pace and power rating articles. Keep up the good work.
Thanks Richard, an interesting suggestion. A good way to use the conditions of the race is to use them to determine the profiles of horses which have a strong chance of winning.
Michael, be great to see this develop further and to improve on my success rate.
Thanks John, I shall be developing this further and let’s see if we can’t increase your success rate.
Michael
This looks promising. I have some factors I work with but I don’t really understand what the real chances are. Hope this helps
KRs
Sandy
Thanks Sandy, this is certainly going to help. Get to grips with the calculations and you will find it an easy process.
Hi Michael
Great start, and a good read as always.
Thanks
Paul
In some material you read, the number of factors can be up to 30 or more…but for the average person this is almost impossible to track…and it seems this depth is mainly for betting on the exotics (quinellas, trifectas etc)…so, I agree with you that maybe 4 or 5 factors should be enough. How about career win percentage…days since last start…runs at this track….
Career Win Percentage is one of my favourites Duncan, days since start is also good but I prefer days since last good start. Runs at the track I haven’t found to be as important a factor, however good runs at this track is ;(
Michael,
Interesting stuff, potentially, but I think your post needs slightly more clarity in your example. Specifically, the categories – “Last time out winners who won their race” … er … which race? Am I missing something? (Has been known!)
Cheers Pete
Thanks Pete, there will be more details in the next part.
Would be interested to see more. Factors like course and distance winners, winners on the going please
Thanks Nic, shall let you know when it is ready 🙂
Keep up the good work, everytime I read one of your posts my own betting is brought back in check. I don’t bet often & when I do it’s for very small stakes but I do like the thrill of pulling out nuggets & your insight always helps. Please post your sample data
Thanks Patricia, I am working on getting the data at the moment and shall be writing the next part explaining it as soon as it is ready.
I sense the basis of another system! Tell me more – soon please. 🙂
Absolutely Keith, need to get the data compiled and then shall share more!
Don’t know where this’ll lead us, Michael, but it’s got the old neurones buzzing, which at my age is a damn good thing.
Thanks Jack, pleased to hear the neurones are buzzing, that’s my aim!
Could you give an example, because the system sounds very interesting but a bit complicated.
Thanks a lot Don.
P.S. There seem to be a lot of systems out there and I have been taken in by a lot of crooks, that’s why I like your articles.
Thanks Don, it’s not too complex but you may need to go through it a couple of times before you get into the routine of doing it. The best thing is to get to grips with the calculations, even make a spreadsheet to do them for you, and then it will be a piece of cake. And you always know where I am if you have any questions.
Excellent post. Very clear description. Like to see this applied. In terms of factors recent form is one I always consider critical. I would be interested in how you take factors and turn them into likelihood ratios. Hope this topic keeps going.
Thanks Simon, I will be writing the next part as soon as I’ve got the data compiled.
Maybe you could use CD winners in the factors.
Thanks Graham, that is certainly a factor worth considering.
Hi Michael. I always enjoy reading correspondence from you, even though I’ve been obsessed with horse racing for over 60 yrs. One factor I’ve always employed is a stable’s level of recent form. This can be vital. Cheers Terry.
Thanks for the comment Terry, it’s great to hear from you. Recent form is very important, it’s also always good to keep an ear to the ground in case a stable gets a virus etc…
Great research, keep it coming
Thanks John, more coming soon 🙂
is there any mileage in using trainers only runner at the meeting?
That’s something that is interesting Ian. I’ve never found it to be much use on its own. If it is possible to combine it with distance travelled and race patterns then it can become useful but in my experience on its own it has very little impact.
Hi Michael, I’m one of those strange people who love mathematics but I seem to have fallen at the first fence.
Which race has the last time out winner lost? If you could clear up that little mystery I’d be eager to learn more.
Cheers, Paul
Thanks for the comment Paul, I apologise for my confusing terminology. What I mean by this is that we have a sample of horses who won in their last race. Some will win the race they are about to run in and some will lose. I hope that clears it up 🙂
I would like to see this worked out.
best regards.
Gary
Great to hear from you Gary. I shall let you know when the next part of the series is ready!
Give us some more
Looking forward to the next bit of info
Many thanks Michael
Michael
Do you have any of the above info on Dogs (Greyhound Racing)
if so i would like your take on how you assess the doggies
MHols
KEEP UP THE GREAT WORK
Thanks MHols. The next part is going to be awesome, getting the data together at the moment. I’ve never really focused on dogs because there isn’t enough liquidity in the market for me. I know a man who does though and as I understand it the key is assessing who is going to be at the front on the first bend, this is the dog that wins a significant proportion of the time and looking to be able to accurately assess this would be my starting point if I were going to analyse dog racing.
I’m hopeful that this idea of yours may be a first from these emails. All any punter who reads these things wants to know is exactly what a pro punter REALLY DOES, not a round-the-houses explanation or theory that never really tells us anything. For example, if I have a leg injury and get it checked out, I want to know what course of treatment would be available to the rich, like David Beckham, not just what is available on the NHS. I can then make a decision on whether I can afford to have the best treatment. Or, in punting terms, afford to set myself up with the technology needed to copy or follow in the footsteps of the best.
Hope I’m not just deluded.
Steve
Thanks Steve, a great post. It’s also important to remember that personalities are very important. What works for one person may not work for another because of personal risk levels, stake sizes, time available etc… The best way to approach betting is to soak up as much information as possible. Choose a small set of race conditions that you enjoy the most and then focus on using the information you have gained to develop your own approach. This approach will be tailored to you specifically, once you have a bare bones you can ask for advice and most pros will be able to tell you how they would adapt or investigate your strategy to make it better or bring it into profit.
Very interesting theory. Would very much like to see your example based on this theory.
Peter matthews
Thanks Peter, the second part will be coming soon!
the term historic , does this mean course and distance winner, just distance winner or just course winner , does it mean in the horses career or just this season or last season or any season the horse has run in ?? cheers thomas
Good question Thomas, and it can mean any of them. I am compiling the data for my example at the moment so I haven’t decided which factors I will use but using this method you can use any factor you feel is important or investigate a lot and choose the best.
Hi Michael. I don’t know any thing about Bayes Theorem and not much good at maths having left school at 14. Is all of this necessary for finding the winner? I am always looking for and reading your articles and might say that I have learnt a little from them, but at my age not a lot sinks in. However I look forward to an example and any information that will make me on the best side from the bookie.
Thank you for your past writings
Thanks for the comment Jack. Is it necessary to do this to find the winner? No. It is just one approach of many. If you are a good form reader then that is all you need. Personally I use a more mathematical approach because it removes the emotion from the betting and the work is done in the weeks and months before a race rather than on the day of racing itself. That makes it less stressful for me. But that is certainly not the only way to do things.
I’m confused. How can a last-time out winner have lost his race?
Paul asked the same thing Charly, here is what I wrote…
I apologise for my confusing terminology. What I mean by this is that we have a sample of horses who won in their last race. Some will win the race they are about to run in and some will lose. I hope that clears it up 🙂
Hi,
would like to see going/distance/draw/class/speed
Thanks Doug, can’t guarantee what will be in it yet but I shall certainly put them into the pot to be considered 🙂
I am sure we shall get some of those factors in there Doug
Michael
What about using won at distance and won on going?
Regards
AP
Hi Anthony, I hope you’re well. Thanks for the comment. They can both be quite effective if used with the correct information, I shall look at the possibility of putting them into the sample.
Michael
I am always willing to try anything that will further my Racing knowledge
so would be happy for you to illustrate the technical side of the calculations.
Thanks
Graham
No problem Graham, next part coming up soon!
Good recent form is always my favourite starting point so finishing position LTO and days since last run would be my suggestions for analysis.
Thanks Brian, I shall look at the possibility of using them 🙂
Good post for whetting the appetite for more on the Theorem!
Key factors for me are recent form, race class/value, ground conditions and distance.
I would like to hear more of this. Where can I see the other 66 comments? cant seem to access them? thanks alan
Thanks Alan. You should be able to see the other comments below the article where this is.
Thanks Michael, didn’t realise the comments came up when I posted my comment. Doh!!
Hi Michael,
Having delved into your website I was truly interested in the “Bayes Theorem”.
Did you follow up with the whole piece on this method? If you did is there a link to the complete article?
Warmest Regards
Joseph
Thanks Joseph, I haven’t written the follow up yet but am working on it. If you’re on our mailing list then I’ll let you know when it’s ready 🙂
Great information, and a super way to start your own morning lines. Too many people are afraid of the math involved, but the more you use it the more it becomes second nature to you. Some days all I use is a pencil and the posted odds, it allows the freedom of observing what else the track has to offer and the opportunity to work on the ‘visual’ aspects of racing.
I would want to factor in previous winners and placed horses on current going
Very interesting and looking forward to the next chapter, it would also be very useful to see you use the formula live. I use Sire, Trainer, Jockey, speed and going.
Hi michael,
this looks very interesting.
The factors i would look at would be distance.
Hi Michael, I think this could be very interesting. I think the main factor would be the going in any calculations.
Thanks Mark and Mick, I will consider both of those factors 🙂
Sounds an interesting way to get the probability ratios as accurate as possible. I would add horses who won last time over the distance and/or course. I also don’t know if it would be possible, but perhaps the time of the last race over course and distance. Finally, either a spreadsheet or software to help work it all out.
Hi Michael,
Good Read………..
how about using days since last run say 10-50 days or LTO 1st to 6th.
Can these be used?
We could use both, why that specific timeframe? LTO is something we are definitely considering.
Hi Michael,
Try as I might but I just can’t understand those four statements above!
1. Last time out winners who won their race
2. Last time out winners who LOST their race
3. Horses who DIDN’T win LTO but WON their race
4. Horses who didn’t win LTO and LOST their race
Please Michael, how am I supposed to understand?
I do enjoy reading your work, but hey please explain what you are getting at here.
I also know that you have a sore back and I wish you a speedy recovery.
Kind regards
Ramsey
Thanks for your comment Ramsey. I’ve been asked this a few times, sorry to not write it clearly 🙂
What I am talking about is whether a horse who won last time out goes on to win its next race, which would be the race we are looking for winners in. I hope that clears it up.
This is something I have been working on as well – but slightly different criteria. What about adding into your calculations the following – Place of Race meeting, how often Favourites win at that particular race meeting, The most common distance that Favourites actually Win or Lose?
With these additional factors perhaps the likelihood of a Favourite winning or losing might be determined.
Just my thoughts
Ian
Good thoughts Ian and they can also be used effectively. It is important not to use too many factors in each individual model however otherwise we get too much correlation. If we are going to do that we need to combine similar factors together first.
I enjoyed the article and I look forward to more of the same. It is clear that I do not have a clue about the racing game in any depth.
Thanks Mike
How do you actually see the comments?
It says 86 on this topic but shows none.
You need to have an approved comment to see the other comments. This is something we are changing in the new design of the site but you should be able to see them now 🙂
Hi Michael.
I think this could be very exciting indeed, I think we have to think differently in terms of factors used.
Factors You can consider:
Average speed over that race distance, So check the horse for times it has run over that distance and then do distance/Time of the race.
The amount of prize money won in the current season.
The overall winning percentage of the horse in its lifetime
etc….
Thanks Jordan. As you say it is always a good idea to think outside the box for the factors that we’re going to use. I shall take a look into them.
Hi
Have just noticed your latest article on Bayes Theorem and read the two articles written.
Last year I devised a system using Bayes Theorem and had a modicum of success. The reason I used this method was because like you I know you cannot use form as a basis for selecting winning selections. Horses, like humans have their good days and their bad.
Exclude form and what are you left with? Statistics! The use of probability theories was the path I also followed.
Statistics are gained from past performances. I have statistics relating to the performance of favourites in nearly every race over the last 8 years. (68000+)
My theorem was calculated on four statistics gleaned from those 8 years worth of data. Namely, percentage of favourite wins over the eight years at the course. Percentage of favourite wins for the number of the race (first,second, etc). Number of runners in the race. Percentage of favourite winners for the number of horses in the race. No calculator for me I develop and write computer programs. To run this system I downloaded race cards from the internet. Ran them through my ‘Bayes’ program and printed the results out. Ten to fifteen minutes a day.
As I am more interested in developing systems I got bored with it and moved on to other pastures.
One of your respondents asked about number of factors you can take into account when assessing selections. I once wrote a system using data produced by Adrian Massey (regrettably not available now) which had around 100 factors which assessed the form of a horse from its first race to its last. This was a really successful system which was cut off in its prime. I had one or two 100/1 winners and several classic winners.
Thanks for the comment and info.
Is there anything further on this Michael. I well remember Combayes, I think it was one of the first computer racing programs in existence. I’m pretty sure I bought it at the time.
But I have my own factors which are reasonably successful, and I would like to be able to neaten them up with some mathematical calculation. I’m sort of addicted to them.
I haven’t got anything further planned at the moment but may look at more in the future. Is there something particular that you were interested in regarding neatening your factors?
Are there any programs out there using Bayes Theorem today?
There are programs that use Bayes Theorem, but none I know of specifically for horse racing.
Thanks, sam