# How To Choose Factors That Predict Winners

When you start to analyse a race do you know what factors you should be using?

Do you know which are the most important for the race you’re analysing?

Not all races are made equal, which means every race is likely to have different factors that are important.

If you think about factors, you can probably only think of a handful at first. Some of those may be speed, pace, class, form and connections.

But they’re not really factors.

They’re categories.

The factors are what fall into each of those categories. If we take speed, then some of those factors may be…

Recent Speed

Average Speed

Best Speed Over Going Conditions

Best Speed Over Distance

Average Speed Over Going Conditions

Average Speed Over Distance

And even these can be broken down into more factors. Taking Recent Speed we could have several definitions of what that means, such as…

Recent Speed Over Last 7 Days

Recent Speed Over Last 14 Days

Recent Speed Over Last 30 Days

Recent Speed Over Last 60 Days

Recent Speed Over Last 90 Days

And even then we could have several definitions of what “Recent Speed” actually means. We know it means how fast the horse has been running recently, but how do we define that? It could be defined as the fastest the horse has run over the timeframe, or the most common speed weighted towards the most recent race, or at least another two or three variations.
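
To make that concrete, here is a minimal Python sketch of two possible definitions of Recent Speed over the timeframes listed above. The run history, the function names and the linear recency weighting are all assumptions made for illustration, and the weighted average is only one way of reading "weighted towards the most recent race".

```python
from datetime import date

# Hypothetical run history for one horse: (race date, speed figure).
runs = [
    (date(2024, 5, 1), 78),
    (date(2024, 5, 20), 84),
    (date(2024, 6, 10), 81),
]


def runs_in_window(runs, today, days):
    """Keep only the runs from the last `days` days."""
    return [(d, s) for d, s in runs if 0 <= (today - d).days <= days]


def best_recent_speed(runs, today, days):
    """Definition 1: the fastest figure recorded in the window."""
    window = runs_in_window(runs, today, days)
    return max(s for _, s in window) if window else None


def weighted_recent_speed(runs, today, days):
    """Definition 2: average figure, weighted towards the most recent run."""
    window = runs_in_window(runs, today, days)
    if not window:
        return None
    # Assumed weighting: falls linearly from 1 (run today) towards 0.
    weights = [1 - (today - d).days / (days + 1) for d, _ in window]
    total = sum(w * s for w, (_, s) in zip(weights, window))
    return total / sum(weights)


today = date(2024, 6, 15)
for days in (7, 14, 30, 60, 90):
    print(days, best_recent_speed(runs, today, days),
          weighted_recent_speed(runs, today, days))
```

Each timeframe and each definition produces a separate factor, which is why the number of factors grows so quickly.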

What we’re left with now is a huge number of factors.

When we analyse a race we want to try and get the number of factors we’re using to around 10 and certainly no more than 20.

So how do you determine which are the best to use for the specific race conditions you’re analysing?

You could simply use your gut instinct. And there’s definitely nothing wrong with that. It will be quick and the more you do it the better you will get at it. However, there is a learning curve and it will take time to implement it every day. If you rush then you’re likely to make mistakes.

But today I want to share with you the approach that I use to find the factors I’m going to use to analyse a race.

I start by considering the race conditions. I look at the:

Race Type

Course

Number of Runners

Distance

Class

Going

With that information I go and find a range of past races that match similar conditions. If there are too many to choose from then I will add the classifications for the race into my filter. If there aren’t enough I will look at races over a slightly longer/shorter distance and with slightly more/fewer runners.
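
As a rough sketch of that search, the Python below matches past races on the six conditions and then widens the distance and runner limits step by step when too few races are found. The record layout, the field names and the tolerance values are assumptions for illustration, and it only covers the "not enough races" case.

```python
# Hypothetical past-race records; the field names are illustrative only.
past_races = [
    {"id": 1, "race_type": "Handicap", "course": "Ascot", "runners": 12,
     "distance_f": 8, "race_class": 3, "going": "Good"},
    {"id": 2, "race_type": "Handicap", "course": "Ascot", "runners": 12,
     "distance_f": 9, "race_class": 3, "going": "Good"},
    {"id": 3, "race_type": "Handicap", "course": "Ascot", "runners": 14,
     "distance_f": 8, "race_class": 3, "going": "Good"},
    {"id": 4, "race_type": "Handicap", "course": "York", "runners": 12,
     "distance_f": 8, "race_class": 3, "going": "Good"},
]

target = {"race_type": "Handicap", "course": "Ascot", "runners": 12,
          "distance_f": 8, "race_class": 3, "going": "Good"}


def matches(race, target, dist_tol=0, runner_tol=0):
    """True if the race matches the target within the given tolerances."""
    return (
        race["race_type"] == target["race_type"]
        and race["course"] == target["course"]
        and race["race_class"] == target["race_class"]
        and race["going"] == target["going"]
        and abs(race["distance_f"] - target["distance_f"]) <= dist_tol
        and abs(race["runners"] - target["runners"]) <= runner_tol
    )


def find_similar(past_races, target, min_races=10,
                 max_dist_tol=2, max_runner_tol=4):
    """Widen distance and runner tolerances until enough races match."""
    found = []
    for step in range(max(max_dist_tol, max_runner_tol) + 1):
        dist_tol = min(step, max_dist_tol)
        runner_tol = min(step, max_runner_tol)
        found = [r for r in past_races
                 if matches(r, target, dist_tol, runner_tol)]
        if len(found) >= min_races:
            break
    return found


similar = find_similar(past_races, target, min_races=2)
print([r["id"] for r in similar])
```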

As a general rule I like to get as much data as possible, but if you’re doing it manually then you should have at least ten races.

Once I’ve got those races I then look for the horses that contended. These are the horses that performed well, not just the ones that won the race.

I always prefer to use contenders rather than winners in analysis because whether a horse won the race or not can come down to luck. If they get blocked at the finish, then they lose. If they get bumped… they lose. But that doesn’t mean they didn’t run a strong race, or that they wouldn’t have won if that hadn’t happened.
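
How you decide that a horse "contended" is up to you. As one possible sketch, the Python below treats a horse as a contender if it won or finished within a set number of lengths of the winner. The two-length cut-off and the result layout are assumptions for illustration only.

```python
# Hypothetical result rows: (horse, finishing position, lengths behind winner).
result = [
    ("Alpha", 1, 0.0),
    ("Bravo", 2, 0.5),
    ("Charlie", 3, 1.75),
    ("Delta", 4, 6.0),
]


def contenders(result, max_lengths=2.0):
    """Horses that won or finished within `max_lengths` of the winner."""
    return [horse for horse, position, beaten in result
            if position == 1 or beaten <= max_lengths]


print(contenders(result))
```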

Once I’ve separated out those horses that contended in each of the races, it’s time to look at all the factors you have for them and determine the strongest.

And how do you do that?

Start by breaking down your factors into twelve “buckets”.

If your factor is a ranking factor that lists horses from best to worst, with 1 being the best, then the buckets would simply follow the rank positions, starting at 1.

This is the easiest approach and why I would suggest you start by using only factors which score horses in the field from best to worst. Your twelve “buckets” would then be…

<=1

2

3

4

5

6

7

8

9

10

>=11

Horses With No Rating

Never forget to include the horses with no ratings. They can give as much information as the horses that do have a rating.
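
The twelve buckets above can be sketched in Python as follows. The function name and the sample ranks are made up for illustration, with `None` standing for a horse that has no rating.

```python
from collections import Counter

# The twelve bucket labels, in the order listed above.
BUCKETS = ["<=1"] + [str(n) for n in range(2, 11)] + [">=11", "No Rating"]


def rank_bucket(rank):
    """Map a horse's rank on one factor to its bucket label."""
    if rank is None:
        return "No Rating"
    if rank <= 1:
        return "<=1"
    if rank >= 11:
        return ">=11"
    return str(rank)


# Hypothetical ranks of the contenders on one factor (None = no rating).
contender_ranks = [1, 1, 3, 2, None, 12, 1, 4]

counts = Counter(rank_bucket(r) for r in contender_ranks)
for bucket in BUCKETS:
    print(bucket, counts[bucket])
```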

To determine the most important factors to use in the race you now calculate a Chi Score for each of those buckets.

The result of this will tell you which of your factors are likely to have the most impact on predicting the winner of the race!
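
The calculation itself isn't spelled out in this post, so the Python below is only a sketch that assumes the standard Pearson chi-square form, (observed - expected)² / expected, applied to the contender counts in each bucket. The counts are invented, and the expected figure assumes contenders would be spread across buckets in proportion to the number of runners in each.

```python
def chi_scores(runners, contenders):
    """Per-bucket chi-square contribution for one factor.

    runners:    bucket label -> number of runners that fell in the bucket
    contenders: bucket label -> number of those runners that contended
    """
    total_runners = sum(runners.values())
    total_contenders = sum(contenders.values())
    scores = {}
    for bucket, n in runners.items():
        # Contenders we would expect if the factor had no effect at all.
        expected = n * total_contenders / total_runners
        observed = contenders.get(bucket, 0)
        if expected:
            scores[bucket] = (observed - expected) ** 2 / expected
        else:
            scores[bucket] = 0.0
    return scores


# Hypothetical counts for one factor (only four buckets, for brevity).
runners = {"<=1": 10, "2": 10, "3": 10, "No Rating": 10}
contenders = {"<=1": 6, "2": 3, "3": 2, "No Rating": 1}

scores = chi_scores(runners, contenders)
for bucket, score in scores.items():
    print(bucket, round(score, 2))
print("factor total:", round(sum(scores.values()), 2))
```

A larger total suggests the contenders are not spread evenly across the buckets, which is the kind of signal you would compare between factors.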

If you’d like to know how to perform the Chi Score calculation then leave me a comment and if there’s enough interest I’ll write it up in next week’s post.

I would love to know how to calculate the Chi score, and thanks again for another interesting topic

Thanks Richard

Yes always interested in other ways Michael, so would be interested in how to calculate the Chi Score.

Thank you Wendy 🙂

yep bring it on Michael, thanx

It would be very interesting to see. Thanks

Think Michael covered this here http://www.cyobs.co.uk/using-chi-square.pdf

You’re absolutely right, I did 🙂 I will rewrite it in case it needs bringing up to date.

Interesting stuff again Michael – thanks.

Unfortunately (for you) it highlights the labour intensive element which is identifying and analysing past similar races. Unfortunate for you because it prompts me to ask how you can facilitate the ‘match similar & analyse’ function within the ratings. Many of us simply don’t have the time to carry out this work within a reasonable time frame across a sufficient number of races (many races will reveal no meaningful pattern at all)

For what it’s worth, and with the above in mind, I apply the same few criteria to all races, manipulate the ratings to produce a top three for each race. I’m currently achieving around 60% (variance of +/- 10%) strike rate for the top three with a LS loss of around 10% at SP. I suspect that with pattern matching and using Betfair I could turn this into worthwhile profit. I’d be interested to know your thoughts.

As you say Keith, it is labour intensive. I will, however, look at a way of being able to do it and reduce that labour. Interesting that you’re already using it. At a 10% loss to SP you would probably be near break even at Betfair SP or BOG, which is a great start. Using pattern matching you should be able to increase your strike rate to 70%+ with top three 😉

And is it possible for you to show a working example of the factors/buckets please?

Thanks

Absolutely. So let’s say we’re looking at Factor 1 and the ratings go from 0-100 with no decimals. The buckets would be 0-10, 11-20, 21-30, 31-40 etc…

You would repeat this process with factor 2 etc… ideally you want to break your buckets so that there are a similar number of runners in each one. That means if a rating goes from 0-100 but most fall between 40-60 you would have bigger ranges outside of that and smaller inside of that to try and even out the number of runners in each bucket.
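
As a sketch of evening out the buckets, the Python below picks the bucket edges from the sorted ratings so that each bucket holds roughly the same number of runners. The sample ratings, the function names and the choice of four buckets are assumptions for illustration.

```python
from bisect import bisect_left
from collections import Counter


def equal_count_edges(ratings, n_buckets):
    """Inclusive upper edges that split ratings into similar-sized buckets."""
    ordered = sorted(ratings)
    edges = []
    for i in range(1, n_buckets):
        cut = i * len(ordered) // n_buckets
        edges.append(ordered[cut - 1])
    # Heavily tied ratings can repeat an edge, so drop any duplicates.
    return sorted(set(edges))


def bucket_index(rating, edges):
    """Index of the first bucket whose upper edge is >= rating."""
    return bisect_left(edges, rating)


# Hypothetical ratings on a 0-100 scale, bunched between 40 and 60.
ratings = [5, 22, 41, 44, 47, 50, 52, 55, 58, 60, 78, 95]

edges = equal_count_edges(ratings, 4)
sizes = Counter(bucket_index(r, edges) for r in ratings)
print(edges)
print(sorted(sizes.items()))
```

With these sample ratings the outer buckets cover wide ranges and the middle ones narrow ranges, which is the effect described above.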

Yes always interested
