Quantifying uncertainty in sports fixtures
For rugby fans the long wait is nearly over: like Christmas, the Six Nations comes once a year to lift our spirits in the cold winter months. If you’re not very familiar with rugby, the Six Nations is an annual tournament where the top national sides in Europe (England, France, Ireland, Italy, Scotland, Wales) each play five fixtures, alternating who plays at home or away each year. All teams compete to win, but the most coveted prize is a ‘Grandslam’, where a team wins all 5 of their fixtures. Given how competitive the tournament is, a Grandslam is reasonably rare: since the tournament was expanded to six sides in 2000 there have been only 13 Grandslams in a possible 25 tournaments.
This year, in the 2025 tournament, Ireland come into the competition chasing a third consecutive title, with stiff competition from France, whose domestic league (the Top 14) has been electric this year in the European Champions Cup.
With that in mind, and given that roughly half of tournaments have led to a Grandslam, how likely is a Grandslam in 2025? In this short article we’ll explore how we can use previous fixture results and other information to make a best guess at how likely a Grandslam is. We’ll be focusing on linear models, and we’ll explore this from both the Frequentist and Bayesian perspectives. The models are built using scikit-learn and the Bayesian modelling library Bambi (which is built on top of the excellent PyMC framework).
Read on to understand how and why I estimate the likelihood of a Six Nations Grandslam to be around 30–40% in 2025.
In the age of AI people are increasingly used to mapping inputs to outputs with highly accurate predictions. Whether this is using LLMs to generate natural language responses, Computer Vision models to tag images or even AutoML to make predictions on tabular datasets, it is increasingly taken for granted that these models just work.
Despite this, the relationship between inputs and outputs naturally involves a level of uncertainty, and when you are working with small or noisy datasets, as you often are in sports, it is important to attach an estimate of uncertainty to your predictions. For example, in the opening fixture of the 2025 Six Nations France host Wales: we may predict that France will win, but how confident are we about this?
The dataset used for this analysis is sourced from publicly available resources, such as Wikipedia. The challenge with predicting 2025 fixture results is that the out-of-sample predictions are based on panel data, and team form generally fluctuates across the years as squads and managers change.
In our publicly sourced data we gather stats from 2020–2024 including:
- The age profile of squads
- The experience of squads (i.e. number of international caps)
- The number of distinct club sides that make up a national squad
- Previous table position
- Previous fixture result
- Whether there is a change of coach since the previous tournament
The data preparation here is done using Pandas. Figure 1 shows how we merge the data on a fixture level basis, incorporating information about the squad for each year of the tournament. Looking at this we can see that in 2025:
- Ireland have the oldest squad with a proportionally high number of caps on average. This tells us that the squad is highly established and, since Irish rugby is provincial, the squad is drawn from only four sides. Given the age profile of the side and that they have a new coach for this tournament, it is uncertain whether they are at or near their ‘peak’ as a squad
- France have one of the youngest squads on average and, on average, the lowest number of caps. Despite this they have been performing exceptionally well, and came second in the 2024 tournament suggesting their squad is on the rise
- England have the second youngest squad, but proportionally more caps on average suggesting they are trying to balance youth with experience in the 2025 tournament
- Scotland have the second oldest and one of the most capped squads in the tournament. They have an established side and, arguably, underperformed in 2024 where they came in fourth place. Their side may be nearing its peak before they go through a period of rebuilding
- Italy are in a similar position to Scotland in terms of average number of caps, but with a slightly younger age profile. There have been a number of changes in management over the years, but they come into the competition this year with an established squad and the same coach. They might surprise people this year
- Wales are in a period of rebuilding and have a young and inexperienced squad and underperformed in the 2024 tournament where they came in last place
Since we’re using linear methods to predict results, I created a binary flag for whether or not the home side won the fixture, and for each fixture we’ll predict the probabilities of the home side winning (i.e. yes/no). The probability of not winning at home is, implicitly, the same as predicting that the away side win.
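The fixture-level merge and the binary target described above can be sketched in Pandas roughly as follows. This is a minimal illustration rather than the article's actual code: the column names (`home_points`, `avg_age`, `avg_caps`), the placeholder team names and all of the values are assumptions made up for the example.

```python
import pandas as pd

# Hypothetical fixture results, one row per fixture (placeholder values).
fixtures = pd.DataFrame({
    "year": [2024, 2024],
    "home": ["TeamA", "TeamC"],
    "away": ["TeamB", "TeamD"],
    "home_points": [20, 15],
    "away_points": [10, 18],
})

# Hypothetical squad statistics, one row per team per year (placeholder values).
squads = pd.DataFrame({
    "year": [2024] * 4,
    "team": ["TeamA", "TeamB", "TeamC", "TeamD"],
    "avg_age": [26.0, 28.0, 25.0, 27.0],
    "avg_caps": [24.0, 40.0, 20.0, 33.0],
})


def add_side(df, side):
    """Attach squad stats for the 'home' or 'away' side, prefixing the stat columns."""
    stats = squads.rename(columns={
        "team": side,
        "avg_age": f"{side}_avg_age",
        "avg_caps": f"{side}_avg_caps",
    })
    return df.merge(stats, on=["year", side], how="left")


model_df = add_side(add_side(fixtures, "home"), "away")

# Binary target: 1 if the home side won, 0 otherwise (a draw would also be 0 here).
model_df["home_win"] = (model_df["home_points"] > model_df["away_points"]).astype(int)
```

Merging the squad table twice, once keyed on the home side and once on the away side, keeps one row per fixture while carrying both squads' profiles.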
Before building a predictive model, it is important to do some exploratory analysis. Figure 2 shows the correlation plot for the features.
As you might expect, where you finished last year is highly correlated to winning this year. Likewise, your squad profile is highly correlated with winning. Having a change of coach is correlated, but not as strongly — though this may be because there are proportionally fewer instances where this happens between tournaments.
An important consideration here is whether there is correlation amongst the inputs (features) of the model, since multicollinearity can negatively impact model reliability. We can see here that there is a strong correlation between age and number of caps; this is intuitive since older players will (on average) have more caps. To accommodate this we replace these inputs with a composite feature which represents the ratio of caps to age. We also remove a few of the less correlated inputs from the model, since a smaller set of inputs often helps to avoid overfitting.
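As a rough sketch of this step, the correlation between two inputs can be checked and the pair swapped for a single ratio feature. The column names and the numbers below are invented for illustration and are not taken from the article's dataset.

```python
import pandas as pd

# Placeholder squad profiles (made-up values for illustration only).
squads = pd.DataFrame({
    "avg_age": [25.0, 26.0, 27.0, 28.0, 29.0],
    "avg_caps": [15.0, 22.0, 27.0, 36.0, 40.0],
})

# Pearson correlation between the two candidate inputs.
corr = squads["avg_age"].corr(squads["avg_caps"])

# Replace the correlated pair with one composite feature: caps relative to age.
squads["caps_to_age"] = squads["avg_caps"] / squads["avg_age"]
features = squads.drop(columns=["avg_age", "avg_caps"])
```

A ratio is only one way to combine the two inputs; dropping one of them or using a regularised model would be alternative choices.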
Once we have identified the features of our model we can prepare the data for training. Since this is a panel data problem we split the data as below.
Model Validation: We start by validating the model and getting an estimate of out-of-sample accuracy. To do this we back-test on previous tournaments
- Train dataset — fixture results 2020–2023
- Test dataset — fixture results in 2024 tournament
Model Predictions: We can create our predictive model for 2025 for out-of-sample predictions as
- Train dataset — fixture results from 2020–2024
- Prediction dataset — upcoming fixtures for 2025
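The two splits above could be expressed with a small helper along these lines. The `year` column name and the toy data are assumptions for the sake of the example.

```python
import pandas as pd


def split_by_year(df, train_years, holdout_year, year_col="year"):
    """Split panel data by tournament year rather than at random."""
    train = df[df[year_col].isin(list(train_years))]
    holdout = df[df[year_col] == holdout_year]
    return train, holdout


# Toy data: one placeholder row per year, including unlabelled 2025 fixtures.
fixtures = pd.DataFrame({
    "year": [2020, 2021, 2022, 2023, 2024, 2025],
    "home_win": [1, 0, 1, 1, 0, None],
})

# Model validation: train on 2020-2023, back-test on 2024.
train_bt, test_bt = split_by_year(fixtures, range(2020, 2024), 2024)

# Model predictions: train on 2020-2024, predict the 2025 fixtures.
train_final, predict_2025 = split_by_year(fixtures, range(2020, 2025), 2025)
```

Splitting on year keeps every fixture from the held-out tournament out of training, which a random split would not guarantee.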
We prepare the dataset for modelling using:
- One-hot encoding for fixtures
- MinMax scaling for numeric features
It is important to fit the scaler on the training data only, and then apply that fitted scaler to the test or prediction data, to mitigate the risk of data leakage.
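A minimal sketch of this preparation with scikit-learn is shown below, assuming hypothetical columns `home`, `away` and `caps_to_age` with placeholder values. The transformers are fitted on the training rows only and then reused on the held-out rows, which is the usual way to keep information from the held-out data out of the preprocessing.

```python
import pandas as pd
from sklearn.compose import ColumnTransformer
from sklearn.preprocessing import MinMaxScaler, OneHotEncoder

# Placeholder training and held-out fixtures (illustrative values only).
train = pd.DataFrame({
    "home": ["A", "B", "C"],
    "away": ["B", "C", "A"],
    "caps_to_age": [0.5, 1.0, 1.5],
})
test = pd.DataFrame({"home": ["A"], "away": ["C"], "caps_to_age": [2.0]})

preprocess = ColumnTransformer(
    [
        # One-hot encode the teams in each fixture; ignore unseen teams at predict time.
        ("teams", OneHotEncoder(handle_unknown="ignore"), ["home", "away"]),
        # Scale numeric features using the range seen in training.
        ("numeric", MinMaxScaler(), ["caps_to_age"]),
    ],
    sparse_threshold=0,  # always return a dense array
)

X_train = preprocess.fit_transform(train)  # fit on training data only
X_test = preprocess.transform(test)        # reuse the fitted transformers
```

Because the scaler only knows the training range, a held-out value outside that range can fall outside [0, 1], as the last column of `X_test` does here.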
We can create our Frequentist model using scikit-learn’s Logistic Regression classifier. Figure 3 shows the Confusion Matrix for the back-test, where the model is trained on the 2020–2023 fixtures and evaluated on the 2024 fixtures.
In Figure 3 we can see that the accuracy of the model is around 73%. You may be wondering why there is a total of 30 fixtures for the 2024 predictions when there are only 15 fixtures each tournament. The reason is that, in order to improve model accuracy, we stack the data so that we get a Home and an Away result for each fixture. We do this because sides only play each other once per year and swap home and away each tournament. We, as humans, understand that France v Wales is the same match-up as Wales v France, but the model cannot directly understand this. To stack the data we swap home and away, and then invert the binary flag for home win, preserving the integrity of the data.
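For readers who want to see the shape of this step, here is a hedged sketch of fitting a Logistic Regression and producing a confusion matrix for a held-out set. It uses synthetic data generated with NumPy in place of the real fixtures, so the resulting numbers have no bearing on the article's results.

```python
import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import accuracy_score, confusion_matrix

# Synthetic stand-in data: 3 numeric features, binary "home win" target.
rng = np.random.default_rng(0)
X_train = rng.normal(size=(120, 3))
y_train = (X_train[:, 0] + 0.5 * rng.normal(size=120) > 0).astype(int)
X_test = rng.normal(size=(30, 3))
y_test = (X_test[:, 0] + 0.5 * rng.normal(size=30) > 0).astype(int)

# Fit on the training set, then evaluate on the held-out set.
clf = LogisticRegression().fit(X_train, y_train)
y_pred = clf.predict(X_test)

cm = confusion_matrix(y_test, y_pred, labels=[0, 1])  # rows: actual, columns: predicted
acc = accuracy_score(y_test, y_pred)

# Probability of the positive class (home win) for each held-out fixture.
home_win_prob = clf.predict_proba(X_test)[:, 1]
```

The `predict_proba` output is what feeds the tournament simulation later on, rather than the hard 0/1 predictions.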
For example:
- 2024 Wales v France → HomeWin = 0 [original]
- 2024 France v Wales → HomeWin = 1 [inverted]
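The mirroring shown in this example can be sketched as below. The column names are assumptions; in a real dataset any other home- or away-specific feature columns would need to be swapped in the same way.

```python
import pandas as pd


def stack_mirrored(fixtures):
    """Append a mirrored copy of each fixture with home/away swapped and the flag inverted."""
    mirrored = fixtures.rename(columns={"home": "away", "away": "home"}).copy()
    mirrored["home_win"] = 1 - mirrored["home_win"]
    mirrored = mirrored[fixtures.columns]  # restore the original column order
    return pd.concat([fixtures, mirrored], ignore_index=True)


original = pd.DataFrame({
    "year": [2024],
    "home": ["Wales"],
    "away": ["France"],
    "home_win": [0],
})
stacked = stack_mirrored(original)
```

Here `stacked` holds two rows: the original Wales v France fixture with `home_win` of 0 and the inverted France v Wales fixture with `home_win` of 1.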
Using our out-of-sample predictions for 2025 we get the below win probabilities for the upcoming 2025 tournament.
In Table 1 we see that:
- Ireland are expected to do well based on previous form and a chance to get a ‘three-peat’ (third consecutive title)
- France are expected to do very well, particularly at home
- England have a reasonably strong chance, but in all likelihood will finish mid-table
- Scotland are expected to have the slight edge in the Calcutta Cup again this year, but it will be tight
- Italy and Wales will be expected to compete to avoid the wooden spoon, with Italy expected to be slight favourites
Once we’ve estimated the probabilities for the fixtures, we can use Monte Carlo methods to simulate the tournament and estimate the likelihood of a Six Nations Grandslam. Monte Carlo methods use random sampling to estimate probabilities and quantify uncertainty.
To do this we run 10,000 tournament simulations, drawing each fixture result at random weighted by our win probabilities. We use NumPy’s random choice method over our set of home and away fixtures with the corresponding win probabilities. Figure 4 shows a violin plot of the simulated number of wins per tournament for each side.
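A simplified version of such a simulation might look like the following. The function name, the three-team example and its probabilities are placeholders invented for illustration, and each fixture is treated as an independent draw.

```python
import numpy as np


def simulate_tournaments(fixtures, n_sims=10_000, seed=0):
    """Simulate tournaments from (home, away, p_home_win) tuples.

    Returns the sorted team names and an (n_sims, n_teams) array of win counts.
    """
    rng = np.random.default_rng(seed)
    teams = sorted({team for home, away, _ in fixtures for team in (home, away)})
    index = {team: i for i, team in enumerate(teams)}
    wins = np.zeros((n_sims, len(teams)), dtype=int)
    for home, away, p_home in fixtures:
        # 1 = home win, 0 = away win, drawn with the model's win probability.
        home_won = rng.choice([1, 0], size=n_sims, p=[p_home, 1 - p_home])
        wins[:, index[home]] += home_won
        wins[:, index[away]] += 1 - home_won
    return teams, wins


# Placeholder round-robin between three hypothetical sides.
example_fixtures = [("A", "B", 0.70), ("B", "C", 0.55), ("C", "A", 0.40)]
teams, wins = simulate_tournaments(example_fixtures)
mean_wins = dict(zip(teams, wins.mean(axis=0)))
```

Each row of `wins` is one simulated tournament, so the columns can be plotted directly as per-team distributions of wins.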
It’s worth noting that these points are jittered to improve the aesthetics of the plot, but overall, we can see from Figure 4 that:
- France and Ireland are clear favourites to win, though based on past form Ireland might be expected to be more likely to win a Grandslam
- It’s important to note that past form doesn’t always predict current form. For example, Ireland have a new head coach and the oldest team, and are looking at a rebuild phase following the retirement of their key playmaker, Johnny Sexton
- England and Scotland could cause some upsets, but are likely to be battling it out for the upper-mid table position. Based on recent form Scotland are more likely to get 3 wins and England 2 wins, but there is more uncertainty on how England could do in the competition
- Wales and Italy are likely to be scrapping it out for the bottom of the table, with both teams fairly likely to pick up at least one win in the tournament, though this may be the Italy-Wales fixture, which Italy are possible favourites for given home advantage in 2025
Overall, this model appears in line with what many pundits have said about their expectations for the tournament. One limitation of this approach is that we’re making the assumption that the win probabilities of the fixtures are normally distributed around the point estimates from the Logistic Regression model. This may be a strong assumption.
Another assumption of the model is that the outcome of a win in one fixture doesn’t affect the win probabilities in other fixtures, i.e. that fixtures are independent. Personally, I don’t think this is entirely unreasonable since this is professional sport, and sides are coached to have a winning mindset in each fixture — and often sides are inconsistent between fixtures. For example, Scotland performed very well against England in 2024 but went on to lose subsequent fixtures and England went on to beat Ireland who ultimately won the tournament.
We can avoid making strong assumptions on the distribution of win probabilities across the tournament by instead sampling these directly. To do this we can use Markov Chain Monte Carlo (MCMC) methods — which provide a Bayesian approach to estimating the distribution of model parameters through random sampling. Essentially, the models work by updating their prior beliefs on the distribution of model parameters as the sampler observes real data. Once the model converges around the ‘true’ distributions it samples directly from the posterior distribution of the model parameters. In the case of a Logistic Regression model, we model the target variable as a Bernoulli distribution.
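Written out, a Bayesian Logistic Regression of this kind takes roughly the following form. The notation is introduced here for illustration only: $y_i$ is the home-win flag for fixture $i$, $\mathbf{x}_i$ its feature vector, and $\alpha$, $\boldsymbol{\beta}$ the intercept and coefficients. The specific priors are not stated in the article, so they are left generic.

$$
y_i \sim \mathrm{Bernoulli}(p_i), \qquad \operatorname{logit}(p_i) = \log\frac{p_i}{1 - p_i} = \alpha + \mathbf{x}_i^{\top}\boldsymbol{\beta}
$$

$$
p(\alpha, \boldsymbol{\beta} \mid \mathbf{y}) \;\propto\; p(\alpha)\, p(\boldsymbol{\beta}) \prod_{i} p_i^{\,y_i} (1 - p_i)^{1 - y_i}
$$

MCMC draws samples of $(\alpha, \boldsymbol{\beta})$ from the posterior on the second line, and each draw gives its own set of win probabilities $p_i$ for the upcoming fixtures.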
There are potential drawbacks to using Bayesian Logistic Regression models. For example, they can be sensitive to the priors that the model assumes, the predicted probabilities may not be well calibrated (depending on the prior assumptions) and, in the case of a hierarchical model, there may be ‘shrinkage’. Shrinkage occurs where the estimates for individual hierarchy levels are pulled towards the mean of the parent level. In sports modelling the impact of this is that teams at the top and bottom of the table may have their estimates pulled down or up towards the mean of the table.
Figure 5 shows the violin plot for the estimated distribution of wins taken directly from the posterior predictive distribution. The distributions look a little more spread out than those from our Logistic Regression, possibly indicating greater uncertainty in this model. Looking at the plot there may be some shrinkage, as both Wales and Italy are expected to do better than in the Logistic Regression model, and Ireland appear to have less chance of a Grandslam.
We can use our samples to directly estimate the probability of a Grandslam by simply dividing the number of simulated tournaments that contain a Grandslam by the total number of simulated tournaments. This is shown in Figure 6.
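As a small sketch of that calculation, assume the simulated results are held in an array of win counts with one row per simulated tournament and one column per team. The array below is a hand-made placeholder, not model output.

```python
import numpy as np


def grand_slam_probability(wins, n_fixtures=5):
    """Estimate Grandslam probabilities from simulated win counts.

    wins: (n_sims, n_teams) array of wins per team per simulated tournament.
    Returns (per-team probabilities, probability that any team gets a Grandslam).
    """
    slam = wins == n_fixtures            # True where a team won every fixture
    per_team = slam.mean(axis=0)         # share of simulations with a slam, by team
    any_team = slam.any(axis=1).mean()   # share of simulations with any slam
    return per_team, any_team


# Four placeholder tournaments for six teams (each row sums to 15 fixtures).
wins = np.array([
    [5, 4, 3, 2, 1, 0],
    [4, 4, 3, 2, 1, 1],
    [3, 5, 3, 2, 2, 0],
    [4, 3, 3, 3, 1, 1],
])
per_team, any_team = grand_slam_probability(wins)
```

Since at most one side can win all five of its fixtures in a given tournament, the any-team probability equals the sum of the per-team probabilities.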
We can then compare our model results to published odds. I found odds published by a bookmaker on January 1st, which were as follows:
- No Winner 5/6 [this implies Any Winner odds of 6/5]
- Ireland 10/3
- France 9/2
- England 9/1
- Scotland 14/1
- Wales 500/1
- Italy 2000/1
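We can convert published fractional odds of a/b to an approximate probability using the formula: probability = b / (a + b). For example, odds of 10/3 imply a probability of 3/13, or roughly 23%.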
There are two things to consider here:
- Firstly, betting companies publish implied odds rather than true odds since they factor in a profit margin for the odds they publish (i.e. the house always wins)
- Secondly, odds change as new information becomes available. Our analysis is relatively simple and doesn’t factor in injuries or other factors. This is important since there have been notable injuries and withdrawals ahead of the start of the tournament so the odds will have changed. This is why I’m comparing the odds we’ve estimated to ones published at the start of the year where recent injuries won’t affect the published odds.
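The conversion of the odds listed above, and the size of the bookmaker's margin, can be sketched as follows. Fractional odds of a/b are converted as b / (a + b). Rescaling the implied probabilities so that they sum to one is just one simple way of stripping out the margin and is an assumption of this sketch, not necessarily what was done for Table 2.

```python
def implied_probability(odds):
    """Convert fractional odds 'a/b' to an implied probability b / (a + b)."""
    a, b = (int(part) for part in odds.split("/"))
    return b / (a + b)


# Odds as quoted in the article.
published = {
    "No Winner": "5/6",
    "Ireland": "10/3",
    "France": "9/2",
    "England": "9/1",
    "Scotland": "14/1",
    "Wales": "500/1",
    "Italy": "2000/1",
}

implied = {name: implied_probability(odds) for name, odds in published.items()}

# The implied probabilities sum to more than 1; the excess is the bookmaker's margin.
overround = sum(implied.values())

# Simple proportional rescaling so the probabilities sum to 1 (an assumption).
normalised = {name: p / overround for name, p in implied.items()}
```

The outcomes listed are mutually exclusive and exhaustive (no Grandslam, or a Grandslam for exactly one side), which is why their fair probabilities should sum to one.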
So how do our models compare to published odds? Our Frequentist model was surprisingly close, and our Bayesian model implied there was less certainty on the likelihood of a Six Nations Grandslam. In Table 2 you can see a comparison of the converted odds and our estimated probabilities
Overall, our estimates don’t look unreasonable despite the relatively small and sparse dataset we were using.
Our analysis found that:
- In the 2025 Six Nations France are likely to end up punching above their weight given the relatively youthful side they’ve got
- Ireland look the most likely to get a Grandslam, but this is based on past performance. With a new coach, aging squad and changing of playmakers the outlook is less certain
- England’s True Odds are likely to be worse than their Implied Odds and based on past performance should aim for a strong mid-table position. They have one of the youngest squads but with more caps than other strong sides relative to their age profile. They have the potential to be disruptive in the tournament
- Scotland have a better chance of a Grandslam than England and are likely to be also competing for a strong mid-table position. They have the second oldest and most experienced team after Ireland and may be at or near their peak as a squad. Could it be now or never for this squad?
- Wales and Italy are unlikely to be high performers in the 2025 Six Nations, and Italy will be vying to finish above Wales for the second year running
- There is a reasonably strong chance of a Grandslam by any team, around a 30–40% chance
- This could be a very competitive tournament overall with many sides having a good chance of winning
In this article we’ve seen how we can leverage Frequentist and Bayesian methods to quantify uncertainty around the likely winners of the Six Nations in 2025. Whilst our models were relatively simple and constrained to using a small dataset our probabilities were not too dissimilar from published odds, though these have since changed as events have developed (injuries, call-ups, etc.).
Thank you for reading this article; I hope it’s been interesting. If you’re interested in learning more about the analysis you can find the full code on my GitHub account.