Edited: Jun 18, 2024

Subbuteo Ranking Concept - SubbutELO

Hi all,

There’s been lots of talk in the last few years about rankings in Subbuteo and their accuracy – whether it be FISTF, WASPA, or the English Championships, there’s always a discussion about their effectiveness. To be clear, I think all of these setups do good things – they each have their own goals and, although there are issues, it’s far better for them to exist than to not. This isn’t designed to replace any of them.

I wanted to see if there was another system that worked, specifically for the English circuit, for my two key aims – firstly, to see (regardless of lucky draws or number of events attended) if a ranking could reflect the pure skill of players; and secondly, to make the system relevant for all players rather than just the highest skill level. That’s where this comes in – the ELO ranking system.

The ELO ranking system isn’t a new concept – it’s currently being used by millions worldwide through chess, and FIFA even use it to rank every country in football (might be an interesting one to look at during the Euros!). What I’ve created is an adaptation of FIFA’s edition to apply to the Subbuteo dynamic.

HOW DOES IT WORK? The Basics

To start off, every player has a point score – this value represents their skill. Higher scores represent higher-skilled players – for this, the average score across all players is around 100, with new players being around 50, and the elite around 150.

This score will increase upon winning and decrease upon losing every match – the winner takes these points from the loser. However, the real benefit of this system lies within a single detail – the amount of points you gain or lose after each match. There are two key factors that go into this: the difference in skill between players, and goal difference. These will be explained properly in the “Details” section at the end of the article.

Each player will go into an event and play their games as normal – regardless of whether it’s a swiss system, groups, or otherwise. This system will simply look at the results post-event, calculate the points change for each player across their games, and player ranks will change. This rank will not reset after each event or season.

It sounds complicated, but for the players and organisers themselves it’s very straightforward – they just need to go and do events as normal.

Why bother with it? (Advantages)

This system isn’t perfect – none are. But it does provide certain perks that other ranking systems might not:

>Luck of the draw is currently seen as a huge aspect of outcomes within the championships, especially this year with such a large number of players – this was perhaps the largest criticism received. However, turning to a standard group stage, whilst slightly more appropriate, still faces the issue of lucky/unlucky draws greatly affecting results. The ELO system, with its points adjusting based on your opponent in each match, greatly minimises the effect of getting an easy or difficult draw.

>Applicable to any tournament style. These rankings look at each match individually, rather than the overall result of any event. This means that whether Swiss, Groups, or otherwise is used to match players together, this ranking system will still work. This means that event organisers would not have to adhere to any specific format.

>Useful for all players. Many ranking systems are only relevant to those at the very top – however I would hope that this can be utilized by players of all skill levels.

>It gets more accurate as time goes on. Admittedly, my formula wasn’t perfected until very recently – you’ll notice a few strange outcomes in the upcoming results. However, from this point forward, everybody’s skill should be reflected quite clearly, especially after each new event played.

>Scores are not reset – a fair criticism of the current English Championships is that the event’s points reset each year, meaning that there is no long-term impact of these results. The ELO system is ongoing, so a single poor performance does not mean you should give up for the season.

>It can be used for seeding non-FISTF events.

>It’s systematic and mathematical. No bias.

In Practice: The current rankings

So, how about we put these ideas into practice? Behind the scenes for the last 18 months or so, I’ve been using this system with 18 major English events to see if it works. The system has changed and improved throughout this time, but the core idea has always been the same.

The 18 events include the 12 English Championship events (starting from January 2023) as well as the following 6 events, which were held in a professional context with at least 32 players: Pedmore (Sept 23), Leicester (Sept 23), Chasers (Oct 23), Pedmore (Dec 23), Haverhill (April 24), and Andover (May 24). Notably, FISTF tournaments were not included – this is because those results would suddenly introduce a large number of foreign players who cannot be relied on to be a consistent part of this system.

I have attached an excel document that gives details of every event from September 2023 onwards, including the exact point changes for every player after every event. Filtering to only include those who have played at least 3 events since September, here is the current leaderboard:

I’ll not go on about these results too much – interpret as you will. Some players seem a bit out of place, but I trust that this will be adjusted in the new season very quickly.

Going forward

The next important question – okay, we’ve got this system, but what do we actually do with it? It’s something I’m not too sure on myself. For now, I’m going to keep updating it with results and refining it to be as accurate as possible, and will make a post showcasing this every once in a while. If anybody has ideas for how this can be utilised, please let me know – until then, it’ll be a more statistical version of Bookie Bob.

Aside from the six English Championship events, the criteria for events to be included in this system going forward will be: English event; hosts at least 24 or 32 players (undecided); run using standard FISTF rules; held in at least a semi-professional context.

Players will only appear on the “proper” leaderboard when they have played at least 3 of these events in the past year – this is to ensure that only active players are promoted.
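As a rough illustration of that activity rule, the check could look something like the Python sketch below. The player names and dates are invented, eligible_players is a hypothetical helper, and treating "the past year" as the last 365 days is an assumption.

```python
from datetime import date, timedelta


def eligible_players(events_by_player, today, min_events=3, window_days=365):
    """Return the names of players with enough qualifying events in the window.

    events_by_player maps a player name to the dates of the qualifying
    events they played. window_days=365 is an assumed reading of "past year".
    """
    cutoff = today - timedelta(days=window_days)
    eligible = []
    for name, event_dates in events_by_player.items():
        # Count only events that fall inside the rolling window.
        recent = [d for d in event_dates if cutoff <= d <= today]
        if len(recent) >= min_events:
            eligible.append(name)
    return sorted(eligible)


# Invented example data, not real results.
example = {
    "Player A": [date(2023, 9, 10), date(2023, 12, 3), date(2024, 5, 12)],
    "Player B": [date(2022, 11, 6), date(2024, 4, 14), date(2024, 5, 12)],
}
print(eligible_players(example, today=date(2024, 6, 18)))  # ['Player A']
```

Player B drops off here because one of their three events is more than a year old; their score would still exist, it just would not be shown on the "proper" leaderboard.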

I will also slightly level out the rankings to give a soft-fresh start to the new season as of the start of the next applicable event.

Closing

That’s all – this is quite a long read, so I appreciate it if you had a look at any of this. If you could stick a comment on wherever you found the link to this to keep it active, that would be great.

I’m fully open to any constructive criticism or discussion about this, and would welcome it – I want this to be as approachable as possible for people.

I could make a future post that goes into more detail or answers questions, based on the feedback given – keep an eye out for that.

If you want to see how some of your results would shape up, you can test this out yourself with the attached Excel document. The “Calculator” tab will allow you to input a series of 5 results at a time. Change the red text for the information you’d like to see – note that this will only work on computers.

Details (Skip if you’re not bothered)

I’m going to include some miscellaneous details here that are more complicated, for those who are interested – feel free to skip this part if you’re not.

The first factor of point change, Skill Difference: This is crucial. Let’s look at a few examples:

If a player beats an equally-skilled opponent, they will take a normal amount of points from the loser – let’s say 3.8 points for a 2-0 win.

If the two players have a huge difference in playing ability, and the better player wins, they will take a small amount of points from the loser, because this was an expected result – let’s say 1.1 points will be traded for a 2-0 win in favour of the better player.

Now, let’s say that the lesser-skilled player had a blinder, and beat their higher-skilled opponent. Because this was an unexpected and impressive win, this player will take far more points from the loser – let’s say 6.4 points for a 2-0 win.

Finally, if there is a draw, the lesser-skilled player will take a small amount of points from the higher-skilled player – in this situation, they would earn 1.7 points from a draw. The point of this calculation is to make sure that players are more accurately rewarded for their achievements.

The second factor of point change, goal difference: More points are gained and lost with a higher difference of goals within the game: for example, a 5-0 will give the winner double the amount of points as a 1-0 win would. This is to incentivise not parking the bus after going 1 goal up. This system does not look at goals scored or anything – simply the difference in goals between players. This means that a 5-0 and a 7-2 are classed as the same thing, and there is no debate there. Any difference in goals above 5 does not earn extra points, so top players can’t go too far with beating lower-level players.

Breaking down the exact formula for the point change within each individual match: it’s quite complicated. The official Point Change Formula is: K × G × (Result – Expected).

>K is the volatility of point change – FIFA adjust this depending on how important the tournament is. This started off as ‘10’, switched to ‘7.8’ as of September 2023, and will be ‘5’ from now on.

>G is the Goal Difference in the match: A draw makes this 0; Winning by 1 goal = 1; 2 goals = 1.5; 3 goals = 1.75; 4 goals = 1.875; 5 goals = 2. This means that every additional goal difference still gives extra points, but less than the one before.

>“Result” is simple – Winning makes this ‘1’, Drawing = ‘0.5’, and Losing = ‘0’.

>“Expected” is… less simple. The exact formula is 1 / (10^(-Rating Difference / 40) + 1)… good luck with that.
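For anyone who prefers code to spreadsheets, here is a minimal Python sketch of the Point Change Formula as described in the list above. It is only an illustration and not necessarily how the attached Excel document does it: the names expected_score, goal_multiplier, point_change and draw_value are made up for this example, K defaults to the "from now on" value of 5, and any goal difference above 5 is treated as 5. One caveat: with G listed as 0 for a draw, a draw would move no points at all, whereas the draw example earlier has the lesser-skilled player earning 1.7 points, so the draw multiplier is left as a parameter here rather than guessed.

```python
# G values for wins, taken from the list above (goal difference -> multiplier).
GOAL_MULTIPLIERS = {1: 1.0, 2: 1.5, 3: 1.75, 4: 1.875, 5: 2.0}


def expected_score(rating, opponent_rating, scale=40.0):
    """Expected result between 0 and 1: 1 / (10^(-difference / scale) + 1)."""
    difference = rating - opponent_rating
    return 1.0 / (10 ** (-difference / scale) + 1.0)


def goal_multiplier(goal_difference, draw_value=0.0):
    """G for a match; margins above 5 goals earn nothing extra.

    draw_value=0.0 follows the list as written; it is a parameter because
    the worked draw example suggests a non-zero value (an open assumption).
    """
    margin = min(abs(goal_difference), 5)
    if margin == 0:
        return draw_value
    return GOAL_MULTIPLIERS[margin]


def point_change(rating, opponent_rating, goals_for, goals_against,
                 k=5.0, draw_value=0.0):
    """Points gained (positive) or lost (negative) by the first player."""
    goal_difference = goals_for - goals_against
    if goal_difference > 0:
        result = 1.0   # win
    elif goal_difference == 0:
        result = 0.5   # draw
    else:
        result = 0.0   # loss
    g = goal_multiplier(goal_difference, draw_value)
    return k * g * (result - expected_score(rating, opponent_rating))
```

With two players both rated 100, point_change(100, 100, 2, 0) gives 5 × 1.5 × (1 – 0.5) = 3.75, which lines up with the 3.8 quoted in the first skill-difference example. Calling it from the loser's side returns the same amount with the opposite sign, so the winner's gain always equals the loser's loss.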

The biggest challenge that was only recently overcome was how to handle new players who did not yet have a rank. For this season, most of these players were set at the average point of 100. However, most of these players were of a lower ability, which meant that whoever was fortunate enough to be drawn in a group with them would get a large amount of points. This has since been adjusted for every event from now on, but it may explain some strange outcomes.

Thanks for reading – and if you start a flicking debate, I’ll take 10 points off of you.

Thanks,

Kye Arnold.

2 comments


Yannis Karnezis

Jun 24, 2024

Hello @kyearnold,

First of all, let me thank you for your initiative. Last August (2023) I expressed the need for such a rating system as part of my proposal (please find it attached; see page 2, plus mentions in other chapters) regarding a complementary coexistence of WASPA & FISTF in everyone's favour. However, due to many other commitments I wasn’t able to kick-start anything. Thankfully, not only did you come up with this concept, but you also put it to the test, which is the most important part of all.

I especially liked the addition of Goal Difference, which is a great means of pushing players to reach their top performance or even to exceed their “ceiling”. I have downloaded your spreadsheet, made some modifications within the Calculator tab (during my spare time), and I’m returning it for your review. Please note the following:

Addition of extra, red-texted cells for input
Addition of cells’ background colours, plus gridlines were turned off to add some style, enhancing presentation
Expansion of matches up to ten (10), to cover even the largest of events
Correction of “Change” formula to allow for score input of over five (5) goal difference
The area on the left, until the “I” column is the User Interface. The rest of the columns may be hidden
Introduction of different K values, depending on players’ experience (number of tournament matches played). This should allow for quick adjustment of players’ rating near their true strength level, when starting out with a mean rating value.

As for new players entering the rating system, there is also the option of starting with no rating at all. The process of calculating their first rating should begin only after they achieve their first tournament win or draw against a rated opponent, and they may receive their rating at the end of the month in which they complete five (5) tournament matches in total against rated opponents.

Personally, I agree with you about trying out this system locally at first, so as to apply any other necessary adjustments, but in my opinion your next step should be to upgrade it into a global one, by resetting all players’ values and covering all “official” FISTF and WASPA events (of e.g. 6+ participants, members of at least two different clubs). Of course, such a large-scale effort would require collecting all worldwide results and wouldn’t be possible as a one-man show. My understanding is that at least one representative/volunteer in every country should update ratings based on locally held events (national & international).

Please also consider widening the range of the SubbutELO system by applying a starting rating value of 150 for a global restart, so that over time this will allow for better distinction between different levels of players (0-49 novice/ 50-99 beginners/ 100-149 intermediate/ 150-199 advanced/ 200-249 experts & 250+ masters-elites).

Thanks for your time. I hope you’ve found my feedback and suggestions useful enough.

Kind regards,

Yannis

kyearnold

Jul 18, 2024

Hi Yannis,

I only just saw this comment, apologies for the late reply! Thank you for putting so much interest in this - it's great to see some shared interest.

I'm really impressed with your Co-op proposal, it is incredibly well thought-out and I can see the chess influence - it could be something to develop Subbuteo to the next level if people become receptive to it.

Also thank you for the upgrades to my calculator, it makes everything much more approachable for others and is a great framework to work with going forward.

For managing the K-Value, I agree about utilising it to handle new players in a more effective way: I am not currently doing this simply because of wanting to keep things simplified for others to fully grasp, but would be keen to adjust this according to your suggestion in the future.

As you know, such a complex system can be difficult to get people to agree with, so my aim for the 24-25 season is to keep going with the English circuit and see how everything develops. Using this on a grander scale is something I'd like to revisit at the end of the season when there is more evidence and recognition around the idea, with your input being greatly appreciated within that.

Once again thank you for the support on this, I really like the vision you have with the system and how it can be developed further - this is something I will be coming back to over time.

© Jason Mitchell Photography