DragonMaster ELO Ranking System Whitepaper
1. ELO
The Elo rating system is a method for calculating the relative skill levels of players in zero-sum games such as chess. It is named after its creator Arpad Elo, a Hungarian-American physics professor.
The Elo system was invented as an improved chess-rating system over the previously used Harkness system, but is also used as a rating system in association football, American football, baseball, basketball, pool, table tennis, and various board games and esports.
The difference in the ratings between two players serves as a predictor of the outcome of a match. Two players with equal ratings who play against each other are expected to score an equal number of wins. A player whose rating is 100 points greater than their opponent’s is expected to score 64%; if the difference is 200 points, then the expected score for the stronger player is 76%.
A player’s Elo rating is represented by a number which may change according to the outcome of rated games played. After every game, the winning player takes points from the losing one. The difference between the ratings of the winner and loser determines the total number of points gained or lost after a game. If the higher-rated player wins, then only a few rating points will be taken from the lower-rated player. However, if the lower-rated player scores an upset win, many rating points will be transferred. The lower-rated player will also gain a few points from the higher rated player in the event of a draw. This means that this rating system is self-correcting. Players whose ratings are too low or too high should, in the long run, do better or worse correspondingly than the rating system predicts and thus gain or lose rating points until the ratings reflect their true playing strength.
Elo ratings are comparative only, and are valid only within the rating pool in which they were calculated, rather than being an absolute measure of a player’s strength.
2. Formulas and examples
Performance rating, or special rating, is a hypothetical rating that would result from the games of a single event only, and some chess organizations use the “algorithm of 400” to calculate it. The formulas in this section are the standard Elo expected-score and rating-update calculations for a single match, which work in the following way:
Let RA and RB be the ratings of players A and B, respectively.
The expected win rate for player A is calculated as follows:
EA = 1 / (1 + 10^((RB - RA) / 400))
And the expected win rate for player B is calculated as follows:
EB = 1 / (1 + 10^((RA - RB) / 400))
EA and EB can be interpreted as the probability that players A and B will win the match, respectively.
For example, consider a match between player A, with an Elo rating of 1500 points, and player B, with an Elo rating of 1600 points.
Then EA is calculated as follows:
EA = 1 / (1 + 10^((1600 - 1500) / 400)) ≈ 0.360
And EB is calculated as follows:
EB = 1 / (1 + 10^((1500 - 1600) / 400)) ≈ 0.640
These results indicate that player A has a 36% chance to win against player B, while player B has a 64% expected chance to win against player A.
We note that EA + EB = 1, since the match is between two players. An ideal match would be one in which the technical differences between the two players are minimal, or in other words, a match in which EA ≈ EB, which is only possible when RA ≈ RB.
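The expected-score calculation above can be sketched in Python as follows. This is a minimal illustration that assumes the standard Elo logistic formula with a scale of 400 points; the function name expected_score is hypothetical and is not taken from the DragonMaster implementation.

```python
def expected_score(rating_a: float, rating_b: float) -> float:
    """Expected score of player A against player B (standard Elo formula)."""
    return 1.0 / (1.0 + 10.0 ** ((rating_b - rating_a) / 400.0))


# Worked example from the text: player A = 1500, player B = 1600.
e_a = expected_score(1500, 1600)
e_b = expected_score(1600, 1500)
print(round(e_a, 3), round(e_b, 3))  # 0.36 0.64
```

The two values always sum to 1, and two equally rated players each get an expected score of 0.5.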
Let player A be the challenger, so that the result is recorded from player A’s perspective: Result is 1 if player A wins and 0 if player A loses.
Player A’s adjusted Elo rating after the match is calculated as follows:
R’A = RA + K * (Result - EA)
Player B’s adjusted Elo rating after the match is calculated as follows:
R’B = RB + K * (1 - Result - EB)
K is known as the K-factor or development coefficient, and it is the maximum absolute amount that a player’s rating can change in a single Elo update. In other words, the higher the K-factor, the less stable a player’s Elo rating becomes. Section 3 describes how the K-factor is chosen.
In this case, we will use the K-factor of the challenger player, player A, when calculating the adjusted ratings for both players.
Consider again a match between player A, with an Elo rating of 1500 points, and player B, with an Elo rating of 1600 points. Assume player A challenges player B and player A’s K-factor is 20. From the previous calculation, EA ≈ 0.360 and EB ≈ 0.640.
Then there would be two possible outcomes:
If player A wins, then Result = 1.
Player A’s new Elo is ≈ 1500 + 20 * (1 - 0.360) = 1512.8, while player B’s new Elo is ≈ 1600 + 20 * (1 - 1 - 0.640) = 1587.2.
If player A loses, then Result = 0.
Player A’s new Elo is ≈ 1500 + 20 * (0 - 0.360) = 1492.8, while player B’s new Elo is ≈ 1600 + 20 * (1 - 0 - 0.640) = 1607.2.
Since we use the same K-factor to update both players, and since EA+EB = 1, the points gained by the winner are exactly equal to the points lost by the loser.
The player with the higher Elo rating also has more to lose: a higher-rated player gains fewer points for a win and loses more points for a loss than a lower-rated player does. In the example above, player B gains 7.2 points for a win but loses 12.8 points for a loss.
This is because the higher-rated player is expected to win more often (in this case, EB is higher than EA), so a win only confirms the prediction and earns few points, while a loss deviates from it more strongly and costs more.
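The update rule in this section can be sketched in Python as follows. This is an illustrative sketch only: the function names expected_score and update_ratings are hypothetical, and the code assumes the standard Elo expected-score formula with a 400-point scale and a single K-factor (the challenger’s) applied to both players, as described above.

```python
def expected_score(rating_a: float, rating_b: float) -> float:
    """Expected score of player A against player B (standard Elo formula)."""
    return 1.0 / (1.0 + 10.0 ** ((rating_b - rating_a) / 400.0))


def update_ratings(rating_a: float, rating_b: float, result: int, k: float):
    """Return (new_a, new_b) after one match.

    result is 1 if challenger A wins and 0 if A loses;
    k is the challenger's K-factor and is applied to both players.
    """
    e_a = expected_score(rating_a, rating_b)
    e_b = 1.0 - e_a
    new_a = rating_a + k * (result - e_a)
    new_b = rating_b + k * ((1 - result) - e_b)
    return new_a, new_b


# Worked example from the text: A = 1500 challenges B = 1600 with K = 20.
win = update_ratings(1500, 1600, result=1, k=20)
loss = update_ratings(1500, 1600, result=0, k=20)
print([round(r, 1) for r in win])   # [1512.8, 1587.2]
print([round(r, 1) for r in loss])  # [1492.8, 1607.2]
```

Because both players are updated with the same K-factor, the sum of the two ratings is unchanged by the match.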
3. K-factor
The method for determining a player’s K-factor is based on the rating system used by the International Chess Federation (FIDE). The system is dynamic, and its standards are as follows:
- K=40 → Applies to players who have played fewer than 30 games. Larger rating changes are allowed so that a new player’s rating is discovered faster. A higher initial K-factor adds extra variability and ensures that new players can quickly enter the range appropriate for their skill level.
- K=20 → Applies to players who have played 30 or more games and have a rating below 2400. More experienced players are considered to be within the range appropriate for their skill level; less variability helps to match players more accurately.
- K=10 → Applies to players who have played 30 or more games and have a rating of 2400 or higher. Once a player has K=10, the K-factor remains 10 even if the player’s Elo later falls below 2400. Highly experienced and skilled players do not need as much rating change to obtain an accurate rating.
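The three rules above could be expressed as a small Python function. This is a sketch under stated assumptions: the function name k_factor and the flag reached_k10 are hypothetical, and the flag is assumed to be stored per player to record that K=10 has already been assigned, so that the K-factor stays at 10 after a drop below 2400.

```python
def k_factor(games_played: int, rating: float, reached_k10: bool = False) -> int:
    """Pick a K-factor from the FIDE-style rules listed above.

    reached_k10 is a hypothetical per-player flag meaning that the player
    has already been assigned K=10 at some point in the past.
    """
    if games_played < 30:
        return 40  # new player: allow large rating changes
    if rating >= 2400 or reached_k10:
        return 10  # top-rated player: K stays at 10 once reached
    return 20      # experienced player below 2400


print(k_factor(5, 1500))                      # 40
print(k_factor(30, 2399))                     # 20
print(k_factor(120, 2450))                    # 10
print(k_factor(120, 2350, reached_k10=True))  # 10
```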
4. Lookup table
p: Expected probability of winning
dp: Difference in rating between players
Plugging each K-factor into the Elo formula gives the results shown in the following table:
https://drive.google.com/file/d/1VFx1QM2r61O1LR2Jc6bC2lVNNYZKt1q0/view
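The linked table is not reproduced here. As a rough sketch of how such a table could be generated, the Python below computes p for a range of dp values and the points a player with that rating advantage would gain for a win or lose for a loss at a given K-factor. The names win_probability and lookup_rows are hypothetical, the standard Elo formula with a 400-point scale is assumed, and any rounding applied in the actual table is not modeled.

```python
def win_probability(dp: float) -> float:
    """Expected probability p of winning for a player rated dp points above the opponent."""
    return 1.0 / (1.0 + 10.0 ** (-dp / 400.0))


def lookup_rows(k: float, dps):
    """Return (dp, p, points_for_win, points_for_loss) for each rating difference."""
    rows = []
    for dp in dps:
        p = win_probability(dp)
        rows.append((dp, p, k * (1.0 - p), k * p))
    return rows


# Illustrative table for K = 20 and rating differences from 0 to 600.
for dp, p, gain, loss in lookup_rows(20, range(0, 601, 100)):
    print(f"dp={dp:>4}  p={p:.3f}  win=+{gain:.2f}  loss=-{loss:.2f}")
```

The gain for a win shrinks toward 0 as dp grows, which is the effect the matching range in the next section is meant to avoid.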
5. Matching
In the lookup table, the green area represents valid matches and the red area represents invalid matches. A match is invalid when the difference in rating (dp) is so large that a win or loss would earn a player 0 points. Therefore, the matching range for dp is [-512 ~ 512].
Beginner players
To better protect beginner players, K=40 players can only be matched with K=40 players, and a mixed mode of PVP and PVE is supported.
Experienced players
A player with K=20 or K=10 has passed the beginner phase. The next goal for such a player is to push for a higher Elo rating, and only PVP mode is supported.
Example:
Player A has K=40 and 500 points: the valid matching range is 0 ~ 1012, limited to players with K=40.
Player B has K=20 and 2000 points: the valid matching range is 1488 ~ 2512, limited to players with K=20 or K=10.
Note:
- There is no daily limit on the number of times a player can challenge in ranking mode.
- When there are multiple players eligible for matching at the same time, priority is given to matching players with similar ratings.
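The matching rules in this section could be sketched in Python as follows. This is not the DragonMaster matchmaker: the names Player, matching_range, can_match and pick_opponent are hypothetical, the lower bound of 0 is an assumption inferred from the 0 ~ 1012 example, and PVE opponents for beginners are not modeled.

```python
from dataclasses import dataclass
from typing import List, Optional, Tuple

MAX_DP = 512  # largest allowed rating difference, from the matching range


@dataclass
class Player:
    name: str
    rating: int
    k: int  # K-factor: 40 (beginner), 20 or 10 (experienced)


def matching_range(rating: int) -> Tuple[int, int]:
    """Valid opponent ratings; the floor of 0 is an assumption from the example."""
    return max(0, rating - MAX_DP), rating + MAX_DP


def can_match(a: Player, b: Player) -> bool:
    """Beginners (K=40) only meet beginners; K=20 and K=10 players meet each other."""
    if (a.k == 40) != (b.k == 40):
        return False
    return abs(a.rating - b.rating) <= MAX_DP


def pick_opponent(player: Player, candidates: List[Player]) -> Optional[Player]:
    """Among eligible candidates, prefer the one with the closest rating."""
    eligible = [c for c in candidates if can_match(player, c)]
    if not eligible:
        return None
    return min(eligible, key=lambda c: abs(c.rating - player.rating))


print(matching_range(500))   # (0, 1012)
print(matching_range(2000))  # (1488, 2512)

b = Player("B", 2000, 20)
pool = [Player("C", 1500, 20), Player("D", 2450, 10), Player("E", 2010, 40)]
print(pick_opponent(b, pool).name)  # D
```

In the usage example, player E has the closest rating but is a beginner, so player D is chosen as the nearest eligible opponent.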