FIDE Mathematician Proposes Changes To Improve Rating Accuracy

On Friday, the FIDE Qualification Commission and mathematician Jeff Sonas proposed several changes to the rating system aimed at counteracting deflation, so that FIDE ratings more accurately represent players’ strength.

The commission is opening a public discussion of the changes, encouraging members of the chess community to submit feedback to qualification@fide.com until September 30, 2023. The Qualification Commission will thoroughly review all feedback and present a final proposal to the FIDE Council in October 2023. Changes should be published in December 2023 and will come into effect starting January 2024.

Here is a summary of their proposal:

At a Glance

Goal:

Adjust the rating system so that a player’s rating more accurately represents their level of strength

Problem:

In the current system, players with lower ratings tend to be underrated, performing beyond Elo system expectations. This suggests that the current gap between players of different ratings is inaccurately large.

Proposed Solution:

  • Raise the minimum rating to 1400
  • Increase all ratings between 1000 and 2000 by 0 to 400 points (the higher the rating, the smaller the increase), keeping the order of players but compressing the spacing between them
  • Implement changes to the initial rating calculations to account for the rapid improvement of underrated new players 

Summary

Problem:

The original FIDE rating lists only included players above 2200 Elo. Since then, the minimum rating has been progressively lowered to allow for more and more players. The current minimum is 1000 Elo, set in 2013.

This created an unintended side effect: Many newer players now receive very low initial FIDE ratings at a very young age, during the stage of their career when their chess skill is improving most rapidly. As they competed, improved, and gained rating points, these underrated players took rating points from established players, sparking an overall deflation.

Hundreds of thousands of new lower-rated players have joined the FIDE rating lists since 2013. In fact, the active rating pool has doubled. Because new underrated players continue to join the FIDE rating lists every month, the deflation keeps growing.

In search of concrete evidence of the problem, Sonas compared how 1500-rated players performed against 2100-rated players in past years and today. Using results from tens of thousands of games, he found that the score of the 1500-rated group has nearly doubled since 2012. This suggests that a 600-point rating gap today reflects a smaller difference in playing strength than it did 10 years ago.

Graphic: Jeff Sonas/FIDE.

Sonas then zooms out, taking the results of millions of games and charting how rating differences of all sizes have performed during different years. The graph below shows a consistent pattern: Higher-rated players have performed progressively worse against lower-rated opponents from 2013 to today. For example, players rated 400 points higher than their opponents scored 87% in 2008-2012 but only 79% in 2021-2023.

The black curved line is the Elo prediction for that rating gap, while the colorful lines represent actual results from different years. The further the colorful lines are from the black line, the more points the higher-rated players lose. 

The white line represents the most recent results. It has the shallowest curve of all, suggesting that the rate of decline is accelerating and giving a visual representation of the growing deflation problem.
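For readers who want to see where a curve like the black line comes from, here is a minimal sketch of the standard logistic Elo expectation. It is an approximation only: FIDE publishes its own conversion tables, which may differ slightly from this formula, and the function name `expected_score` is made up for this illustration.

```python
def expected_score(rating_diff: float) -> float:
    """Expected score for a player rated `rating_diff` points above the opponent.

    Uses the common logistic Elo approximation (an assumption here);
    FIDE's official tables may give slightly different values.
    """
    return 1.0 / (1.0 + 10.0 ** (-rating_diff / 400.0))


if __name__ == "__main__":
    # Print the predicted score for a few rating gaps.
    for diff in (0, 100, 200, 400):
        print(f"gap {diff:>3}: expected score {expected_score(diff):.3f}")
```

Under this approximation a 400-point favorite is expected to score roughly 91%, which puts the 87% and 79% figures quoted above in context.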

Graphic: Jeff Sonas/FIDE.

Sonas analyzed actual results compared to ratings in several ways in his full report and came to the conclusion that the differences in playing strength for players from 1000 to 2400 Elo (99% of FIDE players) only span 1000 Elo points, not 1400. This led to his suggestion to compress the rating system:

“Honestly, I see no feasible course of action other than manually compressing the lower part of the rating list so ratings are not so spread apart, along with other measures to keep all this from happening again.”

“The Compression and the Calculation Improvements are an attempt to bring ratings into line with players’ evident strengths.”

Proposed Changes:

Step 1: Compression

On the January 2024 rating list, all players below 2000 would be given a one-time rating increase of 0 to 400 points. The points added would gradually decrease from 400 (for those rated 1000) to 0 (for those rated 2000). The higher the starting rating, the smaller the increase.

The goal is to take the ratings from 1000-2000 and compress them into 1400-2000 without affecting the order of the players, only the spacing between them. It would also bring every rating to 1400 or above. As Sonas explained:

“There would be the same 346,000 players rated below 2000 Elo both before and after; it’s just that they would now span a range of 600 Elo points rather than 1,000 Elo points.”
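The compression can be sketched in a few lines of code. The summary above only gives the endpoints (+400 at 1000, +0 at 2000), so the straight-line interpolation used here is an assumption, as is the function name `compress`; the exact formula is in Sonas’ full proposal.

```python
def compress(rating: int) -> int:
    """One-time compression of a rating below 2000.

    Assumes the increase falls linearly from +400 at 1000 to +0 at 2000,
    i.e. 40% of the distance to 2000. The article only states the endpoints.
    """
    if rating >= 2000:
        return rating  # ratings of 2000 and above are not changed
    return round(rating + 0.4 * (2000 - rating))


if __name__ == "__main__":
    # Show the old and new rating for a few sample players.
    for old in (1000, 1250, 1500, 1750, 1900, 2000, 2300):
        print(old, "->", compress(old))
```

With this assumption a 1000-rated player moves to 1400 and a 1500-rated player to 1700. Because results are rounded to whole numbers, two players one point apart can land on the same new rating, but no player overtakes another.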

Sonas created a chart to show what this would look like:

Graphic: Jeff Sonas/FIDE.

Step 2: Calculation Improvements

The next phase of changes would be in effect on the February 2024 rating list:

  • Raise the minimum rating from 1000 Elo up to 1400 Elo
  • Restore the 400-point rule to its earlier state so that it can apply multiple times in an event

The 400-point rule states that when the rating difference between two players exceeds 400 points, it is treated as 400 points. This protects higher-rated players from losing as many points from one upset. 
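As a rough illustration of the rule, the sketch below caps the rating difference at 400 before computing an expected score. The helper names are hypothetical, and the logistic expectation formula is a common approximation rather than FIDE's official table.

```python
def effective_diff(player: int, opponent: int, cap: int = 400) -> int:
    """Rating difference as used for rating purposes, limited to +/- cap."""
    diff = player - opponent
    return max(-cap, min(cap, diff))


def expected_score(rating_diff: float) -> float:
    """Logistic Elo approximation (an assumption; FIDE uses its own tables)."""
    return 1.0 / (1.0 + 10.0 ** (-rating_diff / 400.0))


if __name__ == "__main__":
    # A 2500-rated player meets a 1900-rated player: a real gap of 600 points.
    raw = 2500 - 1900
    capped = effective_diff(2500, 1900)
    print("raw gap:", raw, "expected:", round(expected_score(raw), 3))
    print("capped gap:", capped, "expected:", round(expected_score(capped), 3))
```

Because the capped gap yields a lower expected score for the favorite than the real gap would, an upset costs the favorite fewer points.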

  • All initial ratings would be calculated:

    • directly from the player’s performance rating (with a maximum of 2200 Elo). 
    • with two additional draws against hypothetical 1800-rated opponents.

  • All unrated players would get a fresh start: No previous performances would be counted. 
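The initial-rating idea can be sketched as follows. This is an illustration under stated assumptions, not the official procedure: the performance rating uses the logistic Elo approximation instead of FIDE's tables, 0% and 100% scores are clamped to keep the result finite, and the function names are invented for this example.

```python
import math


def performance_rating(opponents: list[int], score: float) -> float:
    """Approximate performance rating: average opponent plus a logistic offset.

    Assumption: FIDE uses a lookup table; the logistic formula stands in here.
    """
    n = len(opponents)
    p = score / n
    p = min(max(p, 0.01), 0.99)  # clamp so 0% or 100% does not give infinity
    average = sum(opponents) / n
    return average + 400.0 * math.log10(p / (1.0 - p))


def initial_rating(opponents: list[int], score: float,
                   hypothetical: int = 1800, draws: int = 2,
                   cap: int = 2200) -> int:
    """Initial rating with two extra draws against 1800-rated opponents."""
    padded_opponents = list(opponents) + [hypothetical] * draws
    padded_score = score + 0.5 * draws  # each hypothetical draw adds half a point
    return min(cap, round(performance_rating(padded_opponents, padded_score)))


if __name__ == "__main__":
    # A weak start: 1 point from 5 games against 1500-rated opponents.
    print(round(performance_rating([1500] * 5, 1.0)), initial_rating([1500] * 5, 1.0))
    # A strong start: 4 points from 5 games against 2000-rated opponents.
    print(round(performance_rating([2000] * 5, 4.0)), initial_rating([2000] * 5, 4.0))
```

In both sample cases the two added draws pull the result toward 1800: the weak start is rated higher than its raw performance, and the strong start lower.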

To estimate how these changes could affect ratings over time, Sonas created a program to simulate the FIDE rating system and test the results of different regulations. In a simulation using actual results from 2017-2019, his proposed changes brought player performances almost completely into line with their Elo predictions (see the black and blue curved lines in the graph below).

Graphic: Jeff Sonas/FIDE.

Further Questions

1. What does this mean for players over 2000?

Sonas theorizes that this rating change would increase ratings across the board. Ratings over 2000 would not change immediately, but they would benefit over time. When 2000+ players face lower-rated players in the future, they wouldn’t lose as many points to opponents with inaccurately low ratings.

His analysis shows that there’s been virtually no growth in the number of players over 2100 in the last six years, even though 200,000 new players have entered the rating pool. With so many new players, it stands to reason that over time some of these players would reach master strength, and the number of players above 2100 would increase. 

By implementing these changes, he predicts and has successfully simulated modest organic growth in the number of players achieving master-level ratings over time. 

He further explains:

“These changes would likely lead to a modest increase in the number of titled players in upcoming years, but these titles would be well-deserved, again because the ratings are (hopefully) being brought more into line with players’ demonstrated playing strengths. The intention is to reverse a decade’s worth of deflation, rather than artificially increasing anyone’s rating or granting undeserved titles.”

2. Why are two hypothetical draws vs. 1800s being added to initial rating calculations?

The data also suggests that there is unrealistic variability in initial ratings. Players who receive relatively low initial ratings tend to start immediately outperforming them, and players who receive relatively high initial ratings tend to immediately underperform them.

The two hypothetical draws (50% results) against 1800s (the rating right in the middle of 1400-2000, the new rating range of the affected players) would stabilize players’ initial ratings toward the average.

He also hopes this mild inflationary factor will counteract the effect of young, underrated, rapidly-improving juniors:

“Drawing against an 1800-rated opponent is likely a slight overperformance for most unrated players, and so these additional draws will “bake in” a small amount of planned improvement in the initial ratings of new players, so that their expected and natural improvement will not have quite so much impact upon the established rating pool as is the case today.”

Sonas’ full proposal is available here
