Score for Score

Who's Inconsistent? Reputation vs. Statistics

by Brina Aug. 8, 2019

As we draw closer to Worlds team selection, one question is being asked more and more: "Is she consistent?"

I've spent a lot of time thinking about how to measure consistency in gymnastics, and I've developed a metric for doing so. You'll see these consistency stats at the top of the page for every gymnast in my database who has enough information.

You can find the details behind this metric here, but there are two things that are important to understand. First, lower consistency stats are better; higher stats indicate a more inconsistent gymnast. Second, the consistency stats attempt to measure how well a gymnast hits in competition. That means we're looking at execution deductions and neutral deductions (out of bounds, over time, etc.) and not difficulty.

Below, I've pulled the current all-around consistency stats for ten top US elites. These draw upon all the scores in my database from the past year for which I have d-score information.

Gymnast          Country  Consistency Stat
Aleah Finnegan   USA      0.499
Grace McCallum   USA      0.146
Jade Carey       USA      0.207
Jordan Chiles    USA      0.319
Kara Eaker       USA      0.344
Leanne Wong      USA      0.109
Morgan Hurd      USA      0.203
Riley McCusker   USA      0.327
Simone Biles     USA      0.356
Sunisa Lee       USA      0.250

But with so much conversation around consistency ahead of team selection, I wanted to understand how these data-driven consistency metrics compare to the talk going 'round the gymternet. To do this, I shared a survey on Twitter asking fans to mark which gymnasts they think have a reputation for being inconsistent. The respondents are by no means a representative sample of any population -- but they are an awesome group of 173 people who helped make this post possible.

The chart below shows the gymnasts' inconsistency reputation rating: of those respondents who answered yes or no, how many thought, "Yes, this gymnast has a reputation for being inconsistent." For the raw survey results and more details on the survey methodology, see this companion post.
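For anyone who wants to reproduce the rating from the raw survey results, here is a minimal sketch. It assumes the rating is expressed as a share of the yes and no answers, with any other response left out of the denominator. The function name and the vote counts are made up for illustration; they are not the actual survey numbers.

```python
def reputation_rating(yes_votes, no_votes):
    """Share of yes/no respondents who answered yes.

    Assumption: respondents who gave neither a yes nor a no are
    excluded, so only these two counts enter the calculation.
    """
    total = yes_votes + no_votes
    if total == 0:
        raise ValueError("need at least one yes or no answer")
    return yes_votes / total


# Hypothetical counts, NOT the real survey results.
example_rating = reputation_rating(yes_votes=30, no_votes=70)
print(f"Inconsistency reputation rating: {example_rating:.0%}")
```

Dropping the non-answers means gymnasts whom fewer people had an opinion about are still rated on the same 0-to-1 scale as everyone else.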

Already, we can see some differences between the Score for Score consistency stats and the gymternet's inconsistency reputation rating. To visualize the differences, I plotted both metrics on the same chart.

As we'd expect, gymnasts with a higher consistency stat tend to have a greater reputation for inconsistency amongst the gymternet. The correlation is 0.38. Taken as a correlation coefficient, that means about 14% (0.38 squared) of the variation in gymnasts' inconsistency reputation ratings can be explained by the variation in their consistency stats.
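A quick sketch of the arithmetic behind that kind of statement, assuming the 0.38 is a Pearson correlation coefficient (the post doesn't say which kind of correlation was used). The share of variation explained in a simple linear relationship is the square of the coefficient, not the coefficient itself. The function names here are my own for illustration.

```python
from math import sqrt


def pearson_r(xs, ys):
    """Pearson correlation coefficient between two equal-length lists."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two lists of the same length, 2+ items each")
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined when a list is constant")
    return cov / sqrt(var_x * var_y)


def variance_explained(r):
    """Share of variation explained in a simple linear fit: r squared."""
    return r * r


# Assumption: the reported 0.38 is a Pearson r.
reported_r = 0.38
share = variance_explained(reported_r)
print(f"r = {reported_r}, r squared = {share:.2f}")
```

To run `pearson_r` on the real data, pass the ten consistency stats as one list and the matching reputation ratings from the survey as the other, in the same gymnast order.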

The correlation is lower than I expected, largely due to two gymnasts. In particular, practically no one would ever consider Simone Biles an inconsistent gymnast. And she's not inconsistent - in terms of her win record. However, when we drill down and look at her execution scores and neutral deductions, her results are sort of all over the map. And she might have ended up atop the podium, but she did have a very off day at Worlds last year. It all shows up in her scores.

Aleah Finnegan is also statistically less consistent than her reputation would suggest. I'm guessing that that's largely because she's new on the elite scene and many fans haven't been following her performances much until recently. Plus, there's probably a "my sister is Sarah Finnegan" bump when it comes to having a reputation for consistent hits.

Of course, the reputations might be capturing important information that the scores themselves leave out. Which gymnasts came through in competition but looked scary in podium training? Which ones weren't put up in the team final because they weren't hitting in practice? The scores don't directly reflect any of this information.

But as we look towards the Worlds team selection, it's worth considering the numbers when we talk about a gymnast's consistency. You can't always believe what you read on Twitter.

Tags: Fun with Score Data, Survey Says