Beam Scoring: Montreal 2017 vs Doha 2018

by Brina Oct. 28, 2018

This time last year, there was one thing on everyone's mind: what in the world are the beam judges doing??

Okay, there might have been other things on people's minds. Iordache tore her Achilles, Uchimura hurt his ankle, everyone else hurt everything else, there was a literal hole in the Gymnova floor... but let's be real. The beam scores were the real story. Only one gymnast received an e-score above 8.0 the entire meet!

So in the lead-up to Doha, a lot of people were bracing themselves for beam scores to get slammed again. There seemed to be signs that the FIG was taking fluidity seriously on beam: for instance, one of the example videos used to standardize judging emphasizes that deductions should be taken for pauses.

But in the past two days of qualification, we honestly haven't seen anywhere close to Montreal-level judging on beam. This really hit home for me when I saw Riley McCusker's score: she received a 13.1 after a fall. Last year, that score would have been just a tenth shy of qualifying her for the beam final.

So just how much looser has scoring been in Doha? I thought I would take a look at this more empirically, so I pulled the top 20 scores on beam from qualifications in Montreal and in Doha. Why just the top 20? I only want to look at gymnasts who performed well. I want to be sure that lower scores indicate stricter judging, not lots of falls.

Then, I pulled all the 2017 beam scores for the gymnasts in the top 20 in Montreal, and all the 2018 beam scores for the gymnasts in the top 20 in Doha. I took the mean of these scores for each gymnast in each year, so that we could compare each gymnast's average non-Worlds score to her Worlds score. However, these averages also include routines with falls and other major errors, so I also took each gymnast's maximum non-Worlds score for comparison.
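For anyone who wants to try this at home, here is a minimal sketch of that per-gymnast summary in Python. The record layout, the `summarize` helper, and the scores themselves are made up for illustration; they are not the data or the code behind this post.

```python
from collections import defaultdict
from statistics import mean

# Hypothetical records: (gymnast, meet, beam score). Not real results.
scores = [
    ("Gymnast A", "Worlds QF", 13.1),
    ("Gymnast A", "Meet 1", 13.9),
    ("Gymnast A", "Meet 2", 12.7),
    ("Gymnast B", "Worlds QF", 13.5),
    ("Gymnast B", "Meet 1", 14.2),
    ("Gymnast B", "Meet 2", 14.0),
]

def summarize(records, worlds_label="Worlds QF"):
    """Return each gymnast's Worlds score plus the mean and max of her other scores."""
    worlds = {}
    others = defaultdict(list)
    for gymnast, meet, score in records:
        if meet == worlds_label:
            worlds[gymnast] = score
        else:
            others[gymnast].append(score)
    summary = {}
    for gymnast, worlds_score in worlds.items():
        non_worlds = others.get(gymnast)
        if not non_worlds:
            continue  # nothing to compare against, so skip this gymnast
        summary[gymnast] = {
            "worlds": worlds_score,
            "mean_other": mean(non_worlds),  # includes routines with falls
            "max_other": max(non_worlds),    # rough proxy for her best hit routine
        }
    return summary

summary = summarize(scores)
for gymnast, row in summary.items():
    print(gymnast, row)
```

The mean gives a sense of a typical score across the season, while the max stands in for a hit routine.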

First, I compared the average Worlds QF score for those 20 gymnasts in each year with the average of their mean scores at non-Worlds competitions.

Year | Worlds QF Score | Mean Non-Worlds Score | Difference

*** significant at p<0.01, ** significant at p<0.05, * significant at p<0.1

The average beam score for the top beam workers in Montreal was significantly lower than the average score for those same gymnasts at all other meets that year - and I mean statistically significant, even though I'm only looking at 20 gymnasts. This is pretty surprising. In general, gymnasts try to peak at Worlds, and they do better there than they've done all season. Plus, don't forget that we're only looking at hit Worlds routines compared to all other routines.
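If you're curious what a significance check like this can look like, here is a rough sketch. It assumes a paired t-test, which is one natural choice when the same gymnasts are measured twice; that choice is an assumption made for illustration, not a description of the test used for the tables in this post. The `paired_t` helper and the four pairs of scores are hypothetical.

```python
from math import sqrt
from statistics import mean, stdev

def paired_t(worlds, other):
    """Paired t statistic for per-gymnast differences (Worlds minus non-Worlds).

    Assumes both lists are in the same gymnast order and have the same length.
    Returns (mean difference, t statistic, degrees of freedom).
    """
    diffs = [w - o for w, o in zip(worlds, other)]
    n = len(diffs)
    standard_error = stdev(diffs) / sqrt(n)  # stdev() is the sample std. deviation
    return mean(diffs), mean(diffs) / standard_error, n - 1

# Hypothetical scores for four gymnasts. Not real results.
worlds_qf = [13.0, 13.2, 12.8, 13.4]
mean_non_worlds = [13.5, 13.4, 13.5, 13.6]

mean_diff, t_stat, df = paired_t(worlds_qf, mean_non_worlds)
print(f"mean difference = {mean_diff:.2f}, t = {t_stat:.2f}, df = {df}")
```

With 20 gymnasts there would be 19 degrees of freedom, and a two-sided test at the 5% level would need the t statistic to be beyond roughly 2.09 in absolute value.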

The same cannot be said of Doha. Gymnasts actually scored slightly higher in Doha than they did on average at their other 2018 meets, which is more in line with my expectations (see above). However, this difference isn't statistically significant.

I also wanted to try the same thing using gymnasts' top scores at non-Worlds meets so that we're only comparing hit routines with hit routines.

Year | Worlds QF Score | Max Non-Worlds Score | Difference

*** significant at p<0.01, ** significant at p<0.05, * significant at p<0.1

Of course, the Montreal scores are way lower than the gymnasts' top scores at other meets that year. However, the Doha scores have also been significantly lower than the same gymnasts' top scores at other 2018 meets. So the scoring in Doha is still a little tighter than it has been elsewhere -- just not to the same extent that we saw in Montreal.

To get a better sense of the full distribution, I've made a k-density (kernel density) plot of the mean non-Worlds scores versus the Montreal Worlds scores, as well as the maximum non-Worlds scores versus the Montreal Worlds scores. Without getting into the statistics, the k-density plot just shows us a nice smooth distribution of the scores for all 20 gymnasts.
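For the curious, the idea behind a kernel density curve can be sketched in a few lines of Python. This version assumes a Gaussian kernel and a common normal-reference rule of thumb for the bandwidth; both are illustrative choices, not necessarily the settings behind the plots in this post, and the sample scores are hypothetical.

```python
from math import exp, pi, sqrt
from statistics import stdev

def gaussian_kde(samples, bandwidth=None):
    """Return a function that estimates the density of `samples` at a point x."""
    n = len(samples)
    if bandwidth is None:
        # Rule-of-thumb bandwidth: 1.06 * sample std. deviation * n^(-1/5)
        bandwidth = 1.06 * stdev(samples) * n ** (-1 / 5)

    def density(x):
        # Each score contributes a small bell curve centered on itself;
        # the estimate at x is the average height of those curves.
        total = sum(exp(-0.5 * ((x - s) / bandwidth) ** 2) for s in samples)
        return total / (n * bandwidth * sqrt(2 * pi))

    return density

# Hypothetical beam scores. Not real results.
worlds_scores = [12.8, 13.0, 13.2, 13.4, 13.1]
density = gaussian_kde(worlds_scores)

for x in (12.5, 13.0, 13.5):
    print(f"density at {x}: {density(x):.3f}")
```

Plotting `density` over a range of scores for two groups gives the kind of overlaid smooth curves described here.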

It's immediately obvious that scores were way lower in Montreal than they were during the rest of 2017: you can see that the blue distribution is well to the left of the red distribution in both plots.

Here's the same thing for Doha. The difference is much less extreme.

So at the end of the day, Doha is not Montreal. Even if Simone hadn't named her kidney stone the Doha Pearl, even if Aliya hadn't made bars finals after having a baby, we still wouldn't be talking about the crazy beam scores.

But, domestic judges, take note: even this year, the best beam scores at Worlds are coming in below the best beam scores elsewhere. Wouldn't it be nice if we could all agree to score the same way?

