Score for Score

Posts tagged "Chart of the Day"

Today, I'm sharing an awesome chart by @mbt510 on Twitter. The chart shows the distribution of WAG execution scores by event at the 2019 World Championships.

I love this chart because it quantifies trends that get tossed around all the time. In particular,

    Vault e-scores are way higher than any other e-scores.
    Beam e-scores are way lower than any other e-scores.
    Vault scores are far more tightly clustered than the scores on any other event are.

It's worth considering whether each of these trends reflects real differences in gymnasts' performances or structural issues with the Code of Points.

It's pretty clear that the high vault e-scores are due to the Code of Points. An entire vault routine is one skill that lasts about two seconds, meaning there are just fewer opportunities to deduct. Despite this, the individual deductions that can be taken on vault are the same magnitude as the individual deductions that can be taken on other events: a fall is a one-point deduction on every event, but there's only one chance to fall on vault. An amazing vault isn't actually more amazing than an amazing floor routine, even though it will almost certainly receive a higher e-score. The discrepancy is due to the design of the Code.

Beam e-scores, however, might be lower because beam is just a riskier event than the others. While I haven't collected the data, I would guess that there are more falls on beam than on any other apparatus. That said, we can't forget that the Code provides more opportunities to take deductions on beam than it does on any other event. Many beam deductions, including those for pauses and balance checks, can be taken when the gymnast isn't even performing a skill.

Finally, we see far less separation between the best and the worst on vault than we do on beam. I'm not sure whether this is the Code's fault. An amazing vault and a crappy vault look a lot more similar than an amazing beam routine and a total splatfest. The way I think of it, a casual fan would find it much easier to distinguish good and bad beam routines than vaults. Even so, I would be curious to know whether gymnasts and coaches believe that the level of training and technique needed to elevate a bad routine to a good one is equal across all four events. Even though a good and bad vault may look sort of similar, if equal work is required to close the gap between good and bad on all events, then the improvement should be rewarded equally.

This chart forces us to ask a question: should the Code of Points be revised to even out e-scores on each event? I've spent a lot of time trying to imagine ways in which these discrepancies actually impact the results of a competition. Given that everyone competes under the same skewed system, such ways are rarer than you might think. It actually doesn't matter that vault e-scores are high if everyone gets high vault scores.

However, we have to think carefully about the impact this system has on gymnasts with different strengths. For example, consider two AAers who are each perfectly average on three events, but one specializes in vault and the other specializes in beam. In theory, they should be equally competitive in the all-around because they each have one great event. In practice, however, the beamer might have an advantage in an all-around competition because the difference between her beam routine and everyone else's gives her more points than the difference between the vaulter's vault and everyone else's. I haven't noticed this happening in practice, but these hypotheticals are what we need to consider.
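To put rough numbers on that hypothetical, here is a minimal sketch. The standard deviations are invented for illustration and are not read off the chart; the only assumption that matters is that beam e-scores are more spread out than vault e-scores.

```python
# Hypothetical e-score spreads (standard deviations) per event.
# These values are made up for illustration, not taken from the chart.
EVENT_SD = {"vault": 0.30, "beam": 0.70}


def points_gained(event: str, z: float) -> float:
    """Points above the field average for a gymnast who sits
    z standard deviations above the mean on the given event."""
    return z * EVENT_SD[event]


# Two specialists who are equally "great" in relative terms (both +2 SD).
vaulter_edge = points_gained("vault", 2.0)
beamer_edge = points_gained("beam", 2.0)

print(f"vault specialist gains {vaulter_edge:.2f} points over the field")
print(f"beam specialist gains {beamer_edge:.2f} points over the field")
```

Under these made-up spreads, the two gymnasts are equally exceptional relative to their peers, but the beam specialist banks more all-around points simply because beam scores are more spread out.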

One final note on the chart: the curves in this graph don't literally measure how many gymnasts got each score on each event. Instead, each curve is a normal distribution fitted to the mean and standard deviation of e-scores for each event. If you're interested in the raw counts for each event, you can find them here.
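For anyone curious what "fitted to the mean and standard deviation" looks like in practice, here is a small sketch using Python's standard library. The scores are placeholders rather than real 2019 Worlds data, and I don't know which tool the chart's author actually used.

```python
from statistics import NormalDist

# Placeholder e-scores for one event (not real competition data).
scores = [8.1, 8.4, 8.6, 8.7, 8.9, 9.0, 9.1]

# from_samples estimates the mean and the sample standard deviation,
# then returns the normal distribution with those two parameters.
fitted = NormalDist.from_samples(scores)

# Evaluate the fitted density on a grid of scores to get a smooth curve.
xs = [round(7.5 + 0.1 * i, 1) for i in range(21)]  # 7.5, 7.6, ..., 9.5
curve = [(x, fitted.pdf(x)) for x in xs]

# The grid point where the fitted curve is highest.
peak_x = max(curve, key=lambda point: point[1])[0]
print(f"mean={fitted.mean:.3f}, sd={fitted.stdev:.3f}, curve peaks near {peak_x}")
```

The curve is a summary of the two fitted parameters, so it can suggest scores that nobody actually received; that is why the raw counts are worth a look too.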

This chart raises more questions than it answers -- but all such questions are easier to answer when you're armed with the data.

Got a gym-related chart? Want to see it featured here? DM me on Twitter (@scoreforscore) or e-mail me at!

Tags: Chart of the Day

We hear a lot about how Simone Biles is the world’s most dominant athlete, but it’s really hard to quantify that. Today’s Chart of the Day displays one possible way to think about dominance: how big is a gymnast’s margin of victory?

This chart was created by The Medal Count (@TheMedalCount_), who gathered information on the margin of victory in World and Olympic competitions going back to - wait for it - 1934! The data from 1950 onwards is displayed below.

Biles actually cannot lay claim to the largest margin of victory ever. That honor goes to Helena Rakoczy of Poland, whose score of 94.016 at the 1950 World Championships beat out the silver medalist's score by 2.316 points - albeit at a poorly attended competition under an anomalous scoring system that has never been used at any other worldwide competition.

The smallest margin of victory occurred in 1985, when Oksana Omelianchik and Elena Shushunova famously tied. The next smallest margin happened in 2005, when rounding conventions gave Chellsie Memmel an extra thousandth of a point over Nastia Liukin. While Rakoczy's victory may not live on in gymternet memory, these two competitions certainly do.

More generally, this figure shows that the margin of victory is largely a function of the era. Margins were relatively large in the early years, when compulsories and optionals were both counted for a maximum total score of 80 points. The larger total range allowed for a larger range of scores. By the late 1980s, judging trends had pushed scores upwards to the top of the range, and from 1989 onwards, only optional scores were counted. Perfect tens were being handed out like candy, and margins of victory fell to minuscule levels. The Code was updated to make perfect scores harder to achieve, leading to higher margins through the late 2000s. However, it's not until the 2006 introduction of the open-ended code of points that we begin to see true separation between the best and the rest.

In the modern era, Simone's 2016 Olympic win truly stands out: she beat out silver medalist Aly Raisman by a shocking 2.100 points. However, I would not say that this single performance is what makes her truly dominant. To my mind, dominance comes from consistent, repeated wins over time. How should we measure that? That's a longer discussion for another time.
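For reference, the margins quoted in this post can be lined up with a few lines of Python. The margins come from the text above; the 1950 silver total is not quoted directly and is only implied by subtracting the margin from Rakoczy's score. `Decimal` is used so that thousandths don't pick up floating-point noise.

```python
from decimal import Decimal

# All-around margins of victory mentioned in this post, keyed by year.
margins = {
    1950: Decimal("2.316"),  # Rakoczy
    1985: Decimal("0.000"),  # Omelianchik / Shushunova tie
    2005: Decimal("0.001"),  # Memmel over Liukin
    2016: Decimal("2.100"),  # Biles over Raisman
}


def margin_of_victory(gold: str, silver: str) -> Decimal:
    """Winner's total minus the runner-up's total."""
    return Decimal(gold) - Decimal(silver)


# 1950 silver total implied by the post: 94.016 - 2.316 = 91.700
rakoczy_margin = margin_of_victory("94.016", "91.700")

# Largest margin first.
ranked = sorted(margins.items(), key=lambda item: item[1], reverse=True)
for year, margin in ranked:
    print(year, margin)
```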

If you’re interested in further discussion of these trends, as well as some even older data, be sure to check out The Medal Count’s original post!

Got a gym-related chart? Want to see it featured here? DM me on Twitter (@scoreforscore) or e-mail me at!

Tags: Chart of the Day

A quick Chart of the Day as we enter NCAA postseason!

The folks at College Gym News are running a bracket competition for the rest of the season. They've made the standings available in a spreadsheet online.

Just for fun, I took a look at which teams are going all the way in the brackets that have been submitted so far.

Does this measure each team's chances of winning? Maybe. More likely, it measures gymternet sentiment. But fun to look at either way!

Got a gym-related chart? Want to see it featured here? DM me on Twitter (@scoreforscore) or e-mail me at!

Tags: Chart of the Day

The NCAA season is upon us, so our latest Chart of the Day is all about scoring in college gymnastics. This awesome chart was put together by the creator of the NCAA Gym Stats blog (@ncaagymstats on Twitter).

NCAA scores have an even looser relationship with objectivity than elite scores have. One thing that makes this obvious is the pattern in NCAA scores over time. So many school records for “highest team score” or “gymnast with the most 10s” were set in the early 2000s, despite the fact that college gymnasts just keep getting better and better. And that’s because early 2000s scores were off the charts.

They’re not off this chart though! (Sorry.)

This shows us the number of all-around scores higher than 39.7, 39.8, and 39.9 in each year. In 2004, almost seventy all-around performances were worthy of scores at or above a 39.7, and more than twenty (!!) were at or above 39.8.
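If you want to build this kind of tally yourself, the counting step is short. This is only a sketch: the records below are invented, and counting scores "at or above" each threshold is my assumption about how the chart was built.

```python
from collections import defaultdict

# Invented (year, all-around score) records, not real NCAA results.
records = [
    (2004, 39.725),
    (2004, 39.800),
    (2004, 39.650),
    (2018, 39.700),
    (2018, 39.575),
]

THRESHOLDS = (39.7, 39.8, 39.9)


def count_at_or_above(rows, thresholds):
    """For each year, count the scores at or above each threshold."""
    counts = defaultdict(lambda: {t: 0 for t in thresholds})
    for year, score in rows:
        for t in thresholds:
            if score >= t:
                counts[year][t] += 1
    return dict(counts)


counts = count_at_or_above(records, THRESHOLDS)
for year in sorted(counts):
    print(year, counts[year])
```

Each score is counted under every threshold it clears, so the 39.7 line always sits at or above the 39.8 line, which in turn sits at or above the 39.9 line.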

This chart really shows how scores are currently creeping up towards that 2004 peak. There were more all-around scores of 39.7 or higher in 2018 than there have been at any time since 2004. I’m sure that some of the scoring changes proposed this year are a response to this trend, especially the much-discussed deduction for pauses on balance beam.

More generally, I’m interested in that up-and-down cycle you can clearly spot in the pink line on the chart above. Every five years or so, scores creep up for a bit, then back down for a bit, then back up. Now, I don’t follow conversations amongst coaches and judges closely enough to know exactly how and why this occurs. But I’m guessing there are usually a few competing considerations when we sit down and think about the ideal level of scoring in any given year.

In the short run, higher scores are good for the sport. Fans go crazy every time they see a ten, and breaking record after record looks great on paper. As long as gymnasts and teams are ranked correctly, no harm done. But in the long run, higher scores hurt the sport. The sixth ten at a meet is less exciting than the first. Big-name gymnasts realize that their best routines are rewarded just like their average ones, and it becomes obvious to everyone that all tens are not created equal. Judges lose the ability to rank correctly, and mutterings about how gymnastics isn’t a real sport start making a little more sense.

Like most of the gymternet, I’m strongly in favor of cracking down on existing deductions instead of simply adding new ones to the code. The problem of lax enforcement currently outweighs the problem of a screwy code of points - as evidenced by the massive gap between NCAA scores and JO scores despite the similarities in their codes.

I’m curious if anyone can give me a bit of a history lesson about what happened in 2004: exactly how did the scoring crackdown occur? Let me know in the comments!

Got a gym-related chart? Want to see it featured here? DM me on Twitter (@scoreforscore) or e-mail me at!

Tags: Chart of the Day

It's the second Score for Score chart of the day! This one is a bit out of date but worth featuring nevertheless. After the 2017 worlds, the next FIG newsletter featured some really interesting numbers about the competition. I've looked for something similar for 2018 with no luck - if anyone has found it, please send it my way!

There are a lot of interesting summary statistics in the report. They break down medals by region, and they show the average D and E score on each event. (Spoiler: beam scores were low, vault scores were high. Shocking, I know.) It's really great that someone at the FIG is bothering to look at this sort of thing.

But my favorite is this figure showing the number of competitors by year of birth.

For me, this graph has some big takeaways for two groups of people.

To people who say that gymnasts are "little girls dancing": a full 46% of gymnasts competing at the 2017 World Championships were over the age of 18. And 27% were over the age of 20!

But, to people who say that gymnastics is for "grown-ass women": while some older gymnasts are certainly successful, 1 in 4 women at the 2017 World Championships was a first-year senior. While some gymnasts can make it in the long run, it's a lot easier to get to the highest level when you're young.

I do wonder how this changes in each year of a quad. In the post-Olympic year, many more experienced gymnasts take breaks or retire. Those with the energy to make it to Worlds are usually the new seniors who didn't just put their bodies through the grueling Olympic process. It's totally possible that 2019 Worlds will have a very different age profile from the 2017 Worlds.

Also. Who does the charts for FIG? And can I have their job please? There are multiple graphs in this report with the wrong labels as well as more than a few misleading and/or hard-to-read design features. Dear lord.

Even this chart is sort of a case study in misleading data visualization. By treating the year of birth as a categorical variable instead of a continuous variable, the graph masks just how long the left-hand tail is. The distance on the x-axis from 2000 to 2001 is the same as the distance from 1975 to 1987. If I were making the chart, it would look something like this:

It tells a different story, right?
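The axis problem is easy to see in numbers. Here is a tiny sketch using only the four birth years named above (the FIG chart has more bars than this): a categorical axis places each year at its rank, while a continuous axis places it at its actual value.

```python
# Birth years named in this post; the real chart includes more years.
birth_years = [1975, 1987, 2000, 2001]

# Categorical axis: each year gets the next slot, whatever its value.
categorical_x = {year: slot for slot, year in enumerate(sorted(birth_years))}

# Continuous axis: each year is positioned at its actual value.
continuous_x = {year: year for year in birth_years}


def gap(positions, earlier, later):
    """Horizontal distance between two birth years on a given axis."""
    return positions[later] - positions[earlier]


print("categorical:", gap(categorical_x, 1975, 1987), "vs", gap(categorical_x, 2000, 2001))
print("continuous: ", gap(continuous_x, 1975, 1987), "vs", gap(continuous_x, 2000, 2001))
```

On the categorical axis both gaps are one slot wide; on the continuous axis the 1975 to 1987 gap is twelve times wider than the 2000 to 2001 gap, which is what stretches out the left-hand tail.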

Anyway, I think that 2017 report was really cool and I hope someone is doing something similar this year.

Got a gym-related chart? Want to see it featured here? DM me on Twitter (@scoreforscore) or e-mail me at!

Tags: Chart of the Day