A further study of USAT scoring
Recently, USAT revamped their scoring site. Some things became easier and a few became more difficult, but overall I think the change was positive because they now publish more information on how they calculate your race score.
In the “details” column, they actually show the math and the par score used to calculate the race score. Since this is the first time they’ve done this, it’s really my first opportunity to verify that I understand how they run their calculations. To test this, I picked a small local race, for three reasons: it was very small, so I had less digging to do for historical data; I already had a lot of the historical data; and it’s one of the first few races of the season, so I had hoped the data would be published quickly.
However, the local race director isn’t exactly always on top of things, so even though the race went off in January, the results were only published on USAT this week. I used the Shelbyville Triathlon Race 1 for my comparison. To follow their protocol, I ranked all competitors in order of finish. Then I looked up the previous year’s ranking for every single competitor. Yes, it’s as painful as it sounds. It’s much easier for them, since they should have all the data in a database they can cross-reference, while I have to look each one up by hand because they only publish it in PDF format. Ugh.
Regardless, I then stripped out all the competitors without a ranking from the previous year (typically this means they didn’t have the requisite 3 races in the prior year to receive a score). Of those who were left, I stripped off the top 20% and the bottom 20%. That leaves the theoretical middle of the pack, who are called the “pace setters.” All the calculations from here on use only the pace setters. For each of them, I take this year’s time (in minutes), multiply it by their ranking score from last year, and divide by 100. This gives the “par time” in minutes for each pace setter. Averaging the par times of all the pace setters gives the overall par time.
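For anyone who would rather not do this by hand, the steps above can be sketched in a few lines of Python. This is only my reading of the protocol, not USAT’s actual code: the function names, the input layout (a list of finish time and last-year score pairs, with None for unranked racers), and the choice to round the 20% cut down to a whole number of racers are all my own assumptions.

```python
from statistics import mean

def pace_setters(results, trim_percent=20):
    """Return the middle-of-the-pack racers used for the par time.

    results: list of (finish_minutes, last_year_score) tuples, where
    last_year_score is None for racers without a ranking last year.
    (This layout is an assumption made for the sketch.)
    """
    # Keep only racers who had a ranking score last year.
    scored = [(t, s) for t, s in results if s is not None]
    # Rank by finish time, fastest first.
    scored.sort(key=lambda r: r[0])
    # Strip the top and bottom 20% (rounding down is an assumption).
    cut = len(scored) * trim_percent // 100
    return scored[cut:len(scored) - cut]

def par_time(results, trim_percent=20):
    """Average of (finish time * last year's score / 100) over the pace setters."""
    setters = pace_setters(results, trim_percent)
    return mean(t * s / 100 for t, s in setters)
```

With 40 scored racers this trims 8 from each end and averages over the remaining 24. If nobody in the race has a score from last year, mean() raises an error, which seems like the right outcome since there is no par time to compute.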
For my example, Shelbyville Race 1, I had 114 participants. Only 40 of those had scores from last year, so I stripped off the top 8 and the bottom 8. Then I used the scores for the middle 24 and did the calculations described above. The par time for my calculation turned out to be 30.7147. Using that, I was able to predict what the participants’ scores *should* be (if I did it correctly).
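Going from the par time to a predicted score is just the par-time formula run backwards. A minimal sketch, assuming the race score is the par time expressed as a percentage of the racer’s finish time (my inference from the formula above, not something USAT spells out); the function name is made up for illustration.

```python
def race_score(par_minutes, finish_minutes):
    """Predicted race score for one racer.

    Assumed formula: inverts par = time * score / 100,
    so score = 100 * par / time.
    """
    return 100.0 * par_minutes / finish_minutes

# A hypothetical 35-minute finish against my par time of 30.7147:
print(round(race_score(30.7147, 35.0), 3))  # 87.756
```

Under this assumption, someone who finishes exactly at the par time scores 100, and faster finishes score above 100.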
Since they’ve now released the data, I was able to double-check my work. It turns out they calculated the par time as 30.3988. I can’t tell exactly how they got there because they don’t give the details behind that number, but my guess is that some of the people I thought weren’t scored actually were. When you look through the results, it’s hard to find Beth Whoever when her actual name per USAT is Elizabeth Whoever, and that happened quite a few times from what I saw. I’m sure I missed some people due to shortened or lengthened names like that.
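If I ever automate the lookup, the Beth/Elizabeth problem is the first thing I’d have to handle. Here is a rough sketch of one way to do it: normalize first names through a small nickname table before comparing. The table below is a made-up starter list (only the Beth entry comes from this race), and the function name is hypothetical, so treat it as an illustration rather than a complete matcher.

```python
# Hypothetical nickname table; extend it as mismatches turn up.
NICKNAMES = {
    "beth": "elizabeth",
    "liz": "elizabeth",
    "mike": "michael",
    "bill": "william",
}

def canonical_name(full_name):
    """Lowercase a name and expand a known nickname in the first position."""
    parts = full_name.lower().split()
    if not parts:
        return ""
    # Swap the first name for its long form when we recognize it.
    parts[0] = NICKNAMES.get(parts[0], parts[0])
    return " ".join(parts)

# Race results and the rankings list would then be matched on this key:
print(canonical_name("Beth Whoever") == canonical_name("Elizabeth Whoever"))  # True
```

This would still miss middle names, typos, and nicknames that aren’t in the table, so anything left unmatched would need a manual look.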
Either way, I predicted the top score to be a 91.868 for Jeremy Brown and my friend Bryan Wiemers to have an 84.342. Because my par time was a bit higher than USAT’s, their actual scores were a bit lower: Jeremy scored a 90.909 and Bryan scored an 83.461. My method wasn’t perfect, and it was pretty tedious. Unless I find a way to automate this process, I sure wouldn’t want to take this on for a race any larger than this one (which almost all of them are). But I feel like I was finally able to verify that I understand what they are doing, and I’m glad they are sharing more information now.
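As a sanity check on that last point: if the score really is proportional to the par time (my assumption, see above), then swapping my par time for USAT’s should move my predictions almost exactly onto their published scores. A quick sketch, with a helper name I made up:

```python
MY_PAR = 30.7147    # par time from my hand calculation
USAT_PAR = 30.3988  # par time published by USAT

def rescale(score, old_par, new_par):
    """Re-express a predicted score under a different par time.

    Assumes score is proportional to par time (score = 100 * par / time).
    """
    return score * new_par / old_par

for name, predicted, actual in [
    ("Jeremy Brown", 91.868, 90.909),
    ("Bryan Wiemers", 84.342, 83.461),
]:
    adjusted = rescale(predicted, MY_PAR, USAT_PAR)
    print(f"{name}: adjusted {adjusted:.3f} vs. USAT {actual:.3f}")
```

The adjusted numbers come out around 90.92 and 83.47, within a couple hundredths of a point of the published scores, which suggests the remaining gap is mostly rounding and that the difference in par time explains nearly all of my error.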