
2011 Sim Scores Updated

Posted by Sean Forman on October 5, 2011

I've run the similarity scores for every MLB player to include the 2011 season. Enjoy.

9 Responses to “2011 Sim Scores Updated”

  1. Matt L. Says:

    Would it be possible to develop post-season sim scores? Who is most like Reggie Jackson in post season performance? Or Bob Gibson?

  2. kds Says:

    Sean, thanks for doing this so quickly. I've noticed some anomalous results with the OPS+ numbers on the sim score pages. For example, I went to Albert Pujols through age 31 and then looked at the age 32 to end of career figures for his best sims from the first list. At the bottom of the page there are lines for the average of all 10 and the average of all who are retired. (In this case 9, since only Man-Ram played in 2011.) Manny's OPS+ for age 32 season to end of career is listed as 150. The average including Ramirez is shown as 144. The average w/o him is 145!?! This is not the only strange OPS+ I've seen on the sim pages over the years. These were the very first complete-through-2011 sims I looked at.
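
The arithmetic behind this complaint can be checked with a short sketch. It assumes the two "average" lines are simple unweighted means of the ten listed OPS+ values and that every displayed figure is rounded to the nearest integer; neither assumption is confirmed by the site, and the function names are hypothetical. If the site instead derives the group OPS+ from pooled batting totals, the displayed numbers need not follow this rule.

```python
def mean_without(mean_all: float, n: int, value: float) -> float:
    """Mean of the other n-1 values once `value` is removed from a group of n."""
    return (mean_all * n - value) / (n - 1)


def retired_mean_bounds(mean_all_shown: int, n: int, value_shown: int):
    """Range of possible means for the remaining players, allowing each
    displayed integer to hide up to 0.5 of rounding (an assumption)."""
    low = mean_without(mean_all_shown - 0.5, n, value_shown + 0.5)
    high = mean_without(mean_all_shown + 0.5, n, value_shown - 0.5)
    return low, high


# Figures quoted in the comment: all-10 average 144, Ramirez 150.
exact = mean_without(144, 10, 150)
low, high = retired_mean_bounds(144, 10, 150)
print(f"retired-only mean if figures are exact: {exact:.2f}")
print(f"possible range after rounding: {low:.2f} to {high:.2f}")
```

Under those assumptions, removing an above-average 150 can only pull the mean down, so the retired-only line should display as 143 or 144 and cannot round to 145.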

  3. Andrew Says:

    Sean, I've got an unrelated question. When you calculate oWAR and dWAR, why do you lump in positional value (Rpos) under oWAR? It seems to me that it should go under dWAR, although I guess it can be argued both ways, since offensive production is more valuable when coming from a rarer defensive position.

    I guess my point is that while Babe Ruth is the all-time oWAR leader with 164.6, Brooks Robinson is the all-time dWAR leader with only 27.3. Do we really believe that a player's offensive contributions are worth six times as much as their defensive contributions (and yes, I realize that my method for determining that factor was extremely unscientific)? I feel that switching Rpos from the oWAR column to the dWAR column would do something towards lessening that gap that makes a little bit of sense.

    Just by eyeballing it, Brooks' 69 career positional runs give him another 6.9 dWAR, for a total of 34.2. Meanwhile, since Babe had negative positional value, he actually gains 8.1 oWAR to end up with 172.7. Overall WAR values are unaffected, of course.

    Another value that goes into oWAR but doesn't seem to be entirely offensive is Rrep, the runs between an average player and a replacement player. For a full season of playing time, a player usually gets 22 or 23 runs, which is why we say an average player is usually worth about 2 WAR. However, some of that average player's value would probably come from playing average defense. Even if only 1/7 of all total value is defensive (as implied by the rough calculation above), that's about 3 runs per season of defensive value that we lump under oWAR. Seems counterintuitive to me.
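
The back-of-envelope numbers in this comment can be reproduced with a minimal sketch. It assumes 10 runs per win, which is the conversion implied by turning 69 positional runs into 6.9 WAR. Ruth's positional runs (-81) are inferred from the 8.1 oWAR gain quoted above rather than taken from the site, and the helper name is hypothetical.

```python
RUNS_PER_WIN = 10.0  # assumption: the conversion implied by 69 runs -> 6.9 WAR


def rpos_wins(rpos_runs: float, runs_per_win: float = RUNS_PER_WIN) -> float:
    """Convert positional runs (Rpos) into wins."""
    return rpos_runs / runs_per_win


# Brooks Robinson: adding Rpos to dWAR raises it.
brooks_dwar = round(27.3 + rpos_wins(69), 1)

# Babe Ruth: removing a negative Rpos from oWAR raises it.
# The -81 runs figure is inferred from the comment's 8.1 WAR, not sourced.
ruth_owar = round(164.6 - rpos_wins(-81), 1)

# Rough offense-to-defense ratio from the two all-time leaders.
ratio = 164.6 / 27.3
defensive_share = 1 / (1 + round(ratio))  # about 1/7 of total value

# Defensive portion of a full season of Rrep (22 to 23 runs).
rrep_defense_runs = 22.5 * defensive_share

print(brooks_dwar, ruth_owar)
print(f"ratio {ratio:.2f}, defensive runs in Rrep {rrep_defense_runs:.1f}")
```

The sketch only restates the comment's rough reasoning; it is not how the site computes oWAR or dWAR.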

  4. zuke Says:

    has there ever been an effort to use advanced statistics to measure similarity? using mostly counting stats seems to fly in the face of advanced metrics.

  5. Johnny Twisto Says:

    Andrew/3, the reason it is divided like that is so people who don't like the Total Zone defensive numbers can easily replace it with a different fielding stat.

    I 100% agree with you that the nomenclature is faulty, and leads to more confusion, since the position one plays is part of his defensive value.

    It's been discussed here before. Sean is aware of your (our) POV.

  6. Dvd Avins Says:

    Hanley Ramirez' top two comps through age 27 are the two halves of a double play combo. Can you guess who they are?

  7. Whiz Says:


    It's not advanced statistics, but I have made a similarity score (called the Rate Similarity Score, or RSS) based on rates per PA for 1B, 2B, 3B, HR, BB, and HBP, and SB and CS divided by 1B+BB+HBP (the latter being a rough measure of steal chances). Someone with 1000 PA could then be a very similar hitter to someone with 9000 PA using RSS if they had similar BA, HR rates, etc.

    For players with at least 5000 PA, the two most similar hitters were Harold Baines and Richie Zisk. Pete Rose's five most similar players using RSS were Billy Goodman, Joe Sewell, Kevin Seitzer, Woody English and Dom DiMaggio! None of them show up in his top ten similarity list based on counting stats.

    For more on RSS, see this article
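
The rate components described in this comment can be sketched as follows. Only the inputs come from the comment (per-PA rates for 1B, 2B, 3B, HR, BB and HBP, plus SB and CS divided by 1B+BB+HBP). The way the rates are combined into one number is not given here, so the Euclidean distance below is a stand-in assumption and not the actual RSS formula. The stat lines are invented for illustration, and the code assumes nonzero PA and steal chances.

```python
from math import sqrt

PER_PA_STATS = ("1B", "2B", "3B", "HR", "BB", "HBP")


def rate_profile(stats: dict) -> list:
    """Per-PA rates for the six batting events, plus SB and CS per steal chance."""
    pa = stats["PA"]
    # Rough measure of steal chances, as described in the comment.
    chances = stats["1B"] + stats["BB"] + stats["HBP"]
    per_pa = [stats[key] / pa for key in PER_PA_STATS]
    running = [stats["SB"] / chances, stats["CS"] / chances]
    return per_pa + running


def profile_distance(a: dict, b: dict) -> float:
    """Hypothetical comparison: Euclidean distance between two rate profiles."""
    return sqrt(sum((x - y) ** 2 for x, y in zip(rate_profile(a), rate_profile(b))))


# Invented stat lines: the second is the first scaled up to 9000 PA.
short_career = {"PA": 1000, "1B": 180, "2B": 45, "3B": 5, "HR": 25,
                "BB": 90, "HBP": 5, "SB": 10, "CS": 4}
long_career = {key: value * 9 for key, value in short_career.items()}

print(profile_distance(short_career, long_career))
```

Because every component is a rate, the two invented players come out identical despite a ninefold difference in playing time, which is the property the comment highlights.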

  8. Whiz Says:

    Oops, the link didn't take. Try here.

  9. DavidRF Says:

    Ironically, Bill James intentionally avoided advanced metrics for these similarity scores. He introduced them for his HOF book twenty years ago, and he was trying to incorporate what he thought the biases of HOF voters were at the time. No park adjustments or era adjustments and a heavy emphasis on traditional counting stats because of that.

    I still like them as a "toy", though. The false matches are entertaining and are easy to spot by looking at the OPS+ column in the table. A WAR column would also be illuminating, but they don't have that yet.