

Similarity Scores

Similarity scores are not my concept. Bill James introduced them nearly 15 years ago, and I lifted his methodology from his book The Politics of Glory (pp. 86-106). To compare one player to another, start at 1000 points and subtract points based on the statistical differences between the two players.
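As a rough illustration of the "start at 1000 and subtract" idea, here is a minimal Python sketch. The function name `similarity_score`, the `steps` mapping, and the step sizes in the example are all assumptions made for illustration; they are not the published deductions, and whether fractional points are kept or truncated is also an assumption here.

```python
def similarity_score(a, b, steps, start=1000.0):
    """Start at `start` and subtract one point per `step` of difference.

    `a` and `b` map stat names to career totals. `steps` maps each stat
    name to the size of difference that costs one point. Fractional
    points are kept (an assumption; the real method may truncate).
    """
    score = start
    for stat, step in steps.items():
        score -= abs(a[stat] - b[stat]) / step
    return score


# Illustrative placeholder step sizes, not taken from the source.
example_steps = {"hits": 10, "home_runs": 5}
player_a = {"hits": 2000, "home_runs": 300}
player_b = {"hits": 1850, "home_runs": 280}

print(similarity_score(player_a, player_b, example_steps))  # 981.0
```

Two identical stat lines score exactly 1000, and every difference only ever lowers the score.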


There is also a positional adjustment. Each position has a value, and you subtract the difference between the two players' position values. James uses only the primary position, but I computed an average position for players who had more than one primary position (see Ernie Banks).
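One way the averaged positional adjustment could work is sketched below in Python. The position values are hypothetical placeholders, and weighting the average by games played at each position is an assumption; the text above only says that an average position is computed.

```python
def average_position_value(games_by_position, position_values):
    """Average a player's position value, weighted by games at each position.

    Weighting by games is an assumption made for this sketch.
    """
    total_games = sum(games_by_position.values())
    weighted = sum(position_values[pos] * games
                   for pos, games in games_by_position.items())
    return weighted / total_games


def positional_penalty(a_games, b_games, position_values):
    """Points to subtract: the gap between the two average position values."""
    return abs(average_position_value(a_games, position_values)
               - average_position_value(b_games, position_values))


# Hypothetical values for illustration only, not the real position values.
HYPOTHETICAL_VALUES = {"SS": 100, "1B": 20}

two_position_player = {"SS": 1000, "1B": 1000}  # split career
career_shortstop = {"SS": 2000}

print(positional_penalty(two_position_player, career_shortstop,
                         HYPOTHETICAL_VALUES))  # 40.0
```

With a single primary position the average reduces to that position's value, which matches the simpler approach of using only the primary position.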


Start with a thousand and then subtract the following deductions.

If two pitchers throw with different hands, subtract 10 points if they are starters or 25 if they are relievers. For relievers, halve the winning percentage penalty. For all pitchers, the winning percentage penalty can be no larger than 1.5 times the wins and losses penalty. A reliever is defined as a pitcher with more relief appearances than starts and fewer than 4.00 innings per appearance.
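The pitcher rules above can be sketched in Python as follows. The function names are hypothetical. The sketch assumes that the caller decides whether a pair counts as relievers (the text does not say what happens when a starter is compared with a reliever), that the halving is applied before the 1.5x cap, and that the "wins and losses penalty" arrives as a single precomputed number.

```python
def is_reliever(games, starts, innings):
    """More relief appearances than starts, and under 4.00 innings per game."""
    if games == 0:
        return False
    relief_appearances = games - starts
    return relief_appearances > starts and innings / games < 4.0


def handedness_penalty(throws_a, throws_b, relievers):
    """10 points for opposite-handed starters, 25 for relievers."""
    if throws_a == throws_b:
        return 0
    return 25 if relievers else 10


def winning_pct_penalty(raw_penalty, wins_losses_penalty, relievers):
    """Halve for relievers (assumed to happen first), then cap at 1.5x W-L."""
    penalty = raw_penalty / 2 if relievers else raw_penalty
    return min(penalty, 1.5 * wins_losses_penalty)


print(is_reliever(games=60, starts=0, innings=70.0))    # True
print(is_reliever(games=33, starts=33, innings=220.0))  # False
print(handedness_penalty("L", "R", relievers=True))     # 25
print(winning_pct_penalty(40, 10, relievers=False))     # 15.0
```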

I plugged all this into my database to create the lists you see on the player pages. Note that a player must have 100 innings pitched or 500 at bats before being considered. To be truly accurate you need to look at whole careers, but it is fun to speculate all the same.

Age Based Similarity Scores

These values are computed in exactly the same manner as above. However, instead of comparing an active player's career to the entire careers of retired players, we compare it only to each retired player's career up to the same age as the active player. This gives more interesting lists for active players because we get an idea of what path the player is taking.

This doesn't mean that Vladimir Guerrero was as valuable as Willie Mays over his first three seasons - just that their numbers are similar. The league's offensive levels and defensive value affect those measurements.
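The age-based comparison described above depends on cutting a retired player's career off at the active player's current age. A minimal Python sketch of that step follows; the season record layout (`age` plus a `stats` dict of counting stats) is an assumption for illustration. Rate stats such as batting average would need to be recomputed from the truncated totals rather than summed.

```python
def career_through_age(seasons, age):
    """Sum counting stats over all seasons played at or before `age`."""
    totals = {}
    for season in seasons:
        if season["age"] <= age:
            for stat, value in season["stats"].items():
                totals[stat] = totals.get(stat, 0) + value
    return totals


# Made-up seasons for a hypothetical retired player.
retired_seasons = [
    {"age": 22, "stats": {"hits": 150, "home_runs": 20}},
    {"age": 23, "stats": {"hits": 180, "home_runs": 30}},
    {"age": 24, "stats": {"hits": 200, "home_runs": 35}},
]

# An active player who is 23 is compared with this truncated line only.
print(career_through_age(retired_seasons, 23))
# {'hits': 330, 'home_runs': 50}
```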

Age Path Similar Players

For each season a player played, I have also computed the most similar player at that point in his career. I only have room to show the single most similar player, but the list can reveal players who peaked at early or late ages. Ruben Sierra comes to mind.
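To make the age path idea concrete, here is a toy Python sketch that finds, at each age, the comparison player whose career totals through that age are closest. It is not the site's implementation: it tracks a single stat (hits), uses a made-up `toy_score` in place of the full similarity score, and all names and numbers are hypothetical.

```python
from itertools import accumulate


def totals_by_age(hits_by_age):
    """Map each age to cumulative hits through that age."""
    ages = sorted(hits_by_age)
    running = accumulate(hits_by_age[age] for age in ages)
    return dict(zip(ages, running))


def toy_score(a_total, b_total):
    """Stand-in for the real score: 1000 minus a scaled difference."""
    return 1000 - abs(a_total - b_total) / 10


def age_path(target, others):
    """For each age the target played, name the most similar other player."""
    path = {}
    for age, total in totals_by_age(target).items():
        best_name, best_score = None, float("-inf")
        for name, seasons in others.items():
            other_totals = totals_by_age(seasons)
            if age not in other_totals:
                continue  # no season at this age, so skip the player
            score = toy_score(total, other_totals[age])
            if score > best_score:
                best_name, best_score = name, score
        path[age] = best_name
    return path


target = {21: 100, 22: 150, 23: 60}
others = {
    "early_peaker": {21: 110, 22: 160, 23: 150},
    "late_bloomer": {21: 40, 22: 90, 23: 180},
}

print(age_path(target, others))
# {21: 'early_peaker', 22: 'early_peaker', 23: 'late_bloomer'}
```

A change in the best match from one age to the next is the kind of signal that hints at an early or late peak.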