Neutralized and Converted Stats

First, what is not considered. There are no adjustments for:

  • Segregation
  • Changing role of bullpens
  • Improved travel or fitness or nutrition
  • Increased foreign talent
  • Field conditions or sizes
  • Changing tactics like the bunt or stolen base
  • Anything else not explicitly listed below

What is adjusted?

We adjust all of a player's seasons from the park and league context in which they were played into either a "neutral" setting (a park factor of 100, a 162-game season, 90% of runs earned, and 688 runs per team) or a setting selected by the user: a particular year, league (with its runs per game and earned-run percentage), and home team (with its park factor). The neutral run level was originally set at 750 runs per team, as Bill James had done in his New Historical Baseball Abstract, but that is well over the historical average, which is about 688 runs per 162 games since 1900. So we've adjusted it downward to reflect this lower average run scoring. With the old number, almost every offensive player improved, while with the new number, about half get better and half get worse.

There is no change of role for a player, so playing time is essentially the same proportion as before, though things like At Bats go up as run scoring increases. This also means that Pedro Martinez will only make 30 starts a season even if you put him in the 1914 AL.

How are batters adjusted?

For each player/season, we have their stats (including Runs Created), the number of games their team played, their team's park factor, the league runs/game, and whether the DH was used that year. Note that for neutralization the basic RC, without baserunning, GIDP, etc., is used as the player's baseline offensive level. This means that when you translate a player back into a context they actually played in, the RC numbers will differ from those listed under Special Batting for recent players. We handle trades in the proportional manner you would expect.

For the new setting we have the park factor, the number of games expected, and the run-scoring level. Games are adjusted proportionally and this creates a multiplier.

Runs created are adjusted using the ratio of park factors, the ratio of run scoring, and a ratio determined by whether we are coming from or going to a DH league.
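The ratios described above can be sketched as a single multiplier on runs created. This is only an illustration: the function and parameter names are hypothetical, and because the text does not give the size of the DH adjustment, dh_ratio is a placeholder that defaults to 1.0 (no adjustment). The playing-time multiplier from the previous paragraph is left out.

```python
def adjust_runs_created(rc, pf, lg_runs, new_pf=100.0, new_lg_runs=688.0, dh_ratio=1.0):
    """Scale runs created from the original context into the target context.

    pf, new_pf           -- park factors (100 is neutral)
    lg_runs, new_lg_runs -- run-scoring level, in the same units for both
                            (e.g. team runs per 162 games)
    dh_ratio             -- ASSUMPTION: placeholder for the DH adjustment,
                            whose actual value is not stated in the text
    """
    park = new_pf / pf               # ratio of park factors
    league = new_lg_runs / lg_runs   # ratio of run scoring
    return rc * park * league * dh_ratio


# Hypothetical hitter: 100 RC in a 105 park, in a league scoring 800 runs per team
neutral_rc = adjust_runs_created(100.0, pf=105.0, lg_runs=800.0)
print(round(neutral_rc, 1))
```

Only the ratio of the two run-scoring levels matters, so runs per game would work just as well as runs per 162 games, provided both values use the same unit.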

As part of this conversion, we assume that a player will get singles, doubles, triples, home runs, walks, HBP, and SB in the same proportion to overall hits as before, so given our adjusted RC, we can then find a new level of hits for the player. (This requires using the quadratic formula, and you thought you'd never need it.)

The player's outs remain the same, so the new ABs is the new Hits plus the old outs.

We then compute new 2B, 3B, HR, BB, SB, and HBP.

We assume Runs and RBI are proportional to the change in RC, and since outs don't change, neither will SO or SF.
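The steps above can be sketched in a few lines. This is a minimal illustration, not the site's actual code: it assumes the basic runs created formula RC = (H + BB) * TB / (AB + BB), treats outs as AB minus H, and ignores HBP and SB for brevity. The function names and the sample season line are hypothetical.

```python
import math


def basic_rc(h, bb, tb, ab):
    """Basic runs created (ASSUMED form): (H + BB) * TB / (AB + BB)."""
    return (h + bb) * tb / (ab + bb)


def solve_new_hits(new_rc, outs, tb_per_hit, bb_per_hit):
    """Find the hit total that yields new_rc with outs and per-hit rates fixed.

    With TB = k*H, BB = b*H and AB = H + outs, the RC formula becomes
        (1 + b)*k*H^2 - new_rc*(1 + b)*H - new_rc*outs = 0,
    a quadratic in H whose positive root is returned.
    """
    a = (1.0 + bb_per_hit) * tb_per_hit
    b = -new_rc * (1.0 + bb_per_hit)
    c = -new_rc * outs
    return (-b + math.sqrt(b * b - 4.0 * a * c)) / (2.0 * a)


# Hypothetical season line, for illustration only
h, bb, tb, ab = 180, 70, 300, 600
outs = ab - h                     # outs are held constant
rc = basic_rc(h, bb, tb, ab)
new_rc = rc * 0.9                 # pretend the new context cuts RC by 10%

new_h = solve_new_hits(new_rc, outs, tb / h, bb / h)
scale = new_h / h                 # every per-hit component scales with hits
new_bb, new_tb = bb * scale, tb * scale
new_ab = new_h + outs             # new AB = new hits + old outs
print(round(new_h, 1), round(new_ab, 1))
```

Feeding the new totals back into basic_rc reproduces new_rc, which is a convenient check that the quadratic was set up correctly.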

And that is pretty much it for hitters.

How are pitchers adjusted?

The idea is pretty much the same for pitchers, but the details are quite a bit more complicated. Here is a description by Justin Kubatko:

We start by making the assumption that a pitcher's innings pitched are a constant. This is imperfect as we can't necessarily assume that the pitcher will be used for more batters in a higher run environment. This means that a pitcher going from a high run scoring setting to a low run scoring setting will still pitch the same number of outs. Since batters facing pitcher data is spotty for some seasons, we estimate it using:
BFP = 3*IP + H + BB + HBP = 3*IP + TOB (times on base)

Since total bases allowed is not available, we estimate it using:
TB = (1.252*(H - HR) + 4*HR) (or a league specific factor)

Recall that Bill James's formula for basic runs created is:
RC = TB * (TOB) / (BFP)
(a) RC = TB * TOB / (3*IP + TOB)
We can set H = TOB * H_player/TOB_player = t * TOB and HR = H * HR_player/H_player = h*H = h*t*TOB.

We have to adjust the runs created allowed for league and park. First, for the park:
park = newPF / PF

Above, newPF is the park factor of the park we want to place the player in, and PF is the pitcher's park factor for that particular season. We'll assume that we're going to place him in a neutral park, so newPF is equal to 100:
park = 100 / PF

Now the league adjustment:
league = newLgR / LgR

Here, newLgR is team runs per 162 games for the league we want to place the player in, and LgR is team runs per 162 games for that particular season. We'll assume that we want to place the pitcher in an environment of 750 team runs per 162 games:
league = 750 / LgR

Now we make both adjustments to the player's runs created allowed:
newRC = RC * park * league

For example, here's Pedro Martinez in 2000:
BFP = 651 + 128 + 32 + 14 = 825
TB = 0.89*(1.255*(128 - 17) + 4*17) = 184.5
RC = 184.5*(128 + 32 + 14) / 825 = 38.91
park = 100 / 101 = 0.99
league = 750 / 857.9 = 0.874
newRC = 38.91*0.99*0.874 = 33.67
From here we use the RC formula (a) above and the substitutions for TB to get a quadratic equation in terms of TOB. We solve that for TOB and then estimate the new H, BB, HR and HBP from it, in the same proportions as before.
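The whole pitcher calculation, including the quadratic step, can be sketched as follows. This is an illustration under stated assumptions rather than the actual implementation: the function name and return keys are hypothetical, and the 1.255 and 0.89 factors are simply the values used in the Pedro Martinez example above. Because H and HR are fixed proportions of TOB, total bases are a fixed multiple m of TOB, and formula (a) reduces to m*TOB^2 - newRC*TOB - newRC*3*IP = 0.

```python
import math


def neutralize_pitcher(ip, h, bb, hbp, hr, pf, lg_runs,
                       new_pf=100.0, new_lg_runs=750.0,
                       single_weight=1.255, tb_scale=0.89):
    """Sketch of the pitcher conversion; the factor defaults come from the worked example."""
    outs = 3.0 * ip                        # innings (outs) are held constant
    tob = h + bb + hbp                     # times on base
    tb = tb_scale * (single_weight * (h - hr) + 4.0 * hr)  # estimated total bases
    rc = tb * tob / (outs + tob)           # formula (a)
    new_rc = rc * (new_pf / pf) * (new_lg_runs / lg_runs)

    m = tb / tob                           # total bases per time on base, held fixed
    # Positive root of m*TOB^2 - new_rc*TOB - new_rc*outs = 0
    new_tob = (new_rc + math.sqrt(new_rc ** 2 + 4.0 * m * new_rc * outs)) / (2.0 * m)

    scale = new_tob / tob                  # components keep their old proportions
    return {"RC": rc, "newRC": new_rc, "newTOB": new_tob,
            "H": h * scale, "BB": bb * scale, "HR": hr * scale, "HBP": hbp * scale}


# Pedro Martinez, 2000: 217 IP, 128 H, 32 BB, 14 HBP, 17 HR, PF 101, LgR 857.9
pedro = neutralize_pitcher(217, 128, 32, 14, 17, pf=101, lg_runs=857.9)
print({k: round(v, 2) for k, v in pedro.items()})
```

The newRC value here differs slightly from the 33.67 in the example because the example rounds the park and league factors before multiplying.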

What are the most extreme settings?

Best for pitchers

| year | lg | Team                | lgRPG | BPF  | adjRPG |
+------+----+---------------------+-------+------+--------+
| 1907 | NL | Brooklyn Superbas   | 3.400 |   92 |   3.13 |
| 1908 | AL | Washington Senators | 3.445 |   91 |   3.14 |
| 1908 | NL | St. Louis Cardinals | 3.325 |   95 |   3.16 |
| 1968 | NL | Los Angeles Dodgers | 3.430 |   92 |   3.16 |
| 1908 | NL | Brooklyn Superbas   | 3.325 |   96 |   3.19 |
| 1909 | AL | St. Louis Browns    | 3.441 |   93 |   3.20 |
| 1968 | AL | Oakland Athletics   | 3.406 |   94 |   3.20 |
| 1916 | NL | Boston Braves       | 3.449 |   94 |   3.24 |
| 1916 | NL | New York Giants     | 3.449 |   94 |   3.24 |
| 1968 | AL | California Angels   | 3.406 |   95 |   3.24 |
| 1968 | AL | New York Yankees    | 3.406 |   95 |   3.24 |
| 1907 | NL | St. Louis Cardinals | 3.400 |   96 |   3.26 |
| 1908 | NL | Boston Doves        | 3.325 |   98 |   3.26 |
| 1908 | NL | Cincinnati Reds     | 3.325 |   98 |   3.26 |
| 1972 | AL | California Angels   | 3.467 |   94 |   3.26 |
| 1968 | AL | Washington Senators | 3.406 |   96 |   3.27 |
| 1906 | NL | Brooklyn Superbas   | 3.567 |   92 |   3.28 |
| 1972 | AL | Oakland Athletics   | 3.467 |   95 |   3.29 |
| 1909 | AL | Chicago White Sox   | 3.441 |   96 |   3.30 |
| 1909 | AL | Washington Senators | 3.441 |   96 |   3.30 |

Best for Hitters

| year | lg | Team                   | lgRPG | BPF  | adjRPG |
+------+----+------------------------+-------+------+--------+
| 1999 | NL | Colorado Rockies       | 5.004 |  126 |   6.31 |
| 2000 | NL | Colorado Rockies       | 5.004 |  125 |   6.26 |
| 1930 | NL | Philadelphia Phillies  | 5.684 |  107 |   6.08 |
| 1936 | AL | Boston Red Sox         | 5.671 |  106 |   6.01 |
| 1930 | NL | St. Louis Cardinals    | 5.684 |  105 |   5.97 |
| 1995 | NL | Colorado Rockies       | 4.632 |  128 |   5.93 |
| 1936 | AL | Chicago White Sox      | 5.671 |  103 |   5.84 |
| 1936 | AL | Cleveland Indians      | 5.671 |  103 |   5.84 |
| 1936 | AL | St. Louis Browns       | 5.671 |  102 |   5.78 |
| 1996 | NL | Colorado Rockies       | 4.684 |  123 |   5.76 |
| 1930 | NL | Chicago Cubs           | 5.684 |  101 |   5.74 |
| 1938 | AL | Detroit Tigers         | 5.368 |  107 |   5.74 |
| 2001 | NL | Colorado Rockies       | 4.701 |  122 |   5.74 |
| 1929 | NL | Philadelphia Phillies  | 5.364 |  106 |   5.69 |
| 1930 | NL | Brooklyn Robins        | 5.684 |  100 |   5.68 |
| 1930 | AL | Philadelphia Athletics | 5.414 |  105 |   5.68 |
| 1930 | NL | Pittsburgh Pirates     | 5.684 |  100 |   5.68 |
| 1936 | AL | Detroit Tigers         | 5.671 |  100 |   5.67 |
| 1996 | AL | Texas Rangers          | 5.388 |  105 |   5.66 |
| 1932 | AL | Cleveland Indians      | 5.233 |  108 |   5.65 |
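
The adjRPG column in both tables appears to be the league runs per game scaled by the team's park factor. The text does not state this, so treat the formula below as an inference from the listed values; the function name is hypothetical. It matches the rows checked here to within rounding.

```python
def adj_rpg(lg_rpg, bpf):
    """ASSUMED relationship: league runs/game scaled by the park factor (100 = neutral)."""
    return lg_rpg * bpf / 100.0


# (year, league, team, lgRPG, BPF, listed adjRPG), copied from the tables above
rows = [
    (1968, "NL", "Los Angeles Dodgers", 3.430, 92, 3.16),
    (1999, "NL", "Colorado Rockies", 5.004, 126, 6.31),
    (1930, "NL", "Philadelphia Phillies", 5.684, 107, 6.08),
]

for year, lg, team, lg_rpg, bpf, listed in rows:
    computed = adj_rpg(lg_rpg, bpf)
    # Allow a hundredth of a run for rounding in the published columns
    print(year, team, round(computed, 3), listed, abs(computed - listed) < 0.01)
```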

Aaron with the 2000 Rockies has 1004 career home runs, and with the 1968 Dodgers has 654 home runs.

Cobb has a .429 career average playing for the 2000 Rockies.

Pedro with the 1968 Dodgers has a career ERA of 1.85 and with the 2000 Rockies a career ERA of 3.84.

Credits

Much of the reasoning in these conversions comes from Bill James's New Historical Baseball Abstract, pp. 740-743.

Justin Kubatko designed the pitching methodology.