Neutralized and Converted Stats
First, what is not considered. There are no adjustments for:
- Changing role of bullpens
- Improved travel or fitness or nutrition
- Increased foreign talent
- Field conditions or sizes
- Changing tactics like the bunt or stolen base
- Anything else not explicitly listed below
What is adjusted?
We adjust all of a player's seasons from the park and league context in which they were played into either a "neutral" setting (a park factor of 100, a 162-game season, 90% of runs earned, and 715 runs per team) or a setting selected by the user: a particular year, league (with its runs per game and earned-run percentage), and home team (with its park factor). Originally, the neutral run level was set at 750, as Bill James had done in his New Historical Baseball Abstract, but that is well over the historical average of about 715 runs per 162 games since 1900, so we've adjusted it downward to reflect the lower average. With the old number, almost every offensive player improved; with the new number, about half get better and half get worse. We've still provided a link to 750 if you like that better.
There is no change of role for a player, so playing time is essentially the same proportion as before, though things like At Bats go up as run scoring increases. This also means that Pedro Martinez will only make 30 starts a season even if you put him in the 1914 AL.
How are batters adjusted?
For each player/season, we have their stats, the number of games their team played, their team's park factor, the league runs per game, and whether the DH was used that year. The stats include Runs Created; note that for neutralization the basic RC, without baserunning, GIDP, etc., is used as the player's baseline offensive level. This means that when you translate a player back into a context they actually played in, the RC numbers listed will differ from those listed under Special Batting for recent players. We handle trades in the proportional manner you would expect.
For the new setting we have the park factor, the number of games expected, and the run-scoring level. Games are adjusted proportionally and this creates a multiplier.
Runs created are adjusted using the ratio of park factors, the ratio of run scoring, and a ratio determined by whether we are coming from or going to a DH league.
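As a rough sketch of how these ratios might combine, assuming they are simply multiplied together (the text does not spell out the exact formula), consider the following. The function name, its parameters, and the default `dh_ratio` of 1.0 are hypothetical; the actual DH ratio is not given here.

```python
def adjust_batter_rc(rc, games, pf, lg_rpg,
                     new_games, new_pf, new_lg_rpg, dh_ratio=1.0):
    """Scale a batter's runs created into a new context.

    Assumption: the games, park, league, and DH ratios are multiplied
    together. dh_ratio is a placeholder; its real value is not stated
    in the text.
    """
    games_mult = new_games / games      # proportional games adjustment
    park = new_pf / pf                  # ratio of park factors
    league = new_lg_rpg / lg_rpg        # ratio of run scoring
    return rc * games_mult * park * league * dh_ratio


# Example: 100 RC in a 154-game season, moved to a 162-game season
# with the same park factor and run-scoring level.
print(round(adjust_batter_rc(100, 154, 100, 4.0, 162, 100, 4.0), 1))
```

With identical park and league settings, only the games multiplier changes the result.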
As part of this conversion, we assume that a player will get singles, doubles, triples, home runs, walks, HBP, and SB in the same proportion to overall hits as before, so given our adjusted RC, we can then find a new level of hits for the player. (This requires using the quadratic formula, and you thought you'd never need it.)
The player's outs remain the same, so the new AB total is the new hits plus the old outs.
We then compute new 2B, 3B, HR, BB, SB, and HBP.
We assume Runs and RBI are proportional to the change in RC, and since outs don't change, neither will SO or SF.
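The hit-finding step can be sketched as follows. This is a minimal illustration, assuming the basic formula RC = (H + BB) * TB / (AB + BB), outs defined as AB - H, and walks and total bases staying proportional to hits; the site's exact formula (for instance, how HBP enters) is not given here. The function name and the `extras` argument are hypothetical.

```python
import math


def convert_batting_line(ab, h, bb, tb, new_rc, extras=None):
    """Find a new batting line whose basic RC equals new_rc.

    Assumes basic RC = (H + BB) * TB / (AB + BB), with outs = AB - H held
    fixed and BB, TB, and any extras staying proportional to hits.
    """
    outs = ab - h
    b = bb / h   # walks per hit
    s = tb / h   # total bases per hit
    # With x = new hits: new_rc = (1 + b) * s * x^2 / ((1 + b) * x + outs),
    # which rearranges to a*x^2 + lin*x + const = 0.
    a = (1 + b) * s
    lin = -new_rc * (1 + b)
    const = -new_rc * outs
    new_h = (-lin + math.sqrt(lin * lin - 4 * a * const)) / (2 * a)
    scale = new_h / h
    line = {"H": new_h, "AB": new_h + outs,
            "BB": bb * scale, "TB": tb * scale}
    # Other counting stats (2B, 3B, HR, SB, HBP) scale with hits.
    for name, value in (extras or {}).items():
        line[name] = value * scale
    return line


# Example: raise a made-up hitter's RC by 10% and see the new line.
old_rc = (200 + 80) * 350 / (600 + 80)
print(convert_batting_line(600, 200, 80, 350, old_rc * 1.10, {"HR": 40}))
```

Only the positive root is kept, since a negative hit total has no meaning.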
And that is pretty much it for hitters.
How are pitchers adjusted?
The idea is much the same for pitchers, though the details are quite a bit more complicated. Here is a description by Justin Kubatko:
We start by making the assumption that a pitcher's innings pitched
are a constant. This is imperfect as we can't necessarily assume that
the pitcher will be used for more batters in a higher run
environment. This means that a pitcher going from a high run scoring
setting to a low run scoring setting will still pitch the same number
of outs. Since batters faced by pitcher (BFP) data
is spotty for some seasons, we estimate it using:
BFP = 3*IP + H + BB + HBP = 3*IP + TOB (times on base)
Since total bases allowed is not available, we estimate it using:
TB = (1.252*(H - HR) + 4*HR) (or a league specific factor)
Recall that Bill James's formula for basic runs created is:
RC = TB * TOB / BFP
Substituting the estimate for BFP gives:
(a) RC = TB * TOB / (3*IP + TOB)
We can set H = TOB * H_player/TOB_player = t * TOB and HR = H * HR_player/H_player = h*H = h*t*TOB, where t is the pitcher's actual hits per time on base and h is his actual home runs per hit.
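The estimates above can be sketched in a few lines. The function and parameter names are hypothetical, and treating the "league specific factor" as a plain multiplier on the total-bases estimate is an assumption drawn from the Pedro Martinez example in this description.

```python
def pitcher_rc_allowed(ip, h, hr, bb, hbp,
                       bases_per_non_hr_hit=1.252, league_factor=1.0):
    """Estimate BFP, TB, and basic runs created allowed for a pitcher.

    Assumption: league_factor multiplies the TB estimate; the text only
    says a league-specific factor may be used.
    """
    tob = h + bb + hbp                      # times on base
    bfp = 3 * ip + tob                      # estimated batters faced
    tb = league_factor * (bases_per_non_hr_hit * (h - hr) + 4 * hr)
    rc = tb * tob / bfp                     # formula (a)
    t = h / tob                             # hits per time on base
    hr_rate = hr / h                        # home runs per hit
    return {"BFP": bfp, "TB": tb, "RC": rc, "t": t, "h": hr_rate}


# Pedro Martinez, 2000 (217 IP), using the coefficients from the
# worked example in this description.
print(pitcher_rc_allowed(217, 128, 17, 32, 14,
                         bases_per_non_hr_hit=1.255, league_factor=0.89))
```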
We have to adjust the runs created allowed for league and park.
First, for the park:
park = newPF / PF
Above, newPF is the park factor of the park we want to place the
player in, and PF is the pitcher's park factor for that particular
season. We'll assume that we're going to place him in a neutral park, so
newPF is equal to 100:
park = 100 / PF
Now the league adjustment:
league = newLgR / LgR
Here, newLgR is team runs per 162 games for the league we want to
place the player in, and LgR is team runs per 162 games for that
particular season. We'll assume that we want to place the pitcher in
an environment of 750 team runs per 162 games:
league = 750 / LgR
Now we make both adjustments to the player's runs created allowed:
newRC = RC * park * league
For example, here's Pedro Martinez in 2000:
BFP = 651 + 128 + 32 + 14 = 825
TB = 0.89*(1.255*(128 - 17) + 4*17) = 184.5
RC = 184.5*(128 + 32 + 14) / 825 = 38.91
park = 100 / 101 = 0.99
league = 750 / 857.9 = 0.874
newRC = 38.91*0.99*0.874 = 33.67
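The park and league adjustments in this example can be reproduced with a short sketch. The function name and defaults are hypothetical; the defaults mirror the neutral park (100) and the 750-run environment used in this description.

```python
def adjust_rc_allowed(rc, pf, lg_r, new_pf=100.0, new_lg_r=750.0):
    """Apply the park and league ratios to runs created allowed."""
    park = new_pf / pf          # park = newPF / PF
    league = new_lg_r / lg_r    # league = newLgR / LgR
    return rc * park * league   # newRC = RC * park * league


# Pedro Martinez, 2000: RC = 38.91, PF = 101, LgR = 857.9
print(round(adjust_rc_allowed(38.91, 101, 857.9), 2))
```

Carrying the unrounded ratios through gives about 33.68 rather than 33.67, because the hand calculation rounds the park and league factors before multiplying.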
From here we use the RC formula (a) above and the substitutions for TB to get a quadratic equation in terms of TOB. We solve that for TOB and then estimate the new H, BB, HR, and HBP from it, keeping each in the same proportion to TOB as before.
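One way to carry out this last step is sketched below. Substituting H = t*TOB and HR = h*t*TOB into the total-bases estimate makes TB a constant multiple k of TOB, so formula (a) becomes k*TOB^2 - newRC*TOB - 3*IP*newRC = 0. The function name, the parameter names, and the treatment of the league-specific factor as a multiplier are assumptions, not the site's actual code.

```python
import math


def solve_new_line(ip, h, hr, bb, hbp, new_rc,
                   bases_per_non_hr_hit=1.252, league_factor=1.0):
    """Solve formula (a) for TOB and rescale H, HR, BB, and HBP."""
    tob = h + bb + hbp
    t = h / tob            # hits per time on base
    hr_rate = hr / h       # home runs per hit
    # TB = k * TOB once H and HR are written in terms of TOB.
    k = league_factor * t * (bases_per_non_hr_hit * (1 - hr_rate)
                             + 4 * hr_rate)
    # k*TOB^2 - new_rc*TOB - 3*ip*new_rc = 0; keep the positive root.
    disc = new_rc ** 2 + 12 * k * ip * new_rc
    new_tob = (new_rc + math.sqrt(disc)) / (2 * k)
    scale = new_tob / tob  # every component keeps its share of TOB
    return {"TOB": new_tob, "H": h * scale, "HR": hr * scale,
            "BB": bb * scale, "HBP": hbp * scale}


# Pedro Martinez, 2000, with newRC = 33.67 from the example above.
print(solve_new_line(217, 128, 17, 32, 14, 33.67,
                     bases_per_non_hr_hit=1.255, league_factor=0.89))
```

Feeding the pitcher's original RC back in returns his original line, which is a convenient sanity check on the algebra.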
What are the most extreme settings?
Best for pitchers

| year | lg | Team                | lgRPG | BPF  | adjRPG |
+------+----+---------------------+-------+------+--------+
| 1907 | NL | Brooklyn Superbas   | 3.399 |   92 |   3.13 |
| 1908 | NL | St. Louis Cardinals | 3.327 |   95 |   3.16 |
| 1968 | NL | Los Angeles Dodgers | 3.430 |   92 |   3.16 |
| 1908 | AL | Washington Senators | 3.444 |   92 |   3.17 |
| 1908 | NL | Brooklyn Superbas   | 3.327 |   96 |   3.19 |
| 1909 | AL | St. Louis Browns    | 3.439 |   93 |   3.20 |
| 1908 | NL | Cincinnati Reds     | 3.327 |   97 |   3.23 |
| 1968 | AL | Oakland Athletics   | 3.406 |   95 |   3.24 |
| 1908 | NL | Boston Doves        | 3.327 |   98 |   3.26 |
| 1972 | AL | California Angels   | 3.467 |   94 |   3.26 |
| 1909 | AL | Washington Senators | 3.439 |   95 |   3.27 |
| 1968 | AL | California Angels   | 3.406 |   96 |   3.27 |
| 1906 | NL | Brooklyn Superbas   | 3.567 |   92 |   3.28 |
| 1916 | NL | New York Giants     | 3.449 |   95 |   3.28 |
| 1972 | AL | Oakland Athletics   | 3.467 |   95 |   3.29 |
| 1909 | AL | Chicago White Sox   | 3.439 |   96 |   3.30 |
| 1968 | AL | Washington Senators | 3.406 |   97 |   3.30 |
| 1968 | AL | New York Yankees    | 3.406 |   97 |   3.30 |
| 1907 | NL | St. Louis Cardinals | 3.399 |   97 |   3.30 |
| 1917 | NL | Boston Braves       | 3.526 |   94 |   3.31 |

Best for hitters

| year | lg | Team                  | lgRPG | BPF  | adjRPG |
+------+----+-----------------------+-------+------+--------+
| 2000 | NL | Colorado Rockies      | 5.004 |  131 |   6.56 |
| 1999 | NL | Colorado Rockies      | 5.004 |  129 |   6.46 |
| 1930 | NL | Philadelphia Phillies | 5.684 |  107 |   6.08 |
| 1996 | NL | Colorado Rockies      | 4.684 |  129 |   6.04 |
| 1936 | AL | Boston Red Sox        | 5.671 |  106 |   6.01 |
| 1995 | NL | Colorado Rockies      | 4.632 |  128 |   5.93 |
| 1936 | AL | Chicago White Sox     | 5.671 |  104 |   5.90 |
| 1930 | NL | St. Louis Cardinals   | 5.684 |  103 |   5.85 |
| 1936 | AL | St. Louis Browns      | 5.671 |  103 |   5.84 |
| 1930 | NL | Chicago Cubs          | 5.684 |  101 |   5.74 |
| 1929 | NL | Philadelphia Phillies | 5.364 |  107 |   5.74 |
| 2001 | NL | Colorado Rockies      | 4.701 |  122 |   5.74 |
| 1936 | AL | Cleveland Indians     | 5.671 |  101 |   5.73 |
| 2000 | AL | Minnesota Twins       | 5.296 |  108 |   5.72 |
| 1996 | AL | Toronto Blue Jays     | 5.387 |  106 |   5.71 |
| 1938 | AL | Detroit Tigers        | 5.368 |  106 |   5.69 |
| 1930 | NL | Pittsburgh Pirates    | 5.684 |  100 |   5.68 |
| 1936 | AL | Detroit Tigers        | 5.671 |  100 |   5.67 |
| 1996 | AL | Milwaukee Brewers     | 5.387 |  105 |   5.66 |
| 1997 | NL | Colorado Rockies      | 4.603 |  123 |   5.66 |
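The adjRPG column in these tables appears consistent with scaling league runs per game by the park factor. That reading is inferred from the listed values rather than stated in the text, and the function name below is hypothetical.

```python
def adjusted_rpg(lg_rpg, bpf):
    """League runs per game scaled by a park factor (100 = neutral).

    Assumption: adjRPG = lgRPG * BPF / 100, inferred from the tables.
    """
    return lg_rpg * bpf / 100


# A few rows from the tables: (lgRPG, BPF, listed adjRPG)
rows = [(3.399, 92, 3.13), (3.430, 92, 3.16), (5.684, 107, 6.08)]
for lg_rpg, bpf, listed in rows:
    print(lg_rpg, bpf, listed, round(adjusted_rpg(lg_rpg, bpf), 3))
```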
Aaron with the 2000 Rockies has 1030 career home runs, and with the 1968 Dodgers has 653 home runs.
Cobb has a .436 career average playing for the Rockies.
Pedro with the 1968 Dodgers has a career ERA of 1.75 and with the 2000 Rockies a career ERA of 3.98.
Much of the reasoning in these conversions comes from Bill James's New Historical Baseball Abstract, pp. 740-743.
Justin Kubatko of Basketball-Reference.com designed the pitching methodology and did almost all of the programming.