If you had to pick one number over the history of baseball to convert runs into wins, it would be 10 runs per win. Across 140 seasons, every ten runs a player adds or subtracts has been worth about one win, so if we were doing this by hand, we would use 10 runs per win and be done with it. We are using computers, though, so we can compute more exact values for every player and season.
With update 2.1, we are now on our third method for converting runs to wins. Version 1.0 used a runs-per-win (R/W) estimate based on the league's runs per game. With Version 2.0, we introduced a variable runs-to-wins estimate that varied with the league's runs per game and the player's runs added or subtracted from average per game. This was an appropriate step, but the approximation we were using broke down for extreme players like 1985 Gooden, 2009 Greinke or 1999-2004 Bonds. I noticed this issue, but didn't quite follow through on realizing it was a problem until after posting the numbers. Version 2.1 fixes this issue, but it complicates the calculation considerably, so we've split the explanation onto a separate page.
The league context obviously plays a major role in how runs translate into wins. A typical Babe Ruth season is going to win you more games in 1912 than it would in 1933. Teams scored many fewer runs in 1912, so the value of each run is heightened.
The second effect is smaller, but still important. Each Babe Ruth run added has less value than a Wally Pipp run if both are placed into a league average context. Ruth, because he is so good, is going to produce some excess, unneeded runs. Since more of his runs are not needed to produce wins, his runs-per-win figure will be higher than Pipp's. Ruth still produces way more wins than Pipp, but not as many as if we used a constant runs-to-wins conversion.
In the same way, dominant pitchers lower the run environment, so their runs saved become more valuable. Bob Gibson's runs saved will translate into more wins than the runs saved by a league average pitcher like Ken Johnson.
The runs-to-wins estimators are related to the Pythagorean formula for baseball.
Team Pythagorean W-L% = (RS^2)/(RS^2 + RA^2)
This is using an exponent of two, but extensive research by a sabermetrician who goes by the nom de plume of Patriot has shown that using a variable exponent based on the run scoring environment of the league does better. So we get a better estimate called Team PythagenPat W-L% = (RS^x)/(RS^x + RA^x) where x = (runs/gm)^.285, so if the two teams average 4.50 runs per game each, the exponent is x = 9^.285 = 1.87, not far off from 2.0, but a little better. Additional info on PythagenPat.
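As a rough illustration of the PythagenPat formula above, here is a minimal Python sketch. The function names are hypothetical, chosen for this example only, and it assumes RS and RA are given as runs per game so that their sum is the runs/gm figure fed into the exponent.

```python
def pythagenpat_exponent(runs_per_game: float) -> float:
    """Exponent x = (runs/gm)^.285, where runs/gm counts both teams."""
    return runs_per_game ** 0.285


def pythagenpat_wpct(rs_per_game: float, ra_per_game: float) -> float:
    """Expected W-L% from runs scored and runs allowed per game."""
    x = pythagenpat_exponent(rs_per_game + ra_per_game)
    return rs_per_game ** x / (rs_per_game ** x + ra_per_game ** x)


# Two teams averaging 4.50 runs per game each: 9 runs/gm in total.
print(round(pythagenpat_exponent(9.0), 2))  # 1.87
print(pythagenpat_wpct(4.5, 4.5))           # an even matchup comes out at .500
```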
To show the differences, I'm going to walk you through our calculations for Roy Halladay for 2011. The 2011 NL allowed .15447 runs/out or 26.8*.15447 = 4.14 runs/tmGame. In version 2.0, Halladay was 55.7 runs above average, and average was 22 runs above replacement, for 77.7 runs total. (A few other minor adjustments have changed this for version 2.1.)
Version 1.0: Runs/Win = 2 * (lg Runs/Game)^.715. For 8.28 runs per game, we get 9.07 Runs/Win; 77.7/9.07 = 8.57 WAR.
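The Version 1.0 arithmetic can be sketched in a few lines of Python. The function name is hypothetical, and the inputs are the 2011 NL and Halladay figures quoted in the text; treat it as an illustration of the formula rather than the production calculation.

```python
def runs_per_win_v1(lg_runs_per_game: float) -> float:
    """Version 1.0 estimator: Runs/Win = 2 * (lg Runs/Game)^.715."""
    return 2 * lg_runs_per_game ** 0.715


# 2011 NL: .15447 runs/out, 26.8 outs per team game, two teams per game.
lg_runs_per_game = 2 * 26.8 * 0.15447        # about 8.28
rpw = runs_per_win_v1(lg_runs_per_game)      # about 9.07
war = 77.7 / rpw                             # about 8.57

print(round(lg_runs_per_game, 2), round(rpw, 2), round(war, 2))
```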
Version 2.0 (introduced 2012-05-03) added an adjustment to (lg Runs/Game) based on the number of runs better than average the pitcher was. This models the effect of the pitcher on the run scoring environment.
For batters, RunsPerGameForBothTeams = 53.6 * (Lg Runs Per Out + (Runs_bat_player + Runs_dp_player + Runs_br_player + Runs_position_player - Runs_defense_player)/(6 * Innings_Estimate_player)). This can be viewed as Lg Runs per Out + Player Runs added per Out - Player Runs subtracted per Out.
Innings Estimate = greatest of (2.1*PA, actual innings played if available, or 8*G if not). In all likelihood we'll overestimate the number of innings played for pre-1953 seasons, but that will only cause the runs-to-wins conversion to be more conservative for that player if he was a full-time pinch hitter or something like that.
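Here is a minimal Python sketch of the batter calculation, following the two formulas above term by term. The function and parameter names are hypothetical, and the sample batter at the bottom is invented purely to show the call.

```python
from typing import Optional

OUTS_PER_GAME_BOTH_TEAMS = 53.6  # 2 * 26.8 outs per team game


def innings_estimate(pa: int, games: int,
                     innings_played: Optional[float] = None) -> float:
    """Greatest of 2.1*PA and (actual innings if available, else 8*G)."""
    fallback = innings_played if innings_played is not None else 8 * games
    return max(2.1 * pa, fallback)


def batter_runs_per_game_both_teams(lg_runs_per_out: float,
                                    runs_bat: float, runs_dp: float,
                                    runs_br: float, runs_position: float,
                                    runs_defense: float,
                                    innings: float) -> float:
    """League runs per out adjusted by the player's runs per out."""
    player_runs = runs_bat + runs_dp + runs_br + runs_position - runs_defense
    # 6 outs per inning, counting both teams.
    return OUTS_PER_GAME_BOTH_TEAMS * (lg_runs_per_out + player_runs / (6 * innings))


# Invented batter: 650 PA in 155 games, no innings data, 40 batting runs.
inn = innings_estimate(pa=650, games=155)
rpg = batter_runs_per_game_both_teams(0.15447, 40, 0, 0, 0, 0, inn)
print(inn, round(rpg, 2))
```

The result would then go into Runs/Win = 2 * (RunsPerGameForBothTeams)^.715, the same way the league figure does in Version 1.0.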
For pitchers, we do something similar. One twist is that we compute outs recorded per game for the season (capped at 26.8) and then pad the remainder of the game with league average run prevention. This way we more accurately model how the player affects the runs-to-wins conversion. We still start with Runs/Win = 2 * (RunsPerGameForBothTeams)^.715.
RunsPerGameForBothTeams = (53.6 - 3*Inn_pitcher/G_pitcher) * Runs_per_out_lg + (3*Inn_pitcher/G_pitcher) * Runs_per_out_pitcher
Halladay was 59.7 runs better than average over 701 outs, so we take 59.7/701 = 0.085 runs/out better than avg. Then RunsPerGameForBothTeams = (53.6 - 701/32)*.15447 + (701/32)*(.15447 - .085) = 6.45 runs. Plugging that into the Runs/Win formula gives 7.58 runs per win specifically for Halladay, so his WAR is 77.7/7.58 = 10.25.
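The Version 2.0 pitcher calculation can be sketched as follows. The function names are hypothetical, and the 26.8 cap on outs per game follows the description above. Because the sketch works from the rounded inputs quoted in the text, it lands close to, but not exactly on, the 6.45 and 7.58 figures.

```python
def pitcher_runs_per_game_both_teams(outs: float, games: int,
                                     lg_runs_per_out: float,
                                     runs_above_avg: float) -> float:
    """Pitcher's outs at his own rate, rest of the game at the league rate."""
    outs_per_game = min(outs / games, 26.8)  # capped at one team game
    pitcher_runs_per_out = lg_runs_per_out - runs_above_avg / outs
    return ((53.6 - outs_per_game) * lg_runs_per_out
            + outs_per_game * pitcher_runs_per_out)


def runs_per_win(runs_per_game_both_teams: float) -> float:
    """Runs/Win = 2 * (RunsPerGameForBothTeams)^.715."""
    return 2 * runs_per_game_both_teams ** 0.715


# 2011 Halladay inputs as quoted: 701 outs, 32 games, 59.7 runs above average.
rpg = pitcher_runs_per_game_both_teams(701, 32, 0.15447, 59.7)
rpw = runs_per_win(rpg)
print(round(rpg, 2), round(rpw, 2), round(77.7 / rpw, 2))
```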
As you can see, just changing the estimator in this way has jumped him by roughly 1.7 WAR, and it has an even larger effect on crazy seasons like 1985 Gooden or 1913 Walter Johnson. This is almost certainly too aggressive. The issue is that the formula is really just an approximation of the underlying system, and approximations can break down in cases they didn't anticipate. This estimator would work for a case where two teams averaging 3.225 runs/game face each other, but when a 4.0 team faces a 2.45 team it gives us a value much too low, so we need a better way to convert from runs to wins.
In Version 2.1, the underlying system is that of runs added and subtracted changing the number of wins for a team. This is modeled very well by the PythagenPat formula cited earlier. To apply it here, we find the number of runs per game the player is better or worse than average. For Halladay, we have 59.7 runs in 32 games, or 1.866 runs/game. So our exponent is x = (53.6*.15447 - 1.866)^.285 = 1.698. If we then set RS = 4.14 and RA = 4.14 - 1.866 = 2.27 and plug into the PythagenPat formula, we get W-L% = .735, so an otherwise avg team facing an avg team should win about 73.5% of the time with 2011 Halladay starting for them. Then to get wins above average, WAA = (.735 - .500)*32 games = 7.52 WAA. We then do a similar calculation for a replacement player's runs allowed to get the difference between avg and replacement level and add that here to get WAR. For Halladay's number of innings pitched and run scoring environment this is 1.7 wins, so WAR = 1.7 + 7.52 = 9.22 WAR.
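The Version 2.1 steps for a pitcher can be sketched in Python as below. The function name and return shape are hypothetical. Since it carries unrounded intermediates, the output may differ from the quoted .735 and 7.52 in the last digit or two, and the 1.7-win replacement gap is simply taken from the text rather than computed.

```python
def pitcher_waa_v21(lg_runs_per_out: float, runs_above_avg: float, games: int):
    """Return (exponent, W-L%, WAA) for a pitcher's games."""
    rs = 26.8 * lg_runs_per_out          # average team's runs scored per game
    ra = rs - runs_above_avg / games     # runs allowed with this pitcher
    x = (rs + ra) ** 0.285               # PythagenPat exponent
    wpct = rs ** x / (rs ** x + ra ** x)
    waa = (wpct - 0.5) * games
    return x, wpct, waa


# 2011 Halladay: 59.7 runs above average in 32 games, .15447 lg runs/out.
x, wpct, waa = pitcher_waa_v21(0.15447, 59.7, 32)
war = waa + 1.7  # replacement gap quoted in the text, not derived here
print(round(x, 3), round(wpct, 3), round(waa, 2), round(war, 2))
```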
Halladay WAR: version 1.0, 8.57 WAR; version 2.0, 10.25 WAR; version 2.1, 9.22 WAR. As you can see, the choice of estimator makes a big difference, and with 2.1, I believe our estimate is very tight and technically correct.
A couple of nice things about this new method. The W-L% is a display of WAR as a rate stat. Note this is just for games the player plays in, so a top starter will always have a higher W-L% than a top position player, but position players are in a lot more games. If you want the team's W-L% over a 162-game season, add in enough .500 games to get the season total. One other nice thing is that it correctly handles the effect of the player's runs on the run environment and the differences between high and low run scoring environments. We've ditched the approximation for the real thing.
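Padding a player's W-L% out to a full season with .500 games is a simple weighted average. A minimal sketch, with a hypothetical function name and the Halladay figures from the walkthrough as inputs:

```python
def season_wpct(player_wpct: float, player_games: int,
                season_games: int = 162) -> float:
    """Blend the player's games with .500 ball for the rest of the season."""
    other_games = season_games - player_games
    wins = player_wpct * player_games + 0.5 * other_games
    return wins / season_games


# A .735 W-L% over 32 starts, with the other 130 games played at .500.
full_season = season_wpct(0.735, 32)
print(round(full_season, 3), round(full_season * 162, 1))
```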
For position players, we add offensive runs to the team's runs scored and subtract fielding runs from the runs allowed, so a position player will affect both sides of that interaction.
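A position player's W-L% might be sketched like this. The per-game scaling and the exponent built from RS + RA are assumptions carried over from the pitcher walkthrough, not something the text spells out for batters; the function name and the sample player are also invented for illustration.

```python
def position_player_wpct(lg_runs_per_team_game: float,
                         offensive_runs: float, fielding_runs: float,
                         games: int) -> float:
    """W-L% for an average team plus this player, in the games he plays."""
    rs = lg_runs_per_team_game + offensive_runs / games  # offense adds to RS
    ra = lg_runs_per_team_game - fielding_runs / games   # fielding cuts RA
    x = (rs + ra) ** 0.285                               # assumed PythagenPat exponent
    return rs ** x / (rs ** x + ra ** x)


# Invented player: +40 offensive runs and +10 fielding runs in 150 games,
# in a league averaging 4.14 runs per team game.
wpct = position_player_wpct(4.14, 40, 10, 150)
print(round(wpct, 3))
```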
See the detailed discussion on Inside the Book Blog for the genesis of this technique and many thanks to the folks who helped me out over there.