Win Expectancy, Run Expectancy, Leverage Index and base-out Leverage Index charts were provided by Tom Tango of Inside the Book (and the Blue Jays and Mariners). He also provided feedback for the methods we used.
A description (written by Tom) can be found in this article. Typically these calculations involve a lot of math and simulation, but Tom does the heavy lifting for you.
All of these numbers assume average situations and average teams. We then compare what really happened to the average outcome, so we can determine whether a player was above average given the context of the situation. Zero is therefore average: values above zero are better than average and values below zero are worse.
Given a particular inning, score, and base-out situation (for example, bottom of 3rd, home team down two with runners on 1st and 3rd and one out), we can estimate the probability of an average home team (and therefore the road team) winning the game (40% in the case above). This is the win expectancy. We have this for every single possible situation in a game: for all 24 base-out situations, for the top and bottom of each inning (extras are just repeats of the ninth inning), and for the full range of score differentials (-11 to 11; any lead of more than 11 is treated as 11).
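As a minimal sketch of how a game situation might be mapped onto the table described above, the helper below caps the score differential at 11 in either direction and treats extra innings as the ninth. The function name, the key layout and the three-character base string are assumptions made for illustration, not the site's actual code.

```python
def we_table_key(inning, is_bottom, home_score_diff, bases, outs):
    """Build a lookup key for a win expectancy table (hypothetical layout).

    bases is a three-character string such as "101" (runners on 1st and 3rd).
    home_score_diff is the home score minus the road score.
    """
    if outs not in (0, 1, 2):
        raise ValueError("outs must be 0, 1 or 2")
    if len(bases) != 3 or any(c not in "01" for c in bases):
        raise ValueError("bases must look like '000' through '111'")
    capped_inning = min(inning, 9)                    # extras repeat the ninth
    capped_diff = max(-11, min(11, home_score_diff))  # leads beyond 11 count as 11
    half = "bottom" if is_bottom else "top"
    return (capped_inning, half, capped_diff, bases, outs)


# Bottom of the 3rd, home team down two, runners on 1st and 3rd, one out.
print(we_table_key(3, True, -2, "101", 1))
```

Capping the inputs first keeps the table itself small: any 12th-inning or 15-run-lead situation reuses an entry that already exists.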
Given the win expectancy at the start of the play, we can then look at the win expectancy at the end of the play (for example, assume a bases-clearing triple in the situation above, which leaves the bottom of the 3rd, score tied, runner on third and one out, WE = 61%) and determine the difference. This difference is the win probability added: it is credited to the batter and subtracted from the pitcher. In this case it is 21%. We sum these values across the entire game and season to produce the numbers you see on the site.
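The arithmetic for a single play can be sketched as follows. The function name and the batting_team_is_home flag are assumptions for illustration; the flag reflects that the win expectancies here are stated from the home team's point of view, so the sign flips when the road team is batting.

```python
def wpa_for_play(we_before, we_after, batting_team_is_home=True):
    """Return (batter_wpa, pitcher_wpa) from home-team win expectancies."""
    change = we_after - we_before
    if not batting_team_is_home:
        # A rise in the home team's chances hurts a road batter.
        change = -change
    # Whatever the batter gains, the pitcher loses.
    return change, -change


# The triple from the example: home WE goes from 40% to 61%.
batter, pitcher = wpa_for_play(0.40, 0.61)
print(round(batter, 2), round(pitcher, 2))  # 0.21 -0.21
```

Season totals are then just the sum of these per-play values for every play a player is credited with.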
Note that in this case all of the credit goes to the batter and all of the blame goes to the pitcher. These stats do not assign credit or blame to the fielders or the baserunners (except as listed below), even though they surely deserve some of the credit or blame.
Within a game, there are plays that are more pivotal than others. We attempt to quantify these plays with a stat called leverage index (LI). LI looks at the possible changes in win probability in a given situation. Situations where dramatic swings in win probability are possible (runner on second late in a tie game) have higher LIs than situations where there can be no large change in win probability (late innings of a 12-run blowout).
The stat is normalized so that on average the leverage is 1.00. In tense situations, the leverage is higher than 1.00 (up to about 10) and in low-tension situations the leverage is between 0 and 1.0.
Now take a step back from wins to runs. Just as we can estimate the probability of an average team winning a game in a given situation, we can estimate how many runs an average team is likely to score in a given base-out situation (say one out and runners on the corners, 1.186 runs). We then add the actual runs scored on a play to the estimate of runs still to be scored after it (2 + 0.942 in the case above). The change in run expectancy (2.942 - 1.186 = 1.756) is RE24 (run expectancy for the 24 base-out situations).
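A minimal sketch of the per-play run expectancy calculation, using the figures quoted above. The function name is hypothetical. With the rounded inputs shown in the text, the difference works out to roughly 1.756.

```python
def re24_for_play(re_before, runs_scored, re_after):
    """Runs added on one play relative to the average outcome.

    re_before:   run expectancy of the base-out state before the play
    runs_scored: runs that actually scored on the play
    re_after:    run expectancy of the state after the play
                 (0 if the play ended the inning)
    """
    return runs_scored + re_after - re_before


# Runners on the corners, one out (1.186); a triple scores two and
# leaves a runner on third with one out (0.942).
print(round(re24_for_play(1.186, 2, 0.942), 3))
```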
As with win probability, all of the credit goes to the batter and pitcher (except as stated below).
It may at first seem paradoxical, but even holding the score and the inning constant, the various base-out situations have different leverages.
The highest leverage (boLI = 2.667) situation comes with two outs and the bases loaded. This is a do or die situation with possible run values ranging from 0 (an out) to 4.104 (grand slam + expected runs from future batters in the inning).
The lowest leverage (boLI=.407) situation comes with 2 out and the bases empty. At most you can score one run (which isn't likely) and even if the batter reaches, they still don't have much chance of scoring later in the inning since there are two outs.
If you think about this for a little while, you'll begin to realize that the underlying run environment (think 1968 vs. 2000) can dramatically affect the win expectancies and run expectancies. A three run lead for the Pirates in 1968 where teams were scoring 3.42 runs per game looks a lot different than a three run lead for the Royals in 2000 when teams were scoring 5.14 runs per game. These changes affect the WE, the RE, the LI and the boLI to some degree.
We also need to consider the ballpark environment as Coors Field and Petco Park are nearly as different as 1968 and 2000.
Luckily, via simulation you can produce win and run expectancies for a variety of run scoring environments. In this case, Tom Tango provided us with expectancies for underlying environments of 3.0, 3.5, 4.0, ..., 5.5, 6.0 runs per 27 outs. Note that the examples above were for 4.5 runs/27 outs. We use per 27 outs rather than per game to account for extra innings, shortened games, walk-offs, and games in which the home team doesn't bat in the 9th.
For a situation where we want to use, say, 4.2 runs/27 outs, we interpolate from the values at 3.0, 3.5, etc. In very rare cases we need an environment outside the 3.0 to 6.0 range, and there we extrapolate instead.
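One way this lookup could work is sketched below. It assumes plain linear interpolation between the two nearest grid points and linear extrapolation from the end segments; the text does not say which interpolation scheme is actually used. The function name is hypothetical and the example values are placeholders, not real expectancies.

```python
from bisect import bisect_right

# Run environments (runs per 27 outs) for which tables were provided.
GRID = [3.0, 3.5, 4.0, 4.5, 5.0, 5.5, 6.0]


def interpolate_expectancy(values, runs_per_27):
    """Linearly interpolate (or extrapolate) one table entry.

    values[i] is the expectancy for the same game situation at GRID[i].
    """
    if len(values) != len(GRID):
        raise ValueError("need one value per grid point")
    # Index of the segment containing runs_per_27, clamped so that values
    # outside the grid reuse the first or last segment (extrapolation).
    i = bisect_right(GRID, runs_per_27) - 1
    i = max(0, min(i, len(GRID) - 2))
    x0, x1 = GRID[i], GRID[i + 1]
    y0, y1 = values[i], values[i + 1]
    return y0 + (runs_per_27 - x0) * (y1 - y0) / (x1 - x0)


# Placeholder numbers for a single situation, for illustration only.
placeholder = [0.30, 0.32, 0.34, 0.36, 0.38, 0.40, 0.42]
print(round(interpolate_expectancy(placeholder, 4.2), 3))  # between 4.0 and 4.5
print(round(interpolate_expectancy(placeholder, 2.5), 3))  # extrapolated below 3.0
```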
Now that we can get WE and RE for a particular run scoring environment, what environment should we use for a given park? There are a couple of options:
Baseball-Reference.com uses the last technique. I can see arguments for any of the above, but since we are comparing a player's performance to an "average team", I feel the relevant baseline is what that average team would score in the park: if the average team scores 4.48 runs/27 outs, in Petco it would score 3.90 runs/27 outs.
I know that most of these numbers are available at FanGraphs.com and that our numbers differ from theirs. I believe the difference lies in the differing run environments we use for each park.
Now that we have a run environment for every park, we can put a WE and LI (and WPA and RE24) on all 9 million plays in our database and add them up.
As noted above, we assign blame or credit to the pitchers and hitters alone, except for baserunning events. For cases where the batter makes no play (SB, CS, WP, PB, defensive indifference, balks, etc.), we give all of the credit to the baserunner. For combo plays in which a batter event is paired with a baserunning event (K + SB, BB + WP, etc.), we first pretend that only the batter's play occurred and assess the WE and RE, and then apply the baserunning play and credit the baserunner.
I haven't mentioned multiple baserunners. In the case of a baserunning play with two or more baserunners, we give all of the credit/blame to the lead baserunner whose state changes (he advances a base or is put out). There is a strong argument to be made for crediting the trailing runner some, but for me that is a level of complexity too far.
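The lead-runner rule can be sketched as a small selection function. The function name and the tuple layout for runner movements are assumptions made for illustration.

```python
def credited_runner(movements):
    """Pick the runner who gets the credit or blame for a baserunning play.

    movements is a list of (runner, start_base, end_base) tuples, where
    bases are 1-3, end_base 4 means the runner scored, and end_base None
    means the runner was put out.
    """
    # Keep only runners whose state changed (advanced or put out).
    changed = [m for m in movements if m[2] != m[1]]
    if not changed:
        return None
    # The lead runner is the one who started on the most advanced base.
    return max(changed, key=lambda m: m[1])[0]


# Double steal with runners on 1st and 2nd: the runner from 2nd gets it all.
print(credited_runner([("trail", 1, 2), ("lead", 2, 3)]))
# Runner on 3rd holds while the runner on 1st steals 2nd.
print(credited_runner([("on_third", 3, 3), ("on_first", 1, 2)]))
```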
plays all of the plays for which a pa
aLI average Leverage Index. The average leverage of all of the plays the player was part of. Can be for a game, season or career. Average leverage is 1.00.
WPA Win Probability Added. Sum of the differences in win expectancies for each play the player is credited with. Can be for a play, game, season, or career. This is denoted in wins and is of a similar scale to other wins-based statistics. It is highly dependent on the context in which a player played. Elite relievers (due to their high stress innings) may have as many WPA as starters which does not occur for stats like pitching linear weights. Note that it is relative to average, so a 0 WPA player is an average player.
WPA- negative Win Probability Added. Sum of the plays where the player's play decreased the team's chance of winning. Sum of plays where WPA < 0.
WPA+ positive Win Probability Added. Sum of the plays where the player's play increased the team's chance of winning. Sum of plays where WPA > 0.
WPA/LI Situational Wins. Sum of each play's WPA divided by the play's leverage index. SUM(WPA/LI) for all plays. This is similarly scaled to WPA, but removes the context from the outcome, so for this stat a batter with 30 home runs all in blowouts would look very similar to a batter with 30 home runs all in tie games. They would look much different in WPA. Generally used for a season or career.
Clutch (WPA / aLI) - WPA/LI, where WPA/LI is the stat just above. That is, the context-dependent WPA divided by the average leverage, minus the context-neutral situational wins. From my example in WPA/LI, the 30 HR hitter in tie games would have a clutch greater than zero and the 30 HR hitter in blowouts would have a clutch less than zero.
RE24 Runs Added by 24 base-out situations. Sum of the differences in run expectancies for each play the player is credited with. Can be for a play, game, season, or career. This is denoted in runs and is of a similar scale to other runs-based statistics like linear weights. It is somewhat highly dependent on the context in which a player played. A player with a lot of runners on base ahead of him has more of a chance to create RE24 than a batter who always comes up with the bases empty. It is relative to average, so unlike runs created an average player will have zero RE24.
REW Wins Above Average by 24 base-out situations. RE24 is taken and divided by an estimate of runs/win for this player's setting (discussed below). This is a wins estimator that is relative to average, so it can be compared to something like linear weights Batting or Pitching Wins, WPA or WPA/LI. Generally computed for a season or career.
boLI base-out leverage index. The average base-out leverage index for the 24 base-out situations the player batted or pitched in.
RE24/boLI Situational Runs above average for 24 base-out situations. Like WPA/LI, we SUM(RE24/boLI) for each of the plays the player was responsible for.
PHlev pinch hitting leverage index. The average leverage index of the plate appearances this player pinch hit. Followed by PH at bats.
LevHi, LevMd, LevLo Leverage of reliever entrances. Number of times the pitcher entered in a low, medium or high leverage situation. See our definition for each below.
PtchR,PtchW Pitching Runs and Wins. A Pete Palmer invention. This stat takes (LgERA * IP / 9 - ER allowed + URF) and modifies it by a park factor. URF is an estimate of how many unearned runs the pitcher was responsible for. It credits the pitcher with half of the unearned runs compared to that of a league average pitcher. Pitching Wins then converts Pitching Runs to wins using the Runs/Win factor mentioned above and demonstrated below.
BtRuns, BtWins Batting Runs and Wins. A Pete Palmer invention. This stat takes the batter's component factors (like BB, outs, 1B, 2B, etc.) and computes the number of runs contributed compared to average. It uses a variable outs factor to handle different run environments. Batting Wins then converts Batting Runs to wins using the Runs/Win factor mentioned above and demonstrated below.
wWPA - winner's Win Probability Added - The win probability added or subtracted (if negative) by this single play from the eventual winning team's win expectancy.
wWE - winner's Win Expectancy - The current probability (after the play) of the eventual winner winning at this point in the game. Note these are rounded, so a probability of 100% before the last play means it is close, but not quite 100%.
bWPA - Batting Team's Win Probability Added - The win probability added or subtracted (if negative) by this single play from the batting team's win expectancy.
bWE - Batting Team's Win Expectancy - The current probability (after the play) of the batting team winning at this point in the game. Note these are rounded, so a probability of 100% before the last play means it is close, but not quite 100%.
Leverage Splits - High Leverage is a value over 1.5 (about 20% of plays). Medium is 0.7 to 1.5 (about 40% of plays). Low is less than 0.7 (about 40% of plays).
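To make the glossary definitions of WPA, WPA+, WPA-, aLI, WPA/LI, Clutch and the leverage splits concrete, here is a minimal sketch that computes them from a list of plays. The function names and the (wpa, li) tuple layout are assumptions, and the sample plays are made up for illustration. The sketch assumes every LI is greater than zero.

```python
def summarize(plays):
    """plays: list of (wpa, li) tuples for one player, with li > 0."""
    wpa = sum(w for w, _ in plays)
    ali = sum(li for _, li in plays) / len(plays)
    wpa_li = sum(w / li for w, li in plays)  # SUM(WPA/LI) over all plays
    return {
        "WPA": wpa,
        "WPA+": sum(w for w, _ in plays if w > 0),
        "WPA-": sum(w for w, _ in plays if w < 0),
        "aLI": ali,
        "WPA/LI": wpa_li,
        "Clutch": wpa / ali - wpa_li,
    }


def leverage_bucket(li):
    """High is over 1.5, medium is 0.7 to 1.5, low is under 0.7."""
    if li > 1.5:
        return "high"
    if li >= 0.7:
        return "medium"
    return "low"


# Made-up plays: (WPA, LI)
plays = [(0.10, 2.0), (-0.02, 0.5), (0.03, 1.0)]
stats = summarize(plays)
print({name: round(value, 4) for name, value in stats.items()})
print([leverage_bucket(li) for _, li in plays])
```

In this made-up sample the big positive play came at high leverage, so Clutch comes out above zero.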
A couple of notes on the calculation of WPA/LI. Given the data we have, the LIs can approach zero in a number of cases. For instance, rounding to the nearest hundredth gives an LI of zero for 1.4% of the plays in 2008. This obviously won't work for computing WPA/LI, as we would be dividing by zero, so we create an implied leverage.
After discussing this with Tom Tango, I decided to compute the average WPA/LI for each of the event types in the DB with leverage ≤ 0.20. We then update the leverage for the low-leverage plays in the DB so that the WPA/LI for each of these plays matches the low-leverage WPA/LI. It's a bit of a hack, but the effect on aLI is negligible and it makes WPA/LI a bit more meaningful: without it, the play and game totals (and to a lesser extent the season totals) would suffer from loss of precision, since you can't reliably divide two small floating point numbers.
One note, for games like the Rays in Orlando or the Astros in Milwaukee, we still use the standard Tampa or Houston run environment. We could probably use the Milwaukee number for Houston, but I'm just not sure it is worth it and in some cases like the Yankees in Shea, there are issues with applying the park factor to the team from the other league, so we just ignore those handful of cases.
The data I was provided for boLI was just for a single run environment of 5.0 runs/27 outs. To get a pair of values that I could then use to find boLI for all run environments, I picked a set of high run scoring years and a set of low run scoring years.
I looked at three periods:
Provided by Tango: 1999-2002 with rpg=5
Low: 1968, 1972, 1967, 1971, 1963 with rpg=average of 3.5 and 4.0
High: 1994, 1996, 1999, 2000 with rpg=5
Then using the leverage tables provided by Tom Tango, I aggregated the data for those seasons to the 24 base-out situations to get the average leverage for each of those 24 base-out situations and got the following table.
                            5.00 r/27   3.75 r/27   5.00 r/27
+------------------+------+------------+----------+-----------+
| runners_on_bases | outs | boLI_tango | boli_low | boli_high |
+------------------+------+------------+----------+-----------+
| 000              |    0 |       0.90 |     0.92 |      0.90 |
| 000              |    1 |       0.65 |     0.63 |      0.65 |
| 000              |    2 |       0.41 |     0.40 |      0.41 |
| 100              |    0 |       1.43 |     1.60 |      1.43 |
| 100              |    1 |       1.16 |     1.20 |      1.16 |
| 100              |    2 |       0.78 |     0.79 |      0.78 |
| 010              |    0 |       1.14 |     1.30 |      1.14 |
| 010              |    1 |       1.19 |     1.33 |      1.20 |
| 010              |    2 |       1.07 |     1.17 |      1.08 |
| 001              |    0 |       0.97 |     1.09 |      0.94 |
| 001              |    1 |       1.29 |     1.53 |      1.28 |
| 001              |    2 |       1.26 |     1.36 |      1.25 |
| 110              |    0 |       1.74 |     2.06 |      1.74 |
| 110              |    1 |       1.80 |     2.02 |      1.81 |
| 110              |    2 |       1.53 |     1.63 |      1.54 |
| 101              |    0 |       1.45 |     1.70 |      1.42 |
| 101              |    1 |       1.71 |     2.10 |      1.70 |
| 101              |    2 |       1.64 |     1.72 |      1.64 |
| 011              |    0 |       1.28 |     1.52 |      1.29 |
| 011              |    1 |       1.44 |     1.66 |      1.43 |
| 011              |    2 |       1.75 |     1.85 |      1.75 |
| 111              |    0 |       1.82 |     2.31 |      1.85 |
| 111              |    1 |       2.31 |     2.75 |      2.30 |
| 111              |    2 |       2.60 |     2.84 |      2.58 |
+------------------+------+------------+----------+-----------+
I then used the boLI_low and boLI_high values to interpolate/extrapolate the boLI for all of the other run environments.
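As a rough sketch of that step, the snippet below runs a straight line through the low and high values for a base-out state and reads off the boLI at another run environment. Linear interpolation is an assumption here, as is anchoring the two points at 3.75 and 5.00 runs/27 outs (taken from the column labels of the table). Only three rows of the table are copied in, and the function name is hypothetical.

```python
LOW_ENV, HIGH_ENV = 3.75, 5.00  # runs per 27 outs for the two columns

# (boli_low, boli_high) for a few base-out states, copied from the table.
BOLI = {
    ("000", 2): (0.40, 0.41),
    ("100", 0): (1.60, 1.43),
    ("111", 2): (2.84, 2.58),
}


def boli_for_environment(bases, outs, runs_per_27):
    """Linearly interpolate or extrapolate boLI for one base-out state."""
    low, high = BOLI[(bases, outs)]
    slope = (high - low) / (HIGH_ENV - LOW_ENV)
    return low + slope * (runs_per_27 - LOW_ENV)


# Bases loaded, two outs, in a 4.5 runs/27 outs environment.
print(round(boli_for_environment("111", 2, 4.5), 3))
```

Because the same line is used on both sides of the two anchor points, environments below 3.75 or above 5.00 are handled by the same formula.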
As a general rule, every ten runs contributed or lost adds or subtracts one team win. This is a good rule of thumb to remember, but we can do a bit better. For the creation of Pitching Wins, Batting Wins, and REW (Wins Above Average by 24 base-out situations) we use a runs-to-wins estimate created by Pete Palmer. The general form is always:
Wins = Runs / (Runs_per_Win)
The variations depend on how you calculate Runs_per_Win. Pete's method starts with 10 and then multiplies by the square root of the league average runs scored per inning (top and bottom) plus/minus a player factor. In a league with more run scoring, the value of a single run is deflated (as it takes more runs to win a game).
So far we have:
Runs_per_Win = 10 * SQRT( 6 * LgRuns/ LgOuts)
One subtle point here is that each batter and pitcher slightly changes the context they are in. Barry Bonds produced so many runs that he in a sense inflated the number of runs needed for a win. Johan Santana has the opposite effect as he keeps run scoring lower. We estimate this effect (in terms of runs per inning) as the change in runs divided by games played divided by 9 (innings).
Runs_per_Win = 10 * SQRT( 6 * LgRuns/ LgOuts + (PlayerRuns/PlayerGames)/9 )
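The two formulas above translate directly into code. This is a minimal sketch: the function names are hypothetical, and the league and player totals in the example are made-up round numbers chosen so the league term comes out to exactly one run per inning.

```python
import math


def runs_per_win(lg_runs, lg_outs, player_runs=0.0, player_games=0):
    """Runs_per_Win = 10 * SQRT(6 * LgRuns/LgOuts + (PlayerRuns/PlayerGames)/9)."""
    # Six outs per inning (top and bottom) turns runs per out into runs per inning.
    runs_per_inning = 6 * lg_runs / lg_outs
    if player_games:
        # Player adjustment: runs above or below average per game, per inning.
        runs_per_inning += (player_runs / player_games) / 9
    return 10 * math.sqrt(runs_per_inning)


def wins_from_runs(runs, rpw):
    """Wins = Runs / Runs_per_Win."""
    return runs / rpw


# Made-up league: 4500 runs over 27000 outs (4.5 runs per 27 outs).
league_only = runs_per_win(4500, 27000)
# Made-up player: 50 runs above average in 150 games.
with_player = runs_per_win(4500, 27000, player_runs=50, player_games=150)
print(round(league_only, 3), round(with_player, 3))
print(round(wins_from_runs(50, with_player), 2))
```

A batter well above average pushes the runs-per-win figure slightly above the league-only value, so each of his runs converts to slightly less than a tenth of a win.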
Keep in mind that the PlayerRuns we are using (RE24 or Batting/Pitching Runs) are compared to average, so some batters are positive, some are negative and the same for pitchers.
Kudos if you made it all of the way here.