Rob Neyer ran a recent column (ESPN.com - Rob Neyer - Sometimes they just get hurt) in which he made the following statement.
I mean, I had some evidence (speaking of pitcher injuries and pitches thrown). Craig Wright wrote brilliantly on the subject in his classic book, "The Diamond Appraised" (a book that Price has read), and there's certainly plenty of anecdotal "evidence" that throwing, say, 145 pitches in a game bodes ill for a pitcher's short-term performance, if not his long-term health. But the research has never been particularly comprehensive ... until now. If you're interested in this subject--or if you're interested in baseball at all -- then I refer you to the aforementioned "Baseball Prospectus," which is only the best baseball book you'll read this (or most any other) spring. And while I don't want to give everything away, BP's Keith Woolner and Rany Jazayerli make a convincing argument that high pitch counts tend to result in both short-term performance decline and long-term injury risk. This greatly simplifies Woolner's and Jazayerli's findings, and I expect to discuss them in greater depth at some future point. In the meantime, though, I urge you to buy the book, and I also urge the Mariners to stick with their program. Yes, pitchers are always going to get hurt, because throwing a baseball 90-plus miles per hour just isn't natural. But that doesn't mean you can't do anything about it.
A few weeks ago, on January 26th to be exact, Rob Neyer commented that peer review would be a good thing for sabermetrics to have. In his latest column, he talks about the new PAP3 measurement, proposed by Keith Woolner and Rany Jazayerli, as a groundbreaking metric for measuring pitcher use and abuse. Since I received my copy of Baseball Prospectus from Amazon.com, I've been working up some thoughts about PAP. And seeing how Rob Neyer has once again brought this topic to the fore, I'm making these opinions public, all in an effort to provide some of the "peer review" that Rob is asking for.
I have several concerns about the methods and studies Keith Woolner and Rany Jazayerli discuss in pages 505-516 of their 2001 edition.
Now, I'll take these points one by one.
Before I get started I just wanted to dispute a statement Rany made in the introduction, "Research dating back to Craig Wright's The Diamond Appraised has suggested a 100-pitch limit for developing pitchers." (p. 505)
This myth has gone on for long enough. Wright's suggested limits are far, far more lenient than anything Prospectus has mentioned in the past. Looking at Wright's actual words, here are "some recommendations" he makes on p. 211 of his book. I've added emphasis.
Wright's next five deal with conditioning and organizational policies like player promotion and an emphasis on throwing strikes.
Unless Rany is discussing 15-year-old starters, Wright comes nowhere close to an absolute 100-pitch limit. Back to PAP and PAP3.
PAP was introduced by Rany Jazayerli in the summer of 1998 on the Baseball Prospectus website. No study, at least in print or online, tested its validity at predicting injuries or whether it was better than simple pitch counts at measuring pitcher overuse. It was hailed by many people as a good metric, but I personally was disappointed that no study backing up the claims its authors and others had made for it appeared in the 1999 book, or again in the 2000 book.
In the 2001 book, Rany essentially retracted it as a method and admitted there was no evidence that PAP told us anything other than a pitcher's PAP. While I credit him for recanting his flawed metric, he should take some heat for creating a method out of thin air. The hoopla surrounding this method prevented valid work on pitch counts from going forward because everyone had to deal with the 900 lb. gorilla of PAP. While Don Malcolm's criticism of PAP may have been too strident for some tastes (you can't say Don doesn't call it like he sees it), he should feel somewhat vindicated that the method has since been removed from anyone's sabermetric toolkit.
In the latest Baseball Prospectus, Keith Woolner has undertaken two studies which generated a new method called PAP3. In the first study, he considered the performance of what he terms high endurance pitchers. These are generally the better-than-average pitchers. For his cutoff, more than 50% of their starts must be longer than the league-average outing. While his study covers the years 1988-1998, pitchers who would have been placed in this group last year include Livan Hernandez, Randy Johnson, Jon Lieber and Kris Benson. Pitchers not in this group include John Halama, Jeff Fassero, David Cone, Greg Maddux and Andy Ashby.
For these pitchers, Keith looked at the number of pitches thrown in a start and then compared the pitcher's performances for the 21 days following the start to the performance for the 21 days prior to the start. He calculated ratios for the post-start outings to the pre-start outings for Innings Pitched/Games Started (IP/GS) and Runs Allowed/Inning (RA) among others. Here are the results stated in the book in table form. A ratio of 1.00 indicates the value is unchanged, greater than one means it went up, and less than one it went down. I'm rounding to the nearest .005. I added a column with the total starts in that category from 1994-2000 (not just the high endurance pitchers).
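+-------------+-------+-------+-----------------------------------+
| Pitches     | RA    | IP/GS | Total starts (1994-2000)          |
+-------------+-------+-------+-----------------------------------+
| 90-99       | 1.020 | 1.000 | 6317                              |
| 100-109     | 1.010 | 1.000 | 6554                              |
| 110-119     | 1.025 | 1.000 | 4725                              |
| 120-129     | 1.015 | 1.005 | 2460                              |
| 130-139     | 1.035 | 0.990 | 635                               |
| 140-149     | 1.075 | 0.980 | 113 (R. Johnson 21, Clemens 7)    |
+-------------+-------+-------+-----------------------------------+
| Total       |       |       | 18614                             |
| Wghtd. Avg. | 1.020 | 1.000 |                                   |
+-------------+-------+-------+-----------------------------------+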
As Keith notes in his study, the average pitcher sees an increase of 2% in runs allowed after every start, so this becomes our baseline. Keith then proceeds to fit curves to the RA line delineated by the numbers above and finds that (Pitches - 100) ^ 3 is the best fit. While I agree that it fits best of the choices he gives in the book, I find it interesting that there is no decay in the pitcher's performance (it's nearly ramrod flat) until you get above 130 pitches. Given this, wouldn't the simplest (and easiest to defend) choice be to set the cutoff at 130 pitches rather than at 100? Do you see a difference between the effect of a 90-99 pitch start and a 120-129 pitch start? In fact, pitchers did slightly better following a 120-129 pitch start (1.015) than following a 90-99 pitch start (1.020). Additionally, setting the penalty threshold at 130 would actually be supported by Craig Wright's work.
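To make the comparison concrete, here is a minimal Python sketch that sets the cubic penalty next to the observed RA ratios from the table. The representative pitch counts (95, 105, and so on), the function name cubic_penalty, and the clamping to zero at or below the threshold are my own assumptions for illustration. They are not taken from Keith's study.

```python
# Observed post-start / pre-start RA ratios from the table above, keyed by a
# representative pitch count for each bucket. The choice of 95, 105, ... is
# an assumption made for illustration only.
observed_ra_ratio = {
    95: 1.020,
    105: 1.010,
    115: 1.025,
    125: 1.015,
    135: 1.035,
    145: 1.075,
}

BASELINE = 1.020  # the 2% rise reported after an average start


def cubic_penalty(pitches, threshold=100):
    """(pitches - threshold)^3, clamped to zero at or below the threshold.

    The clamp is an assumption; the text only gives the cubic form.
    """
    return max(0, pitches - threshold) ** 3


for pitches, ratio in observed_ra_ratio.items():
    excess = ratio - BASELINE  # change in RA ratio beyond the baseline
    print(
        f"{pitches:>3} pitches: excess RA ratio {excess:+.3f}, "
        f"penalty at 100 = {cubic_penalty(pitches):>6}, "
        f"penalty at 130 = {cubic_penalty(pitches, 130):>5}"
    )
```

With a threshold of 100, a 125-pitch start already carries a sizable penalty even though its observed RA ratio sits below the baseline. With a threshold of 130, only the two buckets that show any decay are penalized.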
While there clearly is some decay over 130 pitches, the next question is, Is any of this significant? If we subtract off the 2% baseline, we see that a 140-149 pitch start (something that happens 16 times a year, once a year for every other team) will increase the RA ratio by 5.5% over the next 21 days. This means a 4.28 RA (the average over the course of Keith's study) rises to 4.52. (4.52 - 4.28)/9 comes out to an increase of 0.026 runs per inning. This appears significant, but think about what this means over the course of 21 days. In a five-man rotation with no off days, this corresponds to 4 starts or roughly 24 innings (using league averages). Taking 0.026 times 24 innings, you have an increase of 0.63 runs for the entire 24 innings. And that is after a 140-149 pitch start. The effect is about 30% of that for a 130-139 start.
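The arithmetic above can be checked with a short Python sketch. The function name extra_runs and the constant names are hypothetical. The 24-inning window is the rough figure used above (4 starts at about 6 innings each), not a measured value.

```python
BASELINE_RATIO = 1.020   # RA ratio after an average start (the 2% baseline)
AVG_RA = 4.28            # average runs allowed per nine innings in the study
INNINGS_IN_WINDOW = 24   # assumed: 4 starts at roughly 6 innings each


def extra_runs(ra_ratio, baseline=BASELINE_RATIO, avg_ra=AVG_RA,
               innings=INNINGS_IN_WINDOW):
    """Extra runs allowed over the window, beyond the baseline rise."""
    excess = ra_ratio - baseline          # e.g. 1.075 - 1.020 = 0.055
    extra_per_inning = avg_ra * excess / 9
    return extra_per_inning * innings


after_140s = extra_runs(1.075)  # ratio following a 140-149 pitch start
after_130s = extra_runs(1.035)  # ratio following a 130-139 pitch start

print(f"140-149 pitches: {after_140s:.2f} extra runs over 24 innings")
print(f"130-139 pitches: {after_130s:.2f} extra runs over 24 innings")
print(f"130s effect as a share of 140s effect: {after_130s / after_140s:.0%}")
```

Working from the unrounded ratios gives essentially the same answer as the rounded figures in the text: well under one run across the four following starts.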
Now if you are Jimy Williams and your Red Sox are leading the Yanks 1-0 in the top of the 9th, do you bring in Derek Lowe, who has saved the two previous games, to face Jeter, O'Neill and Williams? Do you stay with Pedro Martinez, who is at 118 pitches and is working on a 14-strikeout shutout? Or do you bring in a rested Rich Garces? Looking at this evidence, there had better be some very serious extenuating circumstances if Pedro isn't on the mound in the ninth inning. One run in this situation is far more important than one potential run over his next four starts.
Rather than showing that pitcher abuse has immediate, dire consequences, this data shows that pitchers are much more resilient than previously thought. While this doesn't speak to the long-term consequences, the short-term consequences of a single high pitch count start are essentially non-existent.
This leads us to a question. If a single high pitch start raises the number of runs allowed over the next four starts (24 innings) by less than one run, how valid is a method that bases its entire form on this effect? An effect that comes into play in a mere 0.5% (1 in 200) of all starts.
In part two of his study, Keith Woolner attempts to show that high PAP3 totals lead to greater incidence of injury. And he correctly posits that if PAP3 isn't any better than plain vanilla pitch counts then it isn't a worthwhile metric. To summarize his study (hopefully correctly), he finds a list of starters injured from 1988-1998 (years corresponding to available pitch count data) from the Sports Encyclopedia and then finds uninjured pitchers who have thrown a comparable number of career pitches by the same age. He then compares the career PAP and career pitch count values for the healthy and injured pitchers.
Given his stated goal, this is a very sound way to study it. I have one large concern about how he approached it, though. The premise behind PAP is that isolated high pitch count outings are worse than low pitch count outings. For instance, according to PAP3, starts of 120-80-120-80 are much, much (mathematically they are infinitely) worse than starts of 100-100-100-100. I don't believe Keith's study determines whether that is in fact true. In using career pitch count totals to find comps, rather than career pitch count totals and games started (in effect, pitches per start), he is placing high-workload short careers in the same bin with low-workload long careers. For instance, Jason Bere had 7800 pitches by age 25. Keith compared him with pitchers who have had 7020 to 8580 pitches by age 25. Here is a list I came up with from 1994 to the present.
+-----------------+------+-----+-------+
| name            | NP   | GS  | avgNP |
+-----------------+------+-----+-------+
| Shawn Estes     | 7338 | 71  | 103   |
| Chan Ho Park    | 7527 | 74  | 102   |
| Justin Thompson | 7833 | 77  | 102   |
| Joey Hamilton   | 8003 | 79  | 101   |
| Scott Karl      | 8242 | 82  | 101   |
| Glendon Rusch   | 8073 | 81  | 100   |
| Jason Schmidt   | 8404 | 84  | 100   |
| Steve Trachsel  | 8016 | 82  | 98    |
| Jason Bere      | 7800 | 80  | 98    |
| Steve Woodard   | 7345 | 84  | 87    |
+-----------------+------+-----+-------+
Now the question I have is, Do these pitchers have a similar profile? At least in Steve Woodard's case I would say no. I think to do this study you really have to control for the average pitches thrown per start.
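Both points can be illustrated with a short Python sketch using the figures quoted above. The pap3 function assumes a per-start score of (pitches - 100) cubed that is zero at or below 100 pitches. The 10% comp window is inferred from the 7020 to 8580 range around Bere's 7800 pitches. All function and variable names are hypothetical.

```python
def pap3(pitches):
    """Per-start score: (pitches - 100)^3, zero at or below 100 (assumed)."""
    return max(0, pitches - 100) ** 3


def in_comp_window(pitches, target, pct=10):
    """True if career pitches fall within +/- pct percent of the target.

    Integer arithmetic avoids floating-point edge cases at the boundaries.
    """
    return target * (100 - pct) <= pitches * 100 <= target * (100 + pct)


# Two four-start sequences with the same total pitches (400 each).
spiky = [120, 80, 120, 80]
steady = [100, 100, 100, 100]
spiky_score = sum(pap3(p) for p in spiky)
steady_score = sum(pap3(p) for p in steady)
print(f"totals: {sum(spiky)} vs {sum(steady)} pitches")
print(f"scores: {spiky_score} vs {steady_score}")

# (name, career pitches by age 25, games started), from the table above.
comps = [
    ("Shawn Estes", 7338, 71),
    ("Chan Ho Park", 7527, 74),
    ("Justin Thompson", 7833, 77),
    ("Joey Hamilton", 8003, 79),
    ("Scott Karl", 8242, 82),
    ("Glendon Rusch", 8073, 81),
    ("Jason Schmidt", 8404, 84),
    ("Steve Trachsel", 8016, 82),
    ("Jason Bere", 7800, 80),
    ("Steve Woodard", 7345, 84),
]

BERE_PITCHES = 7800
for name, pitches, starts in comps:
    avg = round(pitches / starts)  # pitches per start
    matched = in_comp_window(pitches, BERE_PITCHES)
    print(f"{name:<16} {pitches:>5} {starts:>3} {avg:>4}  comp={matched}")
```

Every pitcher in the list passes the career-total window, yet the pitches-per-start column ranges from 87 to 103. A comp selection that also bounded pitches per start would separate those profiles.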
I do like the running window chart that Keith runs to show how workload has some correlation to injury, but I would have liked to see a similar chart for pitchers ordered by pitches per start and other variables as well.
I hope that Keith Woolner will make much of his data available to the general public. For instance, I would like to know why fewer than 30% of pitchers in the study have an above-average ratio of PAP3 to career pitches. I'm guessing that Randy Johnson is such an outlier that he alone is skewing the mean far higher than the median. The median would be a far more robust measurement to use on page 513. If the data were available (for instance, name, GS, Pitches, PAP3, Injury), readers could test a number of the conclusions on their own.
While I applaud their effort and I hope that they will continue to produce more research on this topic, I don't think that Rany and Keith have illuminated the subject of pitcher usage quite as much as Rob Neyer believes when he dramatically states,
there's certainly plenty of anecdotal "evidence" that throwing, say, 145 pitches in a game bodes ill for a pitcher's short-term performance, if not his long-term health. But the research has never been particularly comprehensive ... until now.
I've included a table of starts in a couple of formats from 1994-2000, which contain number of pitches, groundball-flyball, batters faced and other goodies in my data section.
I've done some small studies myself on pitcher usage. I'm continuing to work on this area and hope to have some more information in the future on our new webzine, Baseball Primer.
http://www.bigbadbaseball.com/statofday/sotd_20000611.html
http://www.bigbadbaseball.com/statofday/sotd_20000426.html
http://www.bigbadbaseball.com/statofday/sotd_19990808.html
http://www.bigbadbaseball.com/articles/forman_19990913.html
Keith Woolner's Baseball Prospectus Feedback area
March 3, 2001 - Sean Forman (email)
Just a test here, so I know that it works.
March 3, 2001 - Voros McCracken (email)
One of the things that needs to be understood about studies relating to player injuries is the immenseness of the relevant information that either cannot be controlled or that we lack the expertise to account for. This doesn't mean we can't study it or draw conclusions from such studies, but the results ought to be very clear and solid before drawing conclusions from them.
I really want to see a study that uses things like Disabled List time as a variable. I'd also like to see, as Sean commented, the pitchers better controlled statistically so that the only statistical differences between the pitchers are either their individual pitch counts or total pitches thrown (preferably both studies).