A B-R user recently asked about the source of our pre-1910 batter strikeout data, given that those stats were not officially tracked until 1910 in the NL and 1913 in the AL. I posed the question to Pete Palmer, stat legend and season-data provider to Baseball-Reference, and here was his reply:
"The strikeout data came from Jonathan Frankel, who did a tremendous amount of work with a number of helpers checking box scores in various newspapers. He identified about 90% of NL batters and 80% of AL batters from 1897-1909. The results were then prorated for the remainder of the season. Work is continuing on digging up more boxes and also on 1910-12 AL.
I was surprised that Jonathan was able to find so much data. What happened is that the local papers often carried the strikeouts for their games, so it required volunteers all over the country to check the papers, plus some inter-library loans. It was a terrific undertaking."
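As a rough illustration of what "prorated" could mean here, the sketch below scales the strikeouts found in box scores up to a full season in proportion to at-bats. This is only an assumption about the arithmetic, not necessarily the method Jonathan or Pete actually used; the function name `prorate_strikeouts`, its parameters, and the sample numbers are all hypothetical.

```python
def prorate_strikeouts(known_so: int, covered_ab: int, season_ab: int) -> int:
    """Scale strikeouts found in box scores up to a full-season estimate.

    Assumes (hypothetically) that the batter struck out at the same rate
    in the games without box-score data as in the games with it.
    """
    if covered_ab <= 0 or covered_ab > season_ab:
        raise ValueError("covered_ab must be positive and no larger than season_ab")
    # Strikeouts per covered at-bat, applied to the full season's at-bats.
    return round(known_so * season_ab / covered_ab)


# Made-up batter: 36 strikeouts found in box scores covering 450 of 500 at-bats.
estimate = prorate_strikeouts(36, 450, 500)
print(estimate)  # 40
```

With 90% of the at-bats covered, the estimate adds only about 10% to the counted total, so the more box scores turn up, the less the final number depends on the proration.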
It turns out that Jonathan has a blog where he posts updates about the progress of his batter strikeout research. He says the 1910 AL is 89% complete right now, and that he has begun work on the 1912 AL as well.
This entry was posted on Wednesday, April 13th, 2011 at 11:01 am and is filed under Administration, History, Mailbag, Stats.