Linear Regression

Take SOrate, BBrate, HBPrate, MPrate for all pitchers

 SOrate  BBrate  HBPrate  MPrate
  19.69    5.41     0.67    1.73
  13.46    7.65     0.39    1.25
  11.37    9.51     0.93    2.54
    ...     ...      ...     ...
  26.03   10.14     0.09    1.84
  14.05    7.36     0.78    1.16
   9.53    8.43     0.67    2.27

Find the (linear) equation that best matches the data

  • C1 * SOrate + C2 * BBrate + C3 * HBPrate + K = avg MPrate
  • Using linear regression, find the coefficients C1, C2, C3 and the constant K

  • Used all three-year spans with 300+ BF (e.g. 2000-02, 2001-03, etc. for K. Wood)
  • Separated knuckleballers (206) from non-knuckleballers (14,828)
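
A minimal sketch of the fit described above, assuming ordinary least squares via NumPy. The function name fit_mprate is hypothetical, and the input is only the six sample rows shown in the table, not the full data set, so the coefficients it produces are illustrative and are not the ones from the actual study.

```python
import numpy as np

# Illustrative only: the six sample rows visible in the table above.
# Columns: SOrate, BBrate, HBPrate, MPrate.
rows = np.array([
    [19.69,  5.41, 0.67, 1.73],
    [13.46,  7.65, 0.39, 1.25],
    [11.37,  9.51, 0.93, 2.54],
    [26.03, 10.14, 0.09, 1.84],
    [14.05,  7.36, 0.78, 1.16],
    [ 9.53,  8.43, 0.67, 2.27],
])

def fit_mprate(data):
    """Least-squares fit of C1*SOrate + C2*BBrate + C3*HBPrate + K = MPrate.

    Hypothetical helper: returns the array [C1, C2, C3, K].
    """
    # Design matrix: the three rate columns plus a column of ones for K.
    X = np.column_stack([data[:, :3], np.ones(len(data))])
    y = data[:, 3]
    coeffs, *_ = np.linalg.lstsq(X, y, rcond=None)
    return coeffs

c1, c2, c3, k = fit_mprate(rows)

# Expected MPrate for each sample row under the fitted equation.
predicted = rows[:, :3] @ np.array([c1, c2, c3]) + k
```

To mirror the split between knuckleballers and non-knuckleballers, the same function could be called once on each group's rows, giving each group its own C1, C2, C3 and K.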