I’ve been enjoying The Neyer/James Guide to Pitchers in my spare moments over the past few days. I hope to post a review by next week, but for now I want to talk about one chapter that intrigued me. In the chapter “Lucky Bastards,” Bill James rates the luckiest and unluckiest pitcher seasons and careers of all time. When I first saw the chapter I got overly excited, because I thought it would use Defense Independent Pitching methods (DIPS or FIP) to analyze pitchers. As an economist, I find DIPS to be a fascinating tool for untangling the joint product of preventing runs. Unfortunately (although not all that unfortunately), James instead estimates luck as the gap between a pitcher’s actual Wins and Losses and what his ERA would predict. Certainly, it is still an interesting exercise, but it is not what I was looking for. To me, pitchers’ Wins and Losses are so irrelevant that they are almost not worth discussing. Certainly, Wins are correlated with good pitching, so they contain some good information. But they also contain bad information: the quality of the team’s offense, which is largely irrelevant to how a pitcher pitches. Since we have ERA, why bother looking at Wins? [Insert Joe Morgan jab here]
Rather than fret, I decided to develop my own list of best pitching seasons using DIPS as a theoretical motivator. The idea behind DIPS is to analyze how good pitchers are at preventing runs without relying on defense. We simply remove the balls put into play by hitters and judge pitchers on events that involve only the pitcher and batter. The original DIPS ERA, as developed by Voros McCracken, involves a complicated formula that is a bit cumbersome for what I want to do. It includes adjustments for handedness, knuckleballers, hit batters, etc. Tangotiger has developed FIP (Fielding Independent Pitching), which focuses on the three main fielding-independent statistics: walks, strikeouts, and HRs. However, Mr. Tiger calculates his number via linear weights, which I am not going to do here (and which will probably earn me a much-deserved scolding). So here is what I have done.
First, I gathered all pitcher seasons (using the Lahman database) from 1921-2003 for pitchers who started more than 10 games. From this, using linear regression I estimated the impact of walks, strikeouts, and HRs (all normalized per 9 innings pitched) on ERA. Here is the estimate.
pERA = 2.45 + 0.38*BB – 0.19*K + 1.59*HR
I will call this stat pERA rather than dERA or FIP, to avoid confusion with these already established numbers. I only went back to 1921, because that is the year in which the spitball was abolished and HRs became a dominant part of the game. I can’t imagine what an 1880s DIPS ERA would even mean. I could have done a separate estimate for each year, because the variables may have slightly different impacts on ERA in different seasons. I decided instead to treat every pitcher’s numbers the same way. I have run several estimates on different seasons and I find very little difference across seasons. Plus, I am interested in identifying good and bad DIPS seasons, and I don’t want to punish a pitcher for having a great strikeout year in a year when runs are scarce. Quibble if you want, but now it is on to step two.
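For anyone who wants to play along at home, here is a rough Python sketch of step one. It is an illustration, not the code I actually ran: the function names (`per_nine`, `p_era`, `fit_ols`) are made up for this example, the fitting routine is a bare-bones least-squares solver rather than a proper stats package, and the stat line at the bottom is a hypothetical pitcher, not anyone from the database. The only numbers taken from the post are the four coefficients in the pERA equation.

```python
def per_nine(count, innings):
    """Normalize a counting stat (BB, K, or HR) to a per-9-innings rate."""
    return 9.0 * count / innings


def p_era(bb9, k9, hr9, coefs=(2.45, 0.38, -0.19, 1.59)):
    """Apply the pERA equation from the post to per-9 rates.

    coefs = (intercept, BB weight, K weight, HR weight).
    """
    b0, b_bb, b_k, b_hr = coefs
    return b0 + b_bb * bb9 + b_k * k9 + b_hr * hr9


def fit_ols(rows, targets):
    """Least-squares fit of targets on rows, with an intercept.

    Solves the normal equations (X'X) b = X'y by Gauss-Jordan
    elimination. Fine for a toy example; a real analysis would
    use a statistics package.
    """
    X = [[1.0] + list(r) for r in rows]
    n = len(X[0])
    # Augmented matrix [X'X | X'y]
    A = [
        [sum(x[i] * x[j] for x in X) for j in range(n)]
        + [sum(x[i] * y for x, y in zip(X, targets))]
        for i in range(n)
    ]
    for col in range(n):
        # Partial pivoting: move the largest entry in this column up.
        pivot = max(range(col, n), key=lambda r: abs(A[r][col]))
        A[col], A[pivot] = A[pivot], A[col]
        p = A[col][col]
        A[col] = [v / p for v in A[col]]
        for r in range(n):
            if r != col:
                f = A[r][col]
                A[r] = [v - f * w for v, w in zip(A[r], A[col])]
    return [A[i][n] for i in range(n)]


# Hypothetical pitcher: 200 IP, 50 BB, 200 K, 20 HR.
example = p_era(per_nine(50, 200), per_nine(200, 200), per_nine(20, 200))
print(round(example, 3))
```

With real data, `rows` would hold one (BB/9, K/9, HR/9) triple per pitcher-season and `targets` the matching ERAs, and `fit_ols` would hand back the intercept and the three weights.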
Next, I ranked the pitcher-seasons by pERA. Here is the list of the top-25 pERA seasons in the modern era.
A few things jump right out and surprise me. First, this seems like a who’s who of very recent pitchers. Only two of the seasons on the list occurred before I was born, and I’m 30. Only five of these seasons occurred before 1990. Why? That is a puzzle I will leave alone for now, but I think it has to do with the rise of the strikeout. But there is more. Pedro Martinez has thrown six of the top-25 pERA seasons ever, and three of the top five. Greg Maddux comes in second with four seasons. Randy Johnson and Roger Clemens tie for third with three seasons each. Four men have pitched 16 of the top-25 pERA seasons ever.
For comparison, here is the list of the top-25 ERA seasons of all time for this sample of pitchers.
Next, I want to identify the luckiest and unluckiest ERA seasons in terms of pERA. At the heart of DIPS is the idea that batting average on balls in play is random. One of McCracken’s most shocking findings was that the previous season’s DIPS ERA is a better predictor of ERA than the previous season’s ERA, because the in-play average distorts a raw ERA. In addition, some pitchers may be helped out by good defense. I simply want to see what pitchers have been able to accomplish on their own relative to their ERA, which is helped or hurt by random chance and defense. To do this I calculate the ratio of pERA to actual ERA in a season, which I label pRatio. As the number falls below one, the pitcher has been unlucky with outside factors (his actual ERA is higher than his pERA), while as the number rises above one, the pitcher has been more lucky. Maybe I shouldn’t call this measure “luck,” but what it does tell us is how much the actual ERA is helped or hurt by non-DIPS factors. So here is the list of the top-25 “luckiest” pitchers in the sample.
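The pRatio step is simple enough to sketch in a few lines of Python. Again, this is only an illustration under my own assumptions: the `Season` record and the `luckiest` and `unluckiest` helpers are hypothetical names, and the three seasons at the bottom are invented to show the sorting, not rows from the actual sample.

```python
from dataclasses import dataclass


@dataclass
class Season:
    name: str
    year: int
    era: float    # actual ERA
    p_era: float  # predicted ERA from walks, strikeouts, and HRs


def p_ratio(season):
    """pRatio = pERA / ERA. Above 1 means lucky, below 1 means unlucky."""
    if season.era <= 0:
        raise ValueError("ERA must be positive to form a ratio")
    return season.p_era / season.era


def luckiest(seasons, n=25):
    """Seasons with the highest pRatio (ERA much better than pERA)."""
    return sorted(seasons, key=p_ratio, reverse=True)[:n]


def unluckiest(seasons, n=25):
    """Seasons with the lowest pRatio (ERA much worse than pERA)."""
    return sorted(seasons, key=p_ratio)[:n]


# Invented seasons, for illustration only.
sample = [
    Season("Pitcher A", 1950, era=2.00, p_era=3.00),  # pRatio 1.50
    Season("Pitcher B", 1975, era=4.00, p_era=3.00),  # pRatio 0.75
    Season("Pitcher C", 2000, era=3.00, p_era=3.00),  # pRatio 1.00
]
for s in luckiest(sample, n=3):
    print(s.name, s.year, round(p_ratio(s), 2))
```

A ratio is used here rather than a difference, so a half-run gap counts for more against a 2.00 ERA than against a 4.00 ERA.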
Five of these men appear on the top-25 ERAs of all-time in the sample (Munger, Rogers, Witt, Alexander, and Gibson), four of them with pERAs of greater than three. Here are the unlucky top-25.
I would call this the list of guys you could have stolen for your fantasy team after these years. I seem to recall my grandfather telling me how he jobbed everyone in his league by picking up Lefty Grove after the 1934 season, when everyone thought he was finished.
I have a few final words. I wanted to at least make the HRs park-neutral, but I do not have easy access to such a large number of HR park factors over this time period. Also, I did nothing to adjust ERA across years, such as calculating ERAs relative to the league average. I did this, well, because I wanted to look at raw ERA numbers. I am more interested in the comparison of the pERA to the raw number that we are oohing and aahing over. Anyway, these are the lists I wanted to see. I hope you enjoy them, and please feel free to send thoughts and suggestions. I am certainly not married to the list.