## DIPS: The Best, the Lucky, and the Not-So-Lucky

I’ve been enjoying The Neyer/James Guide to Pitchers in my spare moments over the past few days. I hope to post a review by next week, but for now I want to talk about one chapter that intrigued me. In “Lucky Bastards,” Bill James rates the luckiest and unluckiest pitcher seasons and careers of all time. When I first saw the chapter I got excited, because I thought it would use Defense Independent Pitching methods (DIPS or FIP) to analyze pitchers. As an economist, I find DIPS to be a fascinating tool for untangling the joint product of preventing runs. Unfortunately (though not all that unfortunately), James instead measures luck as the gap between a pitcher’s actual Wins and Losses and what his ERA would predict. It is still an interesting exercise, but not what I was looking for. To me, pitchers’ Wins and Losses are so irrelevant that they are almost not worth discussing. Certainly, Wins are correlated with good pitching, so they contain some good information. But they also contain bad information: the quality of the team’s offense, which has little to do with how a pitcher pitches. Since we have ERA, why bother looking at Wins? [Insert Joe Morgan jab here]

Rather than fret, I decided to develop my own list of best pitching seasons using DIPS as a theoretical motivator. The idea behind DIPS is to analyze how good pitchers are at preventing runs without relying on defense. We simply remove the balls put into play by hitters and judge pitchers on events that involve only the pitcher and batter. The original DIPS ERA, developed by Voros McCracken, involves a complicated formula that is a bit cumbersome for what I want to do. It includes adjustments for handedness, knuckleballers, hit batters, etc. Tangotiger has developed FIP (Fielding Independent Pitching), which focuses on the three main fielding-independent statistics: walks, strikeouts, and HRs. However, Mr. Tiger calculates his number via linear weights, which I am not going to do here (and which will probably earn me a much-deserved scolding). So here is what I have done.

First, I gathered all pitcher seasons (using the Lahman database) from 1921-2003 for pitchers who started more than 10 games. Then, using linear regression, I estimated the impact of walks, strikeouts, and HRs (all normalized per 9 innings pitched) on ERA. Here is the estimate.

pERA = 2.45 + 0.38 × BB - 0.19 × K + 1.59 × HR
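To make the formula concrete, here is a minimal sketch in Python. The function names and the sample season (200 IP, 50 BB, 200 K, 15 HR) are made up for illustration; only the coefficients come from the estimate above.

```python
def per9(count, innings_pitched):
    """Convert a season total into a rate per 9 innings pitched."""
    return 9.0 * count / innings_pitched


def pera(bb9, k9, hr9):
    """pERA from the estimate above; all inputs are per-9-inning rates."""
    return 2.45 + 0.38 * bb9 - 0.19 * k9 + 1.59 * hr9


# Hypothetical season (not a real pitcher): 200 IP, 50 BB, 200 K, 15 HR
ip = 200.0
value = pera(per9(50, ip), per9(200, ip), per9(15, ip))
print(round(value, 2))  # 2.67
```

The walks and home runs push the number up from the 2.45 baseline, and the strikeouts pull it back down.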

I will call this stat pERA rather than dERA or FIP, to avoid confusion with these already established numbers. I only went back to 1921, because that is the year in which the spitball was abolished and HRs became a dominant part of the game. I can’t imagine what an 1880s DIPS ERA would even mean. I could have done separate estimates for each year, because the variables may have slightly different impacts on ERA in different seasons. I decided instead to treat every pitcher’s numbers the same way. I have run several estimates on different seasons and I find very little difference across seasons. Plus, I am interested in identifying good and bad DIPS seasons, and I don’t want to punish a pitcher for having a great strikeout year in a year when runs are scarce. Quibble if you want, but now it is on to step two.
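For readers who want to try a pooled estimate like this themselves, here is a rough sketch using NumPy's least-squares solver. It is not the exact procedure I ran: the function name `fit_pera` is hypothetical, and the data below are synthetic, generated from the coefficients above plus made-up noise, purely so the example runs without the Lahman files.

```python
import numpy as np


def fit_pera(bb9, k9, hr9, era):
    """OLS of ERA on BB/9, K/9 and HR/9.

    Returns [intercept, b_bb, b_k, b_hr].
    """
    X = np.column_stack([np.ones(len(era)), bb9, k9, hr9])
    coefs, *_ = np.linalg.lstsq(X, np.asarray(era, dtype=float), rcond=None)
    return coefs


# Synthetic pitcher-seasons (assumption: NOT real data)
rng = np.random.default_rng(0)
n = 500
bb9 = rng.uniform(1.5, 5.0, n)
k9 = rng.uniform(3.0, 11.0, n)
hr9 = rng.uniform(0.3, 1.6, n)
era = 2.45 + 0.38 * bb9 - 0.19 * k9 + 1.59 * hr9 + rng.normal(0.0, 0.5, n)

coefs = fit_pera(bb9, k9, hr9, era)
print(np.round(coefs, 2))  # should land near 2.45, 0.38, -0.19, 1.59
```

Calling the same function on a single year's rows would give the per-season estimates mentioned above, which is one way to check how much the coefficients move from year to year.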

Next, I ranked the pitcher-seasons by pERA. Here is the list of the top-25 pERA seasons in the modern era.

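| Rank | Last | First | Team | Year | ERA | pERA |
| --- | --- | --- | --- | --- | --- | --- |
| 1 | Martinez | Pedro | BOS | 1999 | 2.07 | 1.11 |
| 2 | Martinez | Pedro | BOS | 2001 | 2.39 | 1.38 |
| 3 | Johnson | Randy | HOU | 1998 | 1.28 | 1.80 |
| 4 | Martinez | Pedro | BOS | 2000 | 1.74 | 1.81 |
| 5 | Brown | Kevin | SDN | 1998 | 2.38 | 1.82 |
| 6 | Gooden | Dwight | NYN | 1984 | 2.60 | 1.86 |
| 7 | Maddux | Greg | ATL | 1995 | 1.63 | 1.88 |
| 8 | Clemens | Roger | TOR | 1997 | 2.05 | 1.90 |
| 9 | Johnson | Randy | SEA | 1995 | 2.48 | 1.92 |
| 10 | Maddux | Greg | ATL | 1994 | 1.56 | 1.92 |
| 11 | Johnson | Randy | ARI | 2001 | 2.49 | 1.94 |
| 12 | Martinez | Pedro | BOS | 2003 | 2.22 | 1.94 |
| 13 | Maddux | Greg | ATL | 1997 | 2.20 | 1.98 |
| 14 | Martinez | Pedro | BOS | 2002 | 2.26 | 2.00 |
| 15 | Halladay | Roy | TOR | 2001 | 3.16 | 2.09 |
| 16 | Richard | J.R. | HOU | 1980 | 1.90 | 2.09 |
| 17 | Clemens | Roger | BOS | 1990 | 1.93 | 2.11 |
| 18 | Gibson | Bob | SLN | 1968 | 1.12 | 2.14 |
| 19 | Martinez | Pedro | MON | 1997 | 1.90 | 2.16 |
| 20 | Gullickson | Bill | MON | 1981 | 2.80 | 2.20 |
| 21 | Koufax | Sandy | LAN | 1963 | 1.88 | 2.21 |
| 22 | Brown | Kevin | FLO | 1996 | 1.89 | 2.24 |
| 23 | Clemens | Roger | BOS | 1988 | 2.93 | 2.27 |
| 24 | Maddux | Greg | ATL | 1996 | 2.72 | 2.27 |
| 25 | Prior | Mark | CHN | 2003 | 2.43 | 2.27 |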

A few things jump right out and surprise me. First, this reads like a who’s who of very recent pitchers. Only two of the seasons on the list occurred before I was born, and I’m 30. Only six of these seasons occurred before 1990. Why? That is a puzzle I will leave alone for now, but I think it has to do with the rise of the strikeout. But there is more. Pedro has thrown six of the top-25 pERA seasons ever, and three of the top five. Greg Maddux comes in second with four seasons. Randy Johnson and Roger Clemens tie for third with three seasons each. Four men have pitched 16 of the top-25 pERA seasons ever.

For comparison, here is the list of the top-25 ERA seasons of all time for this sample of pitchers.

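| Rank | Last | First | Team | Year | ERA | pERA |
| --- | --- | --- | --- | --- | --- | --- |
| 1 | Gibson | Bob | SLN | 1968 | 1.12 | 2.14 |
| 2 | Johnson | Randy | HOU | 1998 | 1.28 | 1.80 |
| 3 | Munger | Red | SLN | 1944 | 1.34 | 3.06 |
| 4 | Alexander | Doyle | DET | 1987 | 1.53 | 3.08 |
| 5 | Gooden | Dwight | NYN | 1985 | 1.53 | 2.30 |
| 6 | Rogers | Steve | MON | 1973 | 1.54 | 3.40 |
| 7 | Maddux | Greg | ATL | 1994 | 1.56 | 1.92 |
| 8 | Tiant | Luis | CLE | 1968 | 1.60 | 2.53 |
| 9 | Witt | George | PIT | 1958 | 1.61 | 3.30 |
| 10 | Maddux | Greg | ATL | 1995 | 1.63 | 1.88 |
| 11 | Chandler | Spud | NYA | 1943 | 1.64 | 2.54 |
| 12 | Chance | Dean | LAA | 1964 | 1.65 | 2.58 |
| 13 | Hubbell | Carl | NY1 | 1933 | 1.66 | 2.37 |
| 14 | Ryan | Nolan | HOU | 1981 | 1.69 | 2.57 |
| 15 | Koufax | Sandy | LAN | 1966 | 1.73 | 2.41 |
| 16 | Guidry | Ron | NYA | 1978 | 1.74 | 2.46 |
| 17 | Koufax | Sandy | LAN | 1964 | 1.74 | 2.37 |
| 18 | Martinez | Pedro | BOS | 2000 | 1.74 | 1.81 |
| 19 | Pollet | Howie | SLN | 1943 | 1.75 | 2.72 |
| 20 | Seaver | Tom | NYN | 1971 | 1.76 | 2.33 |
| 21 | Cooper | Mort | SLN | 1942 | 1.78 | 2.80 |
| 22 | Eldred | Cal | ML4 | 1992 | 1.79 | 2.73 |
| 23 | Newhouser | Hal | DET | 1945 | 1.81 | 2.71 |
| 24 | McDowell | Sam | CLE | 1968 | 1.81 | 2.72 |
| 25 | Blue | Vida | OAK | 1971 | 1.82 | 2.62 |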

Next, I want to identify the luckiest and unluckiest ERA seasons in terms of pERA. At the heart of DIPS is the idea that batting average on balls in play is random. One of McCracken’s most shocking findings was that the previous season’s DIPS ERA is a better predictor of ERA than the previous season’s ERA, because the in-play average distorts a raw ERA. In addition, some pitchers may be helped out by good defense. I simply want to see what pitchers have been able to accomplish on their own relative to their ERA, which is helped or hurt by random chance and defense. To do this I calculate the ratio of pERA to actual ERA in a season, which I label pRatio. As the number falls below 1, the pitcher has been unluckier with outside factors; as it rises above 1, the pitcher has been luckier. Maybe I shouldn’t call this measure “luck,” but what it does tell us is how much the actual ERA is helped or hurt by non-DIPS factors. So here is the list of the top-25 “luckiest” pitchers in the sample.

| Rank | Last | First | Team | Year | ERA | pERA | pRatio |
| --- | --- | --- | --- | --- | --- | --- | --- |
| 1 | Munger | Red | SLN | 1944 | 1.34 | 3.06 | 2.280 |
| 2 | Rogers | Steve | MON | 1973 | 1.54 | 3.40 | 2.211 |
| 3 | Witt | George | PIT | 1958 | 1.61 | 3.30 | 2.048 |
| 4 | Alexander | Doyle | DET | 1987 | 1.53 | 3.08 | 2.012 |
| 5 | Melton | Rube | BRO | 1946 | 1.99 | 3.90 | 1.958 |
| 6 | Gibson | Bob | SLN | 1968 | 1.12 | 2.14 | 1.911 |
| 7 | Hearn | Jim | NY1 | 1950 | 1.94 | 3.66 | 1.884 |
| 8 | Benton | Al | CLE | 1949 | 2.12 | 3.95 | 1.862 |
| 9 | Craig | Roger | LAN | 1959 | 2.06 | 3.81 | 1.851 |
| 10 | Holtzman | Ken | CHN | 1967 | 2.53 | 4.61 | 1.823 |
| 11 | Jay | Joey | ML1 | 1958 | 2.14 | 3.83 | 1.789 |
| 12 | Fitzsimmons | Freddie | BRO | 1941 | 2.07 | 3.64 | 1.760 |
| 13 | Dickerman | Leo | SLN | 1924 | 2.41 | 4.22 | 1.749 |
| 14 | Mahaffey | Art | PHI | 1960 | 2.31 | 4.04 | 1.747 |
| 15 | Candelaria | John | PIT | 1977 | 2.34 | 3.99 | 1.706 |
| 16 | Beggs | Joe | CIN | 1946 | 2.32 | 3.93 | 1.695 |
| 17 | Antonelli | Johnny | NY1 | 1954 | 2.30 | 3.89 | 1.692 |
| 18 | Dues | Hal | MON | 1978 | 2.36 | 3.99 | 1.691 |
| 19 | Benton | Al | DET | 1945 | 2.02 | 3.41 | 1.687 |
| 20 | Chandler | Spud | NYA | 1942 | 2.38 | 4.00 | 1.679 |
| 21 | Pierce | Billy | CHA | 1955 | 1.97 | 3.31 | 1.678 |
| 22 | Pollet | Howie | SLN | 1946 | 2.10 | 3.50 | 1.668 |
| 23 | Howard | Bruce | CHA | 1966 | 2.30 | 3.82 | 1.659 |
| 24 | Benes | Andy | SLN | 2002 | 2.78 | 4.58 | 1.647 |
| 25 | Bearden | Gene | CLE | 1948 | 2.43 | 3.98 | 1.639 |

Five of these men also appear on the list of top-25 ERA seasons in the sample (Munger, Rogers, Witt, Alexander, and Gibson), four of them with pERAs greater than three. Here are the unlucky top-25.

| Rank | Last | First | Team | Year | ERA | pERA | pRatio |
| --- | --- | --- | --- | --- | --- | --- | --- |
| 1 | Mendoza | Ramiro | NYA | 1996 | 6.79 | 3.33 | 0.491 |
| 2 | Grove | Lefty | BOS | 1934 | 6.50 | 3.42 | 0.526 |
| 3 | Martinez | Pedro | BOS | 1999 | 2.07 | 1.11 | 0.536 |
| 4 | Lieber | Jon | PIT | 1995 | 6.32 | 3.41 | 0.540 |
| 5 | Rodriguez | Frank | MIN | 1998 | 6.56 | 3.61 | 0.550 |
| 6 | Frey | Benny | CIN | 1935 | 6.85 | 3.79 | 0.553 |
| 7 | Babich | Johnny | BRO | 1935 | 6.66 | 3.72 | 0.559 |
| 8 | Zachary | Chris | SLN | 1971 | 5.32 | 2.99 | 0.562 |
| 9 | Bowie | Micah | CHN | 1999 | 9.96 | 5.63 | 0.565 |
| 10 | Rusch | Glendon | MIL | 2003 | 6.42 | 3.67 | 0.571 |
| 11 | Berenyi | Bruce | CIN | 1984 | 6.00 | 3.46 | 0.577 |
| 12 | Martinez | Pedro | BOS | 2001 | 2.39 | 1.38 | 0.577 |
| 13 | Elliott | Hal | PHI | 1930 | 7.67 | 4.44 | 0.579 |
| 14 | Leverett | Dixie | CHA | 1929 | 6.36 | 3.73 | 0.587 |
| 15 | Donohue | Pete | CIN | 1930 | 6.13 | 3.63 | 0.592 |
| 16 | Smith | Zane | BOS | 1995 | 5.61 | 3.33 | 0.593 |
| 17 | Kolp | Ray | CIN | 1924 | 5.68 | 3.40 | 0.599 |
| 18 | Irabu | Hideki | MON | 2000 | 7.24 | 4.35 | 0.601 |
| 19 | Halladay | Roy | TOR | 2000 | 10.64 | 6.41 | 0.602 |
| 20 | Gubicza | Mark | KCA | 1991 | 5.68 | 3.45 | 0.607 |
| 21 | Pascual | Camilo | WS1 | 1955 | 6.14 | 3.76 | 0.612 |
| 22 | Pruett | Hub | PHI | 1927 | 6.05 | 3.71 | 0.613 |
| 23 | Burkett | John | TEX | 1998 | 5.68 | 3.49 | 0.614 |
| 24 | Blankenship | Ted | CHA | 1924 | 5.01 | 3.08 | 0.615 |
| 25 | Holloway | Ken | DET | 1926 | 5.12 | 3.15 | 0.615 |

I would call this the list of guys you could have stolen for your fantasy team after these years. I seem to recall my grandfather telling me how he jobbed everyone in his league by picking up Lefty Grove after the 1934 season when everyone thought he was finished ;-).
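For anyone who wants to rebuild the lucky and unlucky lists, the pRatio step can be sketched in a few lines of Python. The function name `p_ratio` and the tuple layout are my own choices for illustration; the four sample rows are copied from the tables above. Because they use the rounded ERA and pERA values, the ratios can differ from the table in the third decimal.

```python
def p_ratio(pera, era):
    """Ratio of pERA to actual ERA: above 1 is 'lucky', below 1 is 'unlucky'."""
    return pera / era


# (last name, year, ERA, pERA), taken from the lists above
seasons = [
    ("Munger", 1944, 1.34, 3.06),
    ("Gibson", 1968, 1.12, 2.14),
    ("Mendoza", 1996, 6.79, 3.33),
    ("Grove", 1934, 6.50, 3.42),
]

# Luckiest first; reverse=False would put the unluckiest first instead
ranked = sorted(seasons, key=lambda s: p_ratio(s[3], s[2]), reverse=True)

for last, year, era, pera in ranked:
    print(f"{last} {year}: pRatio = {p_ratio(pera, era):.3f}")
```

Sorting the full sample both ways and keeping the first 25 rows of each gives the two lists.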

I have a few final words. I wanted to at least make the HRs park-neutral, but I do not have easy access to such a large number of HR park factors over this time period. Also, I did nothing to adjust ERA for the run environment of each year, such as calculating ERAs relative to the league average. I did this, well, because I wanted to look at raw ERA numbers. I am more interested in comparing pERA to the raw number that we are oohing and aahing over. Anyway, these are the lists I wanted to see. I hope you enjoy them, and please feel free to send thoughts and suggestions. I am certainly not married to the list.