At one time I was writing a lot about the Roger Clemens saga, but I don’t have the energy or time to write about it as I once did. But, thanks to Internet search engines, I’ve been getting several hits relating to Clemens. The problem with reading old posts is that it’s possible to interpret what I’ve said out of context. So, I thought I’d link to several old posts on the issue.
I’ll be attending SABR 40 for the next few days. If you would like to talk, please approach me and introduce yourself. This is my first SABR convention.
I am giving two research presentations, one oral and one poster.
Resting the pitcher: How useful are pitch counts and days of rest? (with Sean Forman): Thursday, August 5, 2:30 – 2:55pm, Georgia 7,8,9
Many individuals believe that limiting pitch counts and increasing days of rest can improve performance and reduce injuries. Though the belief that overuse can hamper pitchers is widespread, there exists little evidence that adjusting pitch counts and rest has much effect on pitcher performance. In this study, Bradbury and Forman use newly available game-level pitch count data from 1988 to 2009 to evaluate the impact of pitch counts and rest days on future performance. They discuss their employment of linear and non-linear multiple regression analysis techniques to estimate the impact of pitch counts — in recent games and cumulatively over a season — and days of rest on pitcher performances while controlling for the effects of other factors.
Putting a dollar sign on the muscle: What are baseball players worth?: I’ll be by the poster on Thursday from 4-6pm.
In the 1970s, using team revenue and player performance data, Gerald Scully employed the standard marginal revenue product framework frequently used by labor economists to estimate the financial contributions of players. Bradbury’s study employs new information about baseball’s economic structure and sabermetric performance metrics in an updated Scully framework to estimate the dollar value of current major league baseball players. He compares player salaries and estimated worth by service class, presents a method for projecting player worth over the duration of long-term contracts, identifies some of baseball’s best and worst deals, and ranks teams according to their abilities to manage their budgets.
For more see my forthcoming book Hot Stove Economics.
Buster Olney is reporting that Corey Hart and the Milwaukee Brewers have agreed to a three-year $26.5 million extension. The contract buys out two years of Hart’s free agency—a fact that is easy to find thanks to Baseball-Reference’s new service time data.
Hart is having the best year of his career and has been a solid player for most of his career. I estimate Hart to be worth approximately $33 million over the term of his deal. Given his year of arbitration coming off a good year, he would probably have been looking at a $7–9 million contract in 2011. With this deal, Hart gets some insurance and the Brewers get a small discount for two free-agent years near his peak.
In today’s Slate, Ray Fishman says yes. I disagree. What follows is a repeat of a post that I wrote two years ago on the study that Fishman believes supports the effect. (Thanks to Alex for the pointer.)
— — —
Last week, I became aware of a study by economists Eric Gould and Todd Kaplan that evaluates the impact of Jose Canseco on his teammates. They examine the belief that Canseco distributed his knowledge about steroids throughout baseball by introducing many of his teammates to performance-enhancing drugs. If this was the case, the authors hypothesize that he ought to have left a trail of improved performance among teammates in his wake.
The authors look at the careers of Canseco’s teammates to investigate this claim. Their method is to examine players to see how well they perform as a Canseco teammate and afterwards, relative to the years preceding involvement with Canseco. The idea is somewhat similar to what I did with my analysis of Leo Mazzone’s impact on pitchers (see Chapter 5 of my book).
After reading the study, I am not convinced by the authors’ conclusions. It’s not just one thing, but a collection of issues that form my opinion. I have problems with both the study’s design and the interpretation of the reported results. My disagreement does not mean that the effect does not exist, only that I do not see a pattern consistent with Canseco spreading steroids to his teammates.
First, I want to start with the sample. The authors look at players from 1970–2003. I find this an odd range of seasons to select. Canseco’s career spans from 1985–2001. Why start a decade and a half before Canseco enters the league and stop two years after he exits? The asymmetry bothers me largely because the run-scoring environment preceding Canseco was much lower than it was during the latter part of his career. But even without this, it is a strange choice to make. I can only guess that there is some teammate of Canseco’s whose career extends back this far, but I still don’t agree with the choice. And why not extend the sample until the present?
Next, the authors set the cutoffs for examining player performance at 50 at-bats for hitters and 10 games for pitchers. These minimums are far too low even when stats are normalized for playing time, but the impact is much worse when looking at absolute statistics like total home runs, which the authors do. For pitchers—who I will not examine here—it’s possible to get pitchers who pitched very few innings.
The authors also make a strange choice to break hitters into two classes: power and skilled players. The idea is that we might see different effects on the different styles of play. I don’t agree with this, but that is not the weird part. The way they differentiate power and skilled players is by position played, which is weird but moderately defensible. The power positions are first base, designated hitter, outfield, and catcher. The skilled positions are second base, shortstop, and third base. And it becomes clear that the authors are not all that familiar with baseball. Catcher is a “power” position? Third base is a skill position? I suspect that the catcher and shortstop positions produce the least offense of all the positions. Sure, you can point to a power-hitting catcher like Mike Piazza, but you can also point to a punchless first baseman like Doug Mientkiewicz; in general, catcher and first base are at opposite ends of defensive skill, with very different offensive expectations. Center field is also a defensive position that should not be lumped in with the corner positions. This highlights the problem of separating power potential by position. And it’s not so much the way that the sample is spliced (which I don’t like) but the fact that it is being spliced at all that makes me suspicious.
The choice of dependent variables is also a bit strange. While the authors are mainly looking for changes in power, they pick only a few metrics that measure power: HR, SLG, and HR/AB. The other statistics include AVG, RBI, K, BB, IBB, at-bats, fielding percentage, errors, and steals. I have no problem with AVG. RBI is completely useless since it is largely dependent on teammates. K, BB, and IBB are chosen because they correlate with home run hitting. But performance in this area is also correlated with other things such as plate discipline, and the authors are already looking at home runs. This just adds columns to the regression table that would have been better used doing robustness checks on the sample and control variables. I would have liked to have seen isolated power (SLG–AVG), HR/H, OBP, and OPS.
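For readers who don’t work with these numbers every day, here is a minimal sketch of how the metrics I would have preferred are computed from ordinary counting stats. The function name and the season line are made up for illustration; the formulas are the standard definitions (OBP here uses hits, walks, hit-by-pitches, and sacrifice flies).

```python
def rate_stats(ab, h, doubles, triples, hr, bb, hbp, sf):
    """Return AVG, OBP, SLG, ISO, OPS and HR/H from counting stats."""
    singles = h - doubles - triples - hr
    total_bases = singles + 2 * doubles + 3 * triples + 4 * hr
    avg = h / ab
    slg = total_bases / ab
    obp = (h + bb + hbp) / (ab + bb + hbp + sf)
    return {
        "AVG": avg,
        "OBP": obp,
        "SLG": slg,
        "ISO": slg - avg,   # isolated power: extra bases per at-bat
        "OPS": obp + slg,
        "HR/H": hr / h,     # share of hits that leave the park
    }


# Hypothetical season line, not a real player.
line = rate_stats(ab=500, h=150, doubles=30, triples=2, hr=25, bb=60, hbp=5, sf=5)
for name, value in line.items():
    print(f"{name}: {value:.3f}")
```

Unlike raw home run totals, ISO and HR/H are already scaled by playing time, which is why they are less sensitive to a low at-bat cutoff.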
As for the control variables, many of the choices are not intuitive. They include the batting average of the division (subtracting out own-team performance), the manager’s lifetime winning percentage, the batting park factor, years of experience (listed as a continuous variable in the text, but reported as a matrix of dummies in the regression tables), year effects, and dummies for each division. Also, the equation is estimated with fixed effects to control for individual player attributes.
I wouldn’t have chosen some of these same variables, but I don’t think they make much difference. However, I am perplexed by the inclusion of manager’s winning percentage and division dummies. I don’t see any obvious potential bias from the quality of the manager. In any event, managerial dummies are probably the better choice. Managers with players who perform better will have higher winning percentages, so a positive correlation is to be expected, but the causality is difficult to determine. However, this isn’t a huge issue.
The division dummies make no sense. The divisions changed their compositions at several points during the sample—the most extreme change occurs when a Central Division was added to both leagues in 1994—and there are no common rules or kinds of play that are really unique to any division. If there was such an effect, the batting average of the division and year effects should catch this. It would have made more sense to include league dummies, because of the significant differences in play between the leagues after the introduction of the DH in 1973. In any event, the authors state that the control variables do not alter the results. I would have liked to see some results with different controls.
Now, to the variable(s) of interest. When I initially looked at the study, I flipped to the regression tables first and noticed that there did not appear to be a “Canseco effect,” because the estimate on playing with Canseco was not statistically significant. But, that is not what the authors use to quantify Canseco’s impact; we are supposed to look at a second variable that identifies the seasons after playing with Canseco. The intuition is that “even if he did learn steroids from Canseco, we do not know when he learned about it during his time with Canseco, but we can be sure that he already acquired the knowledge after player with Canseco” (p. 10). I just don’t buy this. I understand that it might take a while for the effect to kick in, but this should still manifest itself in the “played with” variable, especially because many players played with Canseco for multiple seasons. At best this story makes sense only for guys who might have played one season with Canseco (more on this below). Second, anabolic steroids work quickly, so it’s unlikely that there would be a delayed effect.
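To make the two variables concrete, here is a rough sketch of how “with Canseco” and “after Canseco” indicators could be coded for one player’s seasons. This is my own illustrative coding, not necessarily the exact definition Gould and Kaplan use; the function name and the sample career are hypothetical.

```python
def canseco_dummies(seasons):
    """seasons: list of (year, played_with_canseco) pairs for one player.

    Returns {year: (with_canseco, after_canseco)}. In this sketch, "after"
    is 1 for any season later than the player's first Canseco season in
    which he is no longer Canseco's teammate (an assumed definition).
    """
    seasons = sorted(seasons)
    first_with = min((year for year, with_c in seasons if with_c), default=None)
    out = {}
    for year, with_c in seasons:
        after = first_with is not None and year > first_with and not with_c
        out[year] = (int(with_c), int(after))
    return out


# Hypothetical career: teammates in 1991 and 1992 only.
career = [(1990, False), (1991, True), (1992, True), (1993, False), (1994, False)]
print(canseco_dummies(career))
```

With player fixed effects, both coefficients are measured relative to the player’s own pre-Canseco seasons (1990 in this example), which is why a real effect should show up in the “with” seasons as well as the “after” seasons.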
After reading the paper, I came to the conclusion that the results are probably fragile. So, I designed a similar, but not identical, dataset. I did almost everything the authors did, except I did not break the sample into power and skilled players, and I included league dummies instead of division dummies, because I feel this is a superior choice. I also kicked out some partial seasons when guys switched teams to make life easier in developing the dataset. Thus, what I am doing is “replication” in the sense of looking for a similar result in the data, rather than trying to recreate the previous estimates. If the result is real, then I should find something similar. Here is what I found looking at raw home run totals (control variable estimates not reported).
| | HR (50 AB) | HR (200 AB) | HR/AB (50 AB) | HR/AB (200 AB) | AR(1) Corrected |
| --- | --- | --- | --- | --- | --- |
| With Canseco | -0.297 [0.66] | -0.199 [0.35] | -1.28E-03 [1.41] | -9.39E-04 [0.93] | -0.449 [0.87] |
| After Canseco | 0.667 [1.58] | 0.737 [1.34] | 3.49E-04 [0.41] | 6.28E-04 [0.65] | -0.204 [0.34] |
| Observations | 15,644 | 9,234 | 15,644 | 9,234 | 12,759 |
| Players | 2,885 | 1,717 | 2,885 | 1,717 | 2,265 |
| R-squared | 0.13 | 0.14 | 0.09 | 0.13 | 0.08 |

Absolute value of t statistics in brackets
The coefficient for playing with Canseco is negative and insignificant, and the after-Canseco coefficient is positive with a p-value of 0.12, which is above the standard (0.05) and lenient (0.1) thresholds for statistical significance. That is the best that I could get. When I up the at-bat minimum to the more appropriate 200, normalize home runs for at-bats, and both, “played with” is negative and never significant, and “after’s” p-value is never as low as it was in the specification that most closely resembles the study. Another potential problem that I encountered was serial correlation in the data. This is sometimes difficult to detect, and it is possible that it is a problem unique to my sample. However, when I correct for the problem, both Canseco variables consistently have high, insignificant p-values. So, though the authors find some evidence of an effect in the after variable in their sample, the finding appears not to be all that robust.
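If you want to go from the bracketed t statistics to approximate p-values yourself, here is a small sketch. It assumes the standard normal approximation, which is reasonable with thousands of observations but will not match regression software to the last digit; the function name is my own.

```python
import math


def two_sided_p(t_stat):
    """Approximate two-sided p-value using the standard normal distribution."""
    return math.erfc(abs(t_stat) / math.sqrt(2))


# t statistics from the first column of the table (raw HR, 50 AB minimum).
for label, t in [("With Canseco", 0.66), ("After Canseco", 1.58)]:
    print(f"{label}: t = {t:.2f}, approximate p = {two_sided_p(t):.3f}")
```

A t statistic needs to reach roughly 1.96 to clear the 0.05 threshold and roughly 1.64 to clear 0.1, and none of the “after Canseco” estimates gets there.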
The one thing that bothers me most about this study is that we have to interpret why the “after Canseco” variable is important, but the “during” variable is not as important. And I think the authors’ story really only applies to players who are with Canseco for one season. So, I ran some regressions using players who played with Canseco for only one year.
| | One-year | One-year, 10+ Career |
| --- | --- | --- |
| With Canseco | -2.656 [3.02]** | -3.450 [3.17]** |
| After Canseco | -2.562 [2.84]** | -3.027 [2.95]** |
| Observations | 1,200 | 940 |
| Players | 186 | 100 |
| R-squared | 0.18 | 0.23 |

Absolute value of t statistics in brackets
* significant at 5%; ** significant at 1%
The effects of playing with Canseco, both during and after, are strongly negative: about 2.5 fewer homers. However, if they only played one year with him, it could reflect that these players were not very good and were on their way out of the league. So, I limited the sample to players with careers of 10 or more seasons, and the result is a decline of about 3 HRs both with and after.
My point in offering this “replication” isn’t so much to say that my specifications are superior. I just want to show that the findings do not appear to be robust. To concur with the conclusions presented in the study you have to interpret the findings in a way that I do not believe is correct. Upon further examination, I believe the significant effect on home runs after playing with Canseco identified in the Gould and Kaplan study is a product of spurious correlation, and thus this tells us little about Canseco’s impact on disseminating steroids throughout baseball.
Yesterday, the Philadelphia Phillies acquired Roy Oswalt for J.A. Happ, Anthony Gose, and Jonathan Villar from the Houston Astros. The general reaction to the deal has been quite negative toward the return to Houston. Oswalt is an ace starter. Gose (whom the team immediately sent to Toronto for Brett Wallace) and Villar are low-minors prospects. How could Astros GM Ed Wade get so little in return?
It’s interesting that several years ago Ed Wade was on the other side of one of these supposed heists for the Phillies, acquiring Kevin Millwood for Johnny Estrada. It just so happens that the opening chapter of my book on valuing players, Hot Stove Economics (forthcoming in October), is titled “Why Johnny Estrada Is Worth Kevin Millwood: Valuing Players as Assets.” In the chapter, I explain how players so different in ability were swapped for each other without bringing stupidity into the equation. The difference was their salary requirements. While I can’t go into the details here, Millwood would receive $11 million the following year (which was way more than his expected worth), while Estrada would get less than $1 million for two years of service before being traded for two relievers.
Now we have Oswalt, who is owed just under $25 million for the remainder of his contract (the Phillies are kicking in an extra million for the buyout of his option). The Phillies are a winning team and thus value his performance much more than the Astros do, because there are increasing returns to winning. I estimate his expected performance through 2011 is worth about $28 million to the Phillies, a little over $3 million more than his salary obligations. The Astros are also sending along $11 million, which seems excessive until you remember the prospects. A year ago, Happ was pitching decently in the majors and was considered an untouchable commodity. He’s been injured, but injuries heal. Let’s assume that Happ pitched at his true performance level last year for the Phillies. Based on the value of his performance and his expected salary obligations (one more purely reserved year and three arbitration years), I estimate he’s worth about $12 million (discounted present value of performance for four years under team control). But his injury risk lowers his expected return somewhat. Then the other prospects come into the deal. They are at such a low level that I won’t try to project them from minor league stats (stats below high-A are close to useless), but they certainly have value.
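The back-of-the-envelope ledger looks something like the sketch below. All figures are the rough estimates from this post, in millions of dollars; I round Oswalt’s salary up to $25 million and treat the $12 million for Happ as value net of salary, both of which are simplifying assumptions, and the variable names are just for illustration.

```python
# Rough estimates from the post, in millions of dollars.
oswalt_value = 28.0      # estimated worth to the Phillies through 2011
oswalt_salary = 25.0     # "just under $25 million" owed, rounded up here
cash_from_astros = 11.0  # cash Houston sends along with Oswalt
happ_value = 12.0        # Happ's estimated worth, before any injury discount

# What the Phillies receive beyond what they must pay Oswalt.
oswalt_surplus = oswalt_value - oswalt_salary
phillies_receive = oswalt_surplus + cash_from_astros

# What is left over after giving up Happ; Gose and Villar (less any injury
# discount on Happ) would need to be worth about this much to balance the deal.
gap_for_prospects = phillies_receive - happ_value

print(f"Oswalt surplus: ${oswalt_surplus:.0f}M")
print(f"Phillies receive: ${phillies_receive:.0f}M")
print(f"Left for the two prospects to cover: ${gap_for_prospects:.0f}M")
```

Under these rough numbers the two sides are only a couple of million dollars apart before the low-minors prospects are counted, which is why the cash matters so much to the story.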
My point here isn’t to calculate the exact value to see who got the better end of this deal. I want to understand why this trade was made. And while a lot of people aren’t high on Ed Wade, he and his baseball people have some sense of what players are worth. I think the deal is defensible from the Astros’ perspective, especially considering that Oswalt has some post-season value to the Phillies that the Astros can’t capture without trading him to where his services are more highly valued.
Addendum: Bottom line, Oswalt is the superior player, but expensive. Happ and the other prospects are inferior, but cheap.
Thomas Lake has a nice retrospective article on Bobby Cox’s ejections in the current issue of Sports Illustrated. If you have read it, you might have seen my brief contribution.
FEW HUMAN endeavors have been studied so closely by so many people with such fascination for such a long time as the game of baseball. Historians, economists and statisticians scrutinize everything that happens and compare it with everything else that already happened, going back to 1871. This ocean of numbers can tell us a lot about Bobby Cox. For example: He makes pitchers better. J.C. Bradbury, author of the 2008 book The Baseball Economist: The Real Game Exposed, looked at pitchers who had thrown for multiple teams and compared their performances for Cox with their performances for other teams. Using a sophisticated technique called multiple regression analysis, Bradbury factored out variables such as hitter-friendly ballparks, league ERA differences, team defense and pitchers’ ages. What remained was a meaningful Cox Effect, worth about a quarter of a run every nine innings. (True, the Leo Mazzone Effect was even larger, but the Cox Effect existed even in the 14 years Mazzone wasn’t Cox’s pitching coach.)
I looked at pitchers with more than 30 innings pitched in a season and hitters with more than 100 plate appearances who played for Bobby Cox and at least one other manager. The tables below report the estimates. The performance numbers are park corrected.
| Variable | ERA |
| --- | --- |
| Bobby Cox | -0.256 (3.95)** |
| Career ERA | 0.833 (16.36)** |
| LgERA | 0.249 (2.71)** |
| Tm BABIP | 10.839 (4.12)** |
| Age | -0.341 (6.10)** |
| Age2 | 0.006 (6.28)** |
| Constant | 1.686 (1.61) |
| Observations | 1519 |
| R-squared | 0.29 |

Robust t statistics in parentheses
* significant at 5%; ** significant at 1%
| Variable | OPS |
| --- | --- |
| Bobby Cox | -0.006 (1.24) |
| Career OPS | 0.935 (42.88)** |
| LgOPS | 0.415 (6.48)** |
| Age | 0.028 (4.98)** |
| Age2 | -0.00046 (5.01)** |
| Constant | -0.670 (7.00)** |
| Observations | 1833 |
| R-squared | 0.52 |

Robust t statistics in parentheses
* significant at 5%; ** significant at 1%
PrOPS Leaders

| Rank | Player | Team | PrAVE | PrOBP | PrSLG | PrOPS |
| --- | --- | --- | --- | --- | --- | --- |
| 1 | Carlos Gonzalez | COL | 0.321 | 0.411 | 0.572 | 0.984 |
| 2 | Miguel Cabrera | DET | 0.316 | 0.386 | 0.594 | 0.980 |
| 3 | Joey Votto | CIN | 0.297 | 0.388 | 0.567 | 0.955 |
| 4 | Justin Morneau | MIN | 0.308 | 0.392 | 0.563 | 0.955 |
| 5 | Vladimir Guerrero | TEX | 0.309 | 0.374 | 0.570 | 0.943 |
| 6 | Corey Hart | MIL | 0.285 | 0.371 | 0.570 | 0.941 |
| 7 | Albert Pujols | STL | 0.305 | 0.381 | 0.559 | 0.940 |
| 8 | Adrian Beltre | BOS | 0.317 | 0.405 | 0.524 | 0.929 |
| 9 | Carlos Quentin | CHW | 0.284 | 0.376 | 0.553 | 0.929 |
| 10 | Paul Konerko | CHW | 0.292 | 0.368 | 0.553 | 0.921 |
| 11 | Josh Hamilton | TEX | 0.286 | 0.358 | 0.557 | 0.915 |
| 12 | Andre Ethier | LAD | 0.301 | 0.379 | 0.528 | 0.907 |
| 13 | Ian Stewart | COL | 0.300 | 0.395 | 0.512 | 0.907 |
| 14 | Torii Hunter | LAA | 0.311 | 0.386 | 0.520 | 0.907 |
| 15 | Jose Bautista | TOR | 0.261 | 0.351 | 0.555 | 0.906 |
| 16 | Magglio Ordonez | DET | 0.332 | 0.386 | 0.519 | 0.905 |
| 17 | David Ortiz | BOS | 0.264 | 0.364 | 0.539 | 0.903 |
| 18 | Robinson Cano | NYY | 0.306 | 0.386 | 0.516 | 0.903 |
| 19 | Miguel Olivo | COL | 0.290 | 0.378 | 0.521 | 0.899 |
| 20 | Vernon Wells | TOR | 0.288 | 0.362 | 0.537 | 0.899 |
| 21 | Matt Holliday | STL | 0.301 | 0.379 | 0.517 | 0.897 |
| 22 | Adrian Gonzalez | SDP | 0.288 | 0.370 | 0.524 | 0.894 |
| 23 | Kevin Youkilis | BOS | 0.277 | 0.374 | 0.519 | 0.894 |
| 24 | Adam Dunn | WSN | 0.255 | 0.359 | 0.534 | 0.894 |
| 25 | Aubrey Huff | SFG | 0.292 | 0.368 | 0.520 | 0.888 |
| 26 | Ryan Howard | PHI | 0.281 | 0.375 | 0.512 | 0.887 |
| 27 | Mike Napoli | LAA | 0.277 | 0.378 | 0.507 | 0.885 |
| 28 | Brennan Boesch | DET | 0.294 | 0.364 | 0.517 | 0.881 |
| 29 | Scott Rolen | CIN | 0.276 | 0.360 | 0.521 | 0.880 |
| 30 | Prince Fielder | MIL | 0.273 | 0.367 | 0.512 | 0.880 |
Second-half rebounds coming?
Top-30 Under-Performers

| Rank | Player | Team | OPS | PrOPS | Diff |
| --- | --- | --- | --- | --- | --- |
| 1 | Yadier Molina | STL | 0.595 | 0.744 | -0.149 |
| 2 | Justin Smoak | TOT | 0.657 | 0.789 | -0.132 |
| 3 | Adam Lind | TOR | 0.640 | 0.768 | -0.128 |
| 4 | Carlos Lee | HOU | 0.682 | 0.807 | -0.125 |
| 5 | Hunter Pence | HOU | 0.743 | 0.867 | -0.124 |
| 6 | Jose Lopez | SEA | 0.610 | 0.732 | -0.122 |
| 7 | Ian Stewart | COL | 0.788 | 0.907 | -0.119 |
| 8 | Skip Schumaker | STL | 0.642 | 0.761 | -0.119 |
| 9 | Juan Rivera | LAA | 0.708 | 0.818 | -0.110 |
| 10 | Carlos Gonzalez | COL | 0.878 | 0.984 | -0.106 |
| 11 | Derek Jeter | NYY | 0.732 | 0.836 | -0.104 |
| 12 | Pedro Feliz | HOU | 0.546 | 0.648 | -0.102 |
| 13 | Cesar Izturis | BAL | 0.569 | 0.670 | -0.101 |
| 14 | Aaron Hill | TOR | 0.631 | 0.732 | -0.101 |
| 15 | Mike Napoli | LAA | 0.786 | 0.885 | -0.099 |
| 16 | Kurt Suzuki | OAK | 0.716 | 0.812 | -0.096 |
| 17 | Aramis Ramirez | CHC | 0.648 | 0.743 | -0.095 |
| 18 | Todd Helton | COL | 0.646 | 0.737 | -0.091 |
| 19 | Aaron Rowand | SFG | 0.681 | 0.764 | -0.083 |
| 20 | Alcides Escobar | MIL | 0.630 | 0.713 | -0.083 |
| 21 | Orlando Cabrera | CIN | 0.612 | 0.690 | -0.078 |
| 22 | Carlos Pena | TBR | 0.737 | 0.812 | -0.075 |
| 23 | Russell Martin | LAD | 0.679 | 0.752 | -0.073 |
| 24 | Clint Barmes | COL | 0.721 | 0.791 | -0.070 |
| 25 | Miguel Tejada | BAL | 0.691 | 0.761 | -0.070 |
| 26 | Howie Kendrick | LAA | 0.708 | 0.778 | -0.070 |
| 27 | Ty Wigginton | BAL | 0.768 | 0.837 | -0.069 |
| 28 | Shane Victorino | PHI | 0.766 | 0.835 | -0.069 |
| 29 | Jorge Cantu | FLA | 0.734 | 0.798 | -0.064 |
| 30 | Juan Uribe | SFG | 0.758 | 0.821 | -0.063 |
Second-half declines on the way?
Top-30 Over-Performers

| Rank | Player | Team | OPS | PrOPS | Diff |
| --- | --- | --- | --- | --- | --- |
| 1 | Ian Kinsler | TEX | 0.831 | 0.688 | 0.143 |
| 2 | Carl Crawford | TBR | 0.901 | 0.774 | 0.127 |
| 3 | Andres Torres | SFG | 0.861 | 0.736 | 0.125 |
| 4 | Nick Markakis | BAL | 0.847 | 0.726 | 0.121 |
| 5 | Brennan Boesch | DET | 0.990 | 0.881 | 0.109 |
| 6 | Justin Morneau | MIN | 1.055 | 0.955 | 0.100 |
| 7 | Rafael Furcal | LAD | 0.898 | 0.798 | 0.100 |
| 8 | Josh Hamilton | TEX | 1.014 | 0.915 | 0.099 |
| 9 | Evan Longoria | TBR | 0.895 | 0.796 | 0.099 |
| 10 | David DeJesus | KCR | 0.855 | 0.760 | 0.095 |
| 11 | Miguel Cabrera | DET | 1.074 | 0.980 | 0.094 |
| 12 | Fred Lewis | TOR | 0.779 | 0.689 | 0.090 |
| 13 | Cliff Pennington | OAK | 0.726 | 0.637 | 0.089 |
| 14 | Kevin Youkilis | BOS | 0.981 | 0.894 | 0.087 |
| 15 | Jayson Werth | PHI | 0.881 | 0.796 | 0.085 |
| 16 | Ben Zobrist | TBR | 0.783 | 0.699 | 0.084 |
| 17 | Ichiro Suzuki | SEA | 0.785 | 0.704 | 0.081 |
| 18 | Angel Pagan | NYM | 0.845 | 0.769 | 0.076 |
| 19 | Troy Tulowitzki | COL | 0.877 | 0.806 | 0.071 |
| 20 | Andrew McCutchen | PIT | 0.798 | 0.727 | 0.071 |
| 21 | David Wright | NYM | 0.924 | 0.854 | 0.070 |
| 22 | Billy Butler | KCR | 0.873 | 0.805 | 0.068 |
| 23 | Adam Dunn | WSN | 0.959 | 0.894 | 0.065 |
| 24 | Daric Barton | OAK | 0.772 | 0.708 | 0.064 |
| 25 | Jason Bay | NYM | 0.779 | 0.720 | 0.059 |
| 26 | Blake DeWitt | LAD | 0.728 | 0.670 | 0.058 |
| 27 | Kelly Johnson | ARI | 0.870 | 0.813 | 0.057 |
| 28 | Joey Votto | CIN | 1.011 | 0.955 | 0.056 |
| 29 | Josh Willingham | WSN | 0.913 | 0.857 | 0.056 |
| 30 | Lastings Milledge | PIT | 0.739 | 0.686 | 0.053 |
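The Diff column in these lists is simply actual OPS minus PrOPS, sorted. Here is a minimal sketch of that bookkeeping using four rows copied from the tables above; it does not compute PrOPS itself, and the variable names are my own.

```python
# (player, OPS, PrOPS) for a few rows taken from the tables above.
rows = [
    ("Yadier Molina", 0.595, 0.744),
    ("Ian Kinsler", 0.831, 0.688),
    ("Carl Crawford", 0.901, 0.774),
    ("Justin Smoak", 0.657, 0.789),
]

# Negative Diff: hitting below what PrOPS predicts (rebound candidate).
# Positive Diff: hitting above what PrOPS predicts (decline candidate).
diffs = sorted(
    ((name, round(ops - props, 3)) for name, ops, props in rows),
    key=lambda row: row[1],
)

for name, diff in diffs:
    print(f"{name:15s} {diff:+.3f}")
```

Sorting ascending puts the biggest under-performers first; reversing the sort gives the over-performer list.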
Buster Olney ($) has a piece this morning in which he discusses the potential free-agent value of Prince Fielder after his agent Scott Boras made some comparisons to Mark Teixeira. Olney points to the Fielder in the living room when making such comparisons, and notes that several MLB insiders feel his weight is going to prevent him from aging as gracefully as most players. Fielder is so heavy that it’s hard to know what to expect. I think he will ultimately be a DH, and this may keep him in the game longer.
Yet despite his weight, which many talent evaluators thought would keep him from excelling at all, he has been an elite and valuable hitter. If he ages like the average player (possibly a dubious assumption, but it’s hard to know what to expect) and signs a five-year deal (equivalent in length to Ryan Howard’s extension) after the 2011 season, I estimate the value of the deal in total dollars paid out would be $104 million, or a little under $21 million per year. It’s not quite Teixeira money, but it’s in the neighborhood. Concerns about his weight, justified or not, will probably prevent him from signing a deal this long, but I guess we’ll just have to “weight” and see.
Top-30 Over-Performers

| Rank | Player | Team | OPS | PrOPS | Diff | PA |
| --- | --- | --- | --- | --- | --- | --- |
| 1 | Andres Torres | SFG | 0.814 | 0.680 | 0.134 | 269 |
| 2 | Ian Kinsler | TEX | 0.811 | 0.684 | 0.127 | 244 |
| 3 | Carl Crawford | TBR | 0.869 | 0.742 | 0.127 | 322 |
| 4 | Nick Markakis | BAL | 0.821 | 0.699 | 0.122 | 340 |
| 5 | Justin Morneau | MIN | 1.059 | 0.938 | 0.121 | 327 |
| 6 | David DeJesus | KCR | 0.875 | 0.756 | 0.119 | 330 |
| 7 | Andrew McCutchen | PIT | 0.825 | 0.710 | 0.115 | 332 |
| 8 | Josh Hamilton | TEX | 0.993 | 0.880 | 0.113 | 328 |
| 9 | Jayson Werth | PHI | 0.919 | 0.813 | 0.106 | 308 |
| 10 | Daric Barton | OAK | 0.798 | 0.692 | 0.106 | 352 |
| 11 | Kevin Youkilis | BOS | 0.983 | 0.878 | 0.105 | 322 |
| 12 | Ichiro Suzuki | SEA | 0.813 | 0.716 | 0.097 | 351 |
| 13 | Ben Zobrist | TBR | 0.797 | 0.710 | 0.087 | 336 |
| 14 | Franklin Gutierrez | SEA | 0.767 | 0.681 | 0.086 | 311 |
| 15 | Lastings Milledge | PIT | 0.715 | 0.634 | 0.081 | 263 |
| 16 | Jason Bay | NYM | 0.812 | 0.732 | 0.080 | 323 |
| 17 | Fred Lewis | TOR | 0.774 | 0.695 | 0.079 | 272 |
| 18 | Brandon Phillips | CIN | 0.841 | 0.766 | 0.075 | 357 |
| 19 | Troy Tulowitzki | COL | 0.877 | 0.806 | 0.071 | 265 |
| 20 | Evan Longoria | TBR | 0.870 | 0.803 | 0.067 | 342 |
| 21 | Colby Rasmus | STL | 0.921 | 0.856 | 0.065 | 275 |
| 22 | Miguel Cabrera | DET | 1.040 | 0.976 | 0.064 | 325 |
| 23 | Brett Gardner | NYY | 0.811 | 0.747 | 0.064 | 278 |
| 24 | Cliff Pennington | OAK | 0.704 | 0.644 | 0.060 | 296 |
| 25 | Adam Dunn | WSN | 0.917 | 0.858 | 0.059 | 327 |
| 26 | Johnny Damon | DET | 0.753 | 0.695 | 0.058 | 302 |
| 27 | Elvis Andrus | TEX | 0.706 | 0.649 | 0.057 | 344 |
| 28 | David Wright | NYM | 0.929 | 0.874 | 0.055 | 338 |
| 29 | Martin Prado | ATL | 0.857 | 0.803 | 0.054 | 367 |
| 30 | Albert Pujols | STL | 0.989 | 0.936 | 0.053 | 346 |
Top-30 Under-Performers

| Rank | Player | Team | OPS | PrOPS | Diff | PA |
| --- | --- | --- | --- | --- | --- | --- |
| 1 | Hunter Pence | HOU | 0.730 | 0.876 | -0.146 | 313 |
| 2 | Ian Stewart | COL | 0.738 | 0.866 | -0.128 | 270 |
| 3 | Yadier Molina | STL | 0.615 | 0.742 | -0.127 | 267 |
| 4 | Carlos Lee | HOU | 0.669 | 0.796 | -0.127 | 319 |
| 5 | Jose Lopez | SEA | 0.603 | 0.726 | -0.123 | 325 |
| 6 | Adam Lind | TOR | 0.608 | 0.729 | -0.121 | 322 |
| 7 | Skip Schumaker | STL | 0.655 | 0.768 | -0.113 | 288 |
| 8 | Justin Smoak | TEX | 0.697 | 0.800 | -0.103 | 250 |
| 9 | Derek Jeter | NYY | 0.754 | 0.857 | -0.103 | 361 |
| 10 | Carlos Gonzalez | COL | 0.825 | 0.925 | -0.100 | 301 |
| 11 | Juan Rivera | LAA | 0.725 | 0.820 | -0.095 | 258 |
| 12 | Pedro Feliz | HOU | 0.572 | 0.664 | -0.092 | 255 |
| 13 | Todd Helton | COL | 0.657 | 0.749 | -0.092 | 281 |
| 14 | Aaron Hill | TOR | 0.642 | 0.719 | -0.077 | 287 |
| 15 | Carlos Pena | TBR | 0.728 | 0.804 | -0.076 | 323 |
| 16 | Clint Barmes | COL | 0.706 | 0.781 | -0.075 | 257 |
| 17 | Mike Napoli | LAA | 0.838 | 0.912 | -0.074 | 262 |
| 18 | Derrek Lee | CHC | 0.699 | 0.772 | -0.073 | 334 |
| 19 | Miguel Tejada | BAL | 0.695 | 0.768 | -0.073 | 325 |
| 20 | Jason Bartlett | TBR | 0.631 | 0.702 | -0.071 | 258 |
| 21 | Alcides Escobar | MIL | 0.640 | 0.710 | -0.070 | 282 |
| 22 | Orlando Cabrera | CIN | 0.625 | 0.692 | -0.067 | 337 |
| 23 | Russell Martin | LAD | 0.678 | 0.743 | -0.065 | 300 |
| 24 | Carlos Quentin | CHW | 0.784 | 0.848 | -0.064 | 279 |
| 25 | Shane Victorino | PHI | 0.767 | 0.829 | -0.062 | 346 |
| 26 | Melky Cabrera | ATL | 0.653 | 0.715 | -0.062 | 265 |
| 27 | Howie Kendrick | LAA | 0.718 | 0.779 | -0.061 | 336 |
| 28 | A.J. Pierzynski | CHW | 0.651 | 0.711 | -0.060 | 250 |
| 29 | Ty Wigginton | BAL | 0.808 | 0.865 | -0.057 | 299 |
| 30 | Mark Teixeira | NYY | 0.757 | 0.812 | -0.055 | 354 |
Edwin Jackson threw a bit of a lame no-hitter on Friday. I’m sorry if it offends you when I call such a hallowed feat lame, but eight walks in a game for a major-league pitcher is bad (see Pulling a Homer). But aside from this, one aspect of his performance has gotten a lot of attention: 149 pitches thrown. This is the highest pitch count allowed in a game since 2005 (see my previous post on how pitch counts have changed over the past two decades).
I have been conducting a study of pitch counts with Sean Forman, and we will be presenting our findings at the upcoming SABR convention in Atlanta. But since it’s applicable to Jackson’s situation, I’ll reveal some of the findings. Our study uses past pitching performances to estimate the impact of pitch counts on future performance, controlling for numerous factors, using fractional polynomial regression analysis to capture potential non-linear relationships. The results indicate that the impact of the pitch count in a single game on the following game is real but small; and, the impact is linear, not increasing as some analysts have theorized.
On average, every pitch thrown raises a pitcher’s ERA by 0.007 in the following game. Jackson’s ERA was 5.05 going into Friday’s game, and he was averaging 104 pitches per game; thus, based on the historical response of pitchers to pitch counts, Jackson’s expected ERA in his next start is about 5.37. So, Jackson can be expected to pitch worse, but not that much worse. Really, it’s not that big of a deal as a one-time event. Should Jackson continue to average around 150 pitches a game, the impact will grow, but I doubt that is going to happen. As for the impact on injuries, we didn’t look into it in this study. However, I have previously found little correlation between pitching loads and injury.
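The arithmetic behind that number can be sketched as follows. This assumes the 0.007 estimate is applied to the pitches Jackson threw beyond his usual 104, which is my reading of how the figure works out; the function and constant names are illustrative, not part of the study.

```python
ERA_PER_PITCH = 0.007  # estimated change in next-start ERA per pitch thrown


def expected_next_era(current_era, usual_pitches, pitches_thrown):
    """Linear adjustment: each pitch beyond the usual workload adds 0.007."""
    extra_pitches = pitches_thrown - usual_pitches
    return current_era + ERA_PER_PITCH * extra_pitches


jackson = expected_next_era(current_era=5.05, usual_pitches=104, pitches_thrown=149)
print(f"Expected ERA in next start: {jackson:.3f}")
```

The 45 extra pitches add about 0.3 runs per nine innings, which is the 5.37 quoted above after rounding.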
My take: if you have a pitcher going for a no-hitter, no matter how badly he’s pitching, the benefit of the excitement and media coverage from letting him throw more pitches is probably worth the cost. Let’s stop freaking out about pitch counts until we understand their influence a little better.
Update: In response to Jackson’s high pitch count, the Diamondbacks will push back his next start a day or two. How much will this help him recover? Not much. On average, each day of rest lowers a pitcher’s ERA by approximately 0.015. Thus, his expected ERA drops from 5.37 to 5.34 (with two days of extra rest). Why rest days matter so little is an interesting question. A few years ago, I saw a presentation on muscle recovery from exercise, and one of the interesting findings was that most of the healing happens within the first few days. Whether this explains the finding or not, I don’t know.
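To put the two estimates side by side, here is a rough sketch of the rest-day adjustment, plus a purely illustrative calculation of how many extra rest days it would take to cancel out the 45 extra pitches if both effects really were linear. That second number extrapolates well beyond anything the study measures, so treat it as a sense of scale only; the names are my own.

```python
ERA_PER_PITCH = 0.007     # ERA added per pitch thrown, from the post
ERA_PER_REST_DAY = 0.015  # ERA removed per extra day of rest, from the post


def adjust_for_rest(expected_era, extra_rest_days):
    """Lower the expected ERA by 0.015 for each extra day of rest."""
    return expected_era - ERA_PER_REST_DAY * extra_rest_days


def rest_days_to_offset(extra_pitches):
    """Extra rest days needed to cancel extra pitches, assuming linearity."""
    return ERA_PER_PITCH * extra_pitches / ERA_PER_REST_DAY


with_two_days = adjust_for_rest(5.37, 2)
days_needed = rest_days_to_offset(149 - 104)
print(f"Expected ERA with two extra rest days: {with_two_days:.2f}")
print(f"Rest days needed to offset 45 extra pitches: {days_needed:.0f}")
```

A day or two of extra rest barely moves the needle compared with the pitch-count effect, which is itself small.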