Archive for Pitching
What Edwin Jackson’s Pitch Count Hath Wrought
Edwin Jackson threw a bit of a lame no-hitter on Friday. I’m sorry if it offends you when I call such a hallowed feat lame, but eight walks in a game for a major-league pitcher is bad (see Pulling a Homer). But aside from this, one aspect of his performance has gotten a lot of attention: 149 pitches thrown. This is the highest pitch count allowed in a game since 2005 (see my previous post on how pitch counts have changed over the past two decades).
I have been conducting a study of pitch counts with Sean Forman, and we will be presenting our findings at the upcoming SABR convention in Atlanta. But since it’s applicable to Jackson’s situation, I’ll reveal some of the findings. Our study uses past pitching performances to estimate the impact of pitch counts on future performance, controlling for numerous factors, using fractional polynomial regression analysis to capture potential non-linear relationships. The results indicate that the impact of the pitch count in a single game on the following game is real but small; and, the impact is linear, not increasing as some analysts have theorized.
On average, every pitch thrown raises a pitcher’s ERA by 0.007 in the following game. Jackson’s ERA was 5.05 going into Friday’s game, and he had been averaging 104 pitches per game; thus, based on the historical response of pitchers to pitch counts, Jackson’s expected ERA in his next start is about 5.37. So, Jackson can be expected to pitch worse, but not that much worse. Really, it’s not that big of a deal as a one-time event. Should Jackson continue to average around 150 pitches a game, the impact will grow, but I doubt that is going to happen. As for the impact on injuries, we didn’t look into it in this study. However, I have previously found little correlation between pitching loads and injury.
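For anyone who wants to check the arithmetic, here is a minimal Python sketch. It assumes the 0.007 per-pitch effect is applied to the pitches thrown beyond Jackson’s usual 104, which is the reading that reproduces the figure of about 5.37. The function and variable names are mine and are for illustration only.

```python
# Per-pitch effect on next-game ERA, as quoted in the post.
ERA_PER_PITCH = 0.007


def expected_next_era(baseline_era, usual_pitches, pitches_thrown,
                      era_per_pitch=ERA_PER_PITCH):
    """Expected ERA in the next start under a linear pitch-count effect.

    Assumption: only the pitches beyond the pitcher's usual count move
    the estimate away from his baseline ERA.
    """
    extra_pitches = pitches_thrown - usual_pitches
    return baseline_era + era_per_pitch * extra_pitches


# Jackson: 5.05 ERA, 104 pitches per game on average, 149 pitches on Friday.
jackson_next = expected_next_era(5.05, 104, 149)
print(f"Extra pitches: {149 - 104}")
print(f"Expected ERA in next start: {jackson_next:.3f}")
```

The 45 extra pitches add 0.315 to the baseline, which is where the estimate of roughly 5.37 comes from.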
My take: if you have a pitcher going for a no-hitter, no matter how badly he’s pitching, the benefit of the excitement and media coverage from letting him throw more pitches is probably worth the cost. Let’s stop freaking out about pitch counts until we understand their influence a little better.
Update: In response to Jackson’s high pitch count, the Diamondbacks will push back his next start a day or two. How much will this help him recover? Not much. On average, each day of rest lowers a pitcher’s ERA by approximately 0.015. Thus, his expected ERA drops from 5.37 to 5.34 (with two days of extra rest). Why rest days matter so little is an interesting question. A few years ago, I saw a presentation on muscle recovery from exercise, and one of the interesting findings was that most of the healing happens within the first few days. Whether this explains the finding or not, I don’t know.
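The rest-day adjustment is the same kind of back-of-the-envelope calculation. The sketch below assumes the 0.015 effect simply scales with the number of extra rest days; the names are illustrative and not part of the study.

```python
# Per-day effect of extra rest on ERA, as quoted in the update.
ERA_PER_REST_DAY = -0.015


def era_after_extra_rest(expected_era, extra_rest_days,
                         era_per_rest_day=ERA_PER_REST_DAY):
    """Adjust an expected ERA for extra days of rest (assumed linear)."""
    return expected_era + era_per_rest_day * extra_rest_days


# Starting from the expected ERA of about 5.37, with one or two extra days.
one_day = era_after_extra_rest(5.37, 1)
two_days = era_after_extra_rest(5.37, 2)
print(f"One extra day:  {one_day:.3f}")
print(f"Two extra days: {two_days:.3f}")
```

Two extra days shave 0.03 off the estimate, which is why the delay barely moves the needle.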
How Have Pitch Counts Changed in the Past 20 Years?
Last week I asked the following question:
From 1988 to 2009, by how many pitches did the median number of pitches thrown in a game by starters change?
The answer: the median number of pitches declined by one, falling from 100 to 99. As the box plot below shows, the median has remained close to constant over the past two decades. The line in the middle of each box marks the median, the edges of the box mark the 25th–75th percentile range, and the whiskers mark the 5th–95th percentile range. If you are wondering how the mean changed, it declined from 97.4 to 96.5.
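The numbers behind a box plot like this one are easy to compute. The sketch below uses Python’s standard library on a made-up list of pitch counts (hypothetical values, not the data from the study) and pulls out the median, the 25th–75th percentile box, and the 5th–95th percentile whiskers.

```python
import statistics


def box_plot_summary(pitch_counts):
    """Return the percentiles described for the box plot, plus the mean."""
    # quantiles(n=20) returns 19 cut points at 5%, 10%, ..., 95%.
    cuts = statistics.quantiles(pitch_counts, n=20, method="inclusive")
    return {
        "p5": cuts[0],       # lower whisker
        "p25": cuts[4],      # bottom edge of the box
        "median": cuts[9],   # line in the middle of the box
        "p75": cuts[14],     # top edge of the box
        "p95": cuts[18],     # upper whisker
        "mean": statistics.fmean(pitch_counts),
    }


# Hypothetical pitch counts for one season's starts (illustration only).
hypothetical_counts = [62, 78, 85, 91, 94, 97, 99, 100,
                       101, 103, 105, 108, 110, 114, 121, 133]

summary = box_plot_summary(hypothetical_counts)
for name, value in summary.items():
    print(f"{name}: {value:.1f}")
```

Running the same function on each season’s starts would give one box per year.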

Does this mean that, despite all the lip service paid to pitch limits, teams aren’t paying any more attention to pitch counts than they used to? Not at all. The average may have stayed the same, but the extremes have fallen on the high and low sides. Pitchers aren’t just throwing fewer long outings, they are also pitching fewer short outings. The diagram below graphs the maximum pitches thrown in a game by year, and it shows a significant drop.

So, congratulations to Dan for correctly guessing the decline. Dan wins a copy of Stumbling on Wins by David Berri and Martin Schmidt.
More Testing of the Verducci Effect
After doing my analysis of the Verducci Effect yesterday, I became aware of Jeremy Greenhouse’s analysis on the subject. He uses a different method, but also finds little support for the Verducci Effect. His analysis pointed me to Josh Hermsmeyer’s Free Player Injury Database, which is a valuable new resource. The database contains injury information dating back to the 2002 season. Because the Verducci Effect is largely about predicting injuries, I wanted to see how player workloads predicted time on the Disabled List (DL). If significantly increasing pitcher workloads raises the incidence of future injuries, then pitchers who meet Verducci’s criteria should be more likely to get injured.
The table below lists the estimates for the impact of the Verducci Effect on DL stints. I estimated several models (including continuous estimates of pitcher workload), but I report only four specifications below because the results are consistent with the unreported estimates. I looked at the number of days on the DL (continuous) and whether or not a player ended up on the DL (discrete) using random-effects estimation models, least-squares for the former and logit for the latter. I also included the number of days on the DL in the preceding seasons in two specifications to control for the natural injury propensity of players.
                 DL Days     DL Days     On DL       On DL
Verducci         4.27        -1.89       0.28        0.06
                 [0.76]      [0.59]      [0.66]      [0.12]
Mean IP          -0.19       -0.16       -0.006      -0.003
                 [9.33]**    [12.75]**   [4.69]**    [2.01]*
DL Days (t-1)                0.64                    0.10
                             [54.16]**               [14.06]**
Constant         37.67       29.48       -0.50       -1.68
                 [14.77]**   [18.60]**   [3.23]**    [8.69]**
Observations     1428        1428        1428        1428
Overall R2       0.04        0.63        --          --
Absolute value of z statistics in brackets
* significant at 5%; ** significant at 1%
Again, the results do not support the existence of the Verducci Effect. The estimates are small and not statistically significant. A change in workload by more than 30 innings for pitchers under 26 is not associated with more days on or trips to the DL. I would like to reiterate that there needs to be further testing of the Verducci Effect, but so far there doesn’t appear to be much empirical support for the hypothesis.
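If you would rather see p-values than z statistics, the conversion is short. The sketch below assumes the bracketed z statistics can be read against a standard normal distribution and computes two-sided p-values for the four Verducci estimates in the table above. The column labels in the code are my own shorthand for the four specifications.

```python
import math


def two_sided_p_value(z):
    """Two-sided p-value for a z statistic under a standard normal."""
    return math.erfc(abs(z) / math.sqrt(2))


# Absolute z statistics on the Verducci indicator, read off the table.
verducci_z = {
    "DL Days, column 1": 0.76,
    "DL Days, column 2": 0.59,
    "On DL, column 3": 0.66,
    "On DL, column 4": 0.12,
}

p_values = {label: two_sided_p_value(z) for label, z in verducci_z.items()}
for label, p in p_values.items():
    verdict = "significant" if p < 0.05 else "not significant"
    print(f"{label}: p = {p:.3f} ({verdict} at 5%)")
```

None of the four comes anywhere close to the 5% threshold, which corresponds to a z statistic of about 1.96.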
Testing the Verducci Effect
For some reason, the Verducci Effect seems to be getting a lot of attention right now. I recall it being mentioned in the past, but I haven’t paid much attention to it. The effect is named for Sports Illustrated writer Tom Verducci, who came up with the concept but didn’t pick the name. Verducci uses a set of criteria to identify pitchers who are at risk for injury due to a significant increase in workload. He describes the criteria for selection and rationale in an article published this week.
More than a decade ago, with the help of then-Oakland pitching coach Rick Peterson, I began tracking one element of overuse which seemed entirely avoidable: working young pitchers too much too soon. Pitchers not yet fully conditioned and physically matured were at risk if clubs asked them to pitch far more innings than they did the previous season — like asking a 10K runner to crank out a marathon. The task wasn’t impossible, but the after-effects were debilitating. I defined an at-risk pitcher as any 25-and-under pitcher who increased his innings log by more than 30 in a year in which he pitched in the big leagues. Each year the breakdown rate of such red-flagged pitchers — either by injury or drop in performance — was staggering.
I figured now would be as good a time as any to put off the other important things I should be doing in order to find out if the Verducci Effect is real. I used a sample of major-league pitchers from 1998–2007 to estimate the impact of ratcheting up pitching loads on innings pitched and ERA, using both their recent major-league and minor-league workloads to predict performance. In some specifications I included the average of the present and past seasons’ performances (Mean IP or mean ERA) to peg a typical performance level for each pitcher. The Verducci Effect was considered to be in force if a pitcher was under 26 and had increased his workload by more than 30 innings in the previous year. I also measured the Verducci Effect continuously using the actual increase in innings pitched in the preceding season. I only looked at performance in the majors, but minor-league workload totals counted toward the Verducci Effect. I estimated the impact using a random-effects estimation technique that controlled for detected serial correlation. The regression estimates are below, but if you’re not familiar with reading such tables you can skip over them and read my write-up that follows.
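To make the selection rule concrete, here is a minimal Python sketch of the two workload measures described above: the yes/no Verducci flag and the continuous innings increase. The function names and the example pitchers are hypothetical, and the innings totals are assumed to already combine major-league and minor-league work.

```python
def workload_increase(innings_prior, innings_latest):
    """Continuous measure: change in total innings between two seasons."""
    return innings_latest - innings_prior


def verducci_flag(age, innings_prior, innings_latest,
                  max_age=25, threshold=30.0):
    """True if a pitcher meets the criteria as described in the text:
    25 or younger and a workload increase of MORE than 30 innings."""
    increase = workload_increase(innings_prior, innings_latest)
    return age <= max_age and increase > threshold


# Hypothetical pitchers: (age, innings two seasons ago, innings last season).
examples = {
    "young, big jump": (24, 120.0, 165.0),
    "young, exactly 30": (24, 120.0, 150.0),
    "older, big jump": (28, 120.0, 165.0),
}

for label, (age, prior, latest) in examples.items():
    print(label, workload_increase(prior, latest),
          verducci_flag(age, prior, latest))
```

Only the first hypothetical pitcher is flagged: the second falls exactly on the threshold rather than over it, and the third is too old.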
                       IP Change   IP Change   IP Change   IP Change
Verducci               19.07       22.17
                       [3.18]**    [3.73]**
IP Change * Under 26                           0.23        0.21
                                               [3.37]**    [3.15]**
IP Change                                      -0.25       -0.17
                                               [10.41]**   [7.22]**
Under 26                                       14.89       17.04
                                               [4.46]**    [5.29]**
Mean IP                0.06                    0.13
                       [3.96]**                [6.98]**
Constant               -12.23      -4.83       -21.97      -6.61
                       [5.78]**    [4.90]**    [8.83]**    [5.98]**
Observations           2383        2383        2316        2316
Overall R2             0.0122      0.0058      0.0379      0.0257
Absolute value of z statistics in brackets
* significant at 5%; ** significant at 1%
                       ERA Change  ERA Change  ERA Change  ERA Change
Verducci               -0.09600    -0.10295
                       [0.21]      [0.22]
IP Change * Under 26                           -0.00391    -0.00386
                                               [0.78]      [0.77]
IP Change                                      0.00611     0.00609
                                               [3.71]**    [3.74]**
Under 26                                       -0.24738    -0.25085
                                               [0.93]      [0.95]
Mean IP                0.47554                 0.00684
                       [13.67]**               [0.17]
Constant               -1.90261    0.49064     0.36013     0.39538
                       [8.05]**    [2.98]**    [1.50]      [2.86]**
Observations           2380        2380        2313        2313
Overall R2             0.0707      0.0000      0.0034      0.0038
Absolute value of z statistics in brackets
* significant at 5%; ** significant at 1%
The first row of each table measures the straight-up Verducci effect. If you increased your workload by more than 30 innings in the preceding season and are under the age of 26, then we should expect to see fewer innings pitched and a worse (higher) ERA. However, it turns out that this is not the case. In terms of workload, Verducci Effect pitchers actually increased their innings pitched by 19 to 22 innings. In terms of performance quality, pitcher ERAs declined (that is, improved) by an average of 0.1 runs; however, the effect was not statistically significant, which means it’s probably best to say there is no effect.
The last two columns of the tables represent attempts to quantify the Verducci effect as a continuous phenomenon; that is, the more your workload increases the stronger the effect ought to be. To do this I used three variables: the change in workload (measured by innings pitched), an indicator of whether or not the player was under 26, and an interaction term that multiplies the change in workload times the under 26 indicator. The interaction term (listed on the second row of each table) captures any difference in performance from workload by Verducci Effect pitchers. For innings pitched, Verducci Effect pitchers increased the number of innings pitched by about 7 innings for every 30 innings pitched. In addition, being under 26 increased expected innings by 15 innings, while the change in workload tended to lower innings pitched for all pitchers by about 8 innings. Thus, the net result for an under 26 pitcher increasing his workload by 30 innings is an increase of about 7 innings pitched. Note these results are all statistically significant, but this was not the case for ERA.
So, where are we? The results do not bode well for the Verducci Effect. Pitchers who were predicted to decline actually improved. One potential problem with this study is that pitchers who pitched no innings at all in a season were not included; however, I think this bias is slight since this number is small, as even injured pitchers normally get in a few innings every season. Frankly, this is about as quick and dirty as you can get with a test; but, it’s a starting point, and I’d like to see others examine the effect further. While I appreciate the intuition behind the Verducci Effect, I don’t see much evidence for it.
Tim Hudson’s Hometown Discount
The long-awaited announcement of Tim Hudson’s new contract with the Braves has finally come. The terms guarantee Hudson $9 million a year over the next three seasons, plus a $1 million buyout of a team option for a fourth year. The fourth-year option also pays out $9 million, so the total value that could be paid out is $36 million over four years. The contract voids a $12 million option for 2010 that the Braves were likely going to buy out for $1 million.
Hudson is an interesting player. He’s ranged from good to dominant. He was really pitching some of his best baseball as a Brave right before his injury. The good news is that he pitched well in his return through 42 innings. With a full offseason to recover, I think there is good reason to believe that he will be back to normal; however, the injury risk may have reduced his value somewhat. I proceed to my valuation with this caveat.
If Hudson pitches as he did in 2007 and 2008 over the course of a full season, then he’ll be worth about $12.5 million per year over the next three seasons. Thus, it appears that Hudson is giving the hometown discount that he promised, which makes this a smart move by Frank Wren and the Braves. This allows the Braves to trade one of their other starters (who will it be?) and still have pitching stability going into the future.
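The size of the discount can be tallied from the contract terms and the valuation above. This is only a rough Python sketch: it ignores the injury caveat and any discounting of future dollars, and the variable names are mine.

```python
# All figures in millions of dollars, taken from the contract terms
# and the valuation estimate in the post.
ANNUAL_SALARY = 9.0
GUARANTEED_YEARS = 3
OPTION_BUYOUT = 1.0
OPTION_SALARY = 9.0
ESTIMATED_ANNUAL_VALUE = 12.5

# Guaranteed money: three seasons plus the buyout of the fourth-year option.
guaranteed_total = ANNUAL_SALARY * GUARANTEED_YEARS + OPTION_BUYOUT

# Maximum payout: three seasons plus the exercised option (no buyout paid).
maximum_total = ANNUAL_SALARY * GUARANTEED_YEARS + OPTION_SALARY

# Estimated worth over the three guaranteed seasons.
estimated_value = ESTIMATED_ANNUAL_VALUE * GUARANTEED_YEARS

# Gap between estimated worth and salary in each guaranteed season.
annual_discount = ESTIMATED_ANNUAL_VALUE - ANNUAL_SALARY

print(f"Guaranteed: ${guaranteed_total:.1f}M")
print(f"Maximum over four years: ${maximum_total:.1f}M")
print(f"Estimated three-year value: ${estimated_value:.1f}M")
print(f"Discount per season: ${annual_discount:.1f}M")
```

On these numbers the Braves are paying about $3.5 million a season less than the estimated value, before accounting for the risk that the injury lingers.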
If you see Hudson out and about in the Atlanta area, be sure to say “thanks”—but, please, don’t pester him. Or, maybe throw a little support to the Hudson Family Foundation. He wants to be in Atlanta, and he has strengthened his club by doing so. It’s nice to have you on board for the long haul, Tim.
Question of the Day
If the Yankees end up losing the World Series because they can’t get good production out of a starter for the final three games, how will this affect the machismo argument regarding pitcher rest?
Even if the Phillies come up short, Charlie Manuel made the right call to give his pitchers four days of rest. It’s an issue of physiology: the body needs time to recover from strenuous activity.
The Duncan Effect
At Inside Pulse Sports, Pip of Fungoes asks:
it’s time for a comprehensive study of whether there is a “Duncan Effect” on pitchers, like the one that JC Bradbury did on Leo Mazzone. Until then, no one knows for certain what kind of an impact (if any) Duncan has on pitchers.
Well, because you ask so nicely, I’d be happy to oblige.
Actually, it’s easy because I already did the study.
Two years ago, Sports Illustrated asked me to look into the question, and I ran a study similar to the one I did for Leo Mazzone in The Baseball Economist. I looked at how pitchers performed with and without Duncan, controlling for factors such as age, parks, and pitcher quality. I found that Dave Duncan’s pitchers improved their ERAs by about 0.35 runs—yeah, he’s pretty darn good.
If you haven’t seen this before, it’s because the estimate was buried on page 60 of the September 27, 2007 issue of SI in a story about Duncan and his sons. I meant to write about it at the time, but I never got around to it.
UPDATE: Pitchers Hit Eighth posts the link to the SI article in the comments. Thanks!
Does Clutch Pitching Exist?
As a follow-up to my little clutch hitting study, I thought it would be interesting to look at clutch pitching using the same methodology. Though I don’t believe there is good reason to expect clutch performance among hitters, I think it’s plausible that pitchers may have some clutch skill. Pitchers have to regulate their effort throughout the game and often change the way they pitch with runners on base (employing the stretch). These factors leave room for pitchers to perform differently when the stakes of the game change. Pitching better with runners in scoring position (RISP) may not be “clutch” in the Platonic sense of rising to the occasion, but it’s a skill worthy of examination.
I looked at individual RISP plate appearances in 1992 and estimated the impact of past clutch performance controlling for the overall pitcher performance in each area (allowed AVG, OBP, SLG, strikeout rate, walk rate, home run rate), the skill of the batter in each area, and the platoon effect (platoon = 1; 0 = otherwise). I used RISP performance in 1989–1991 to proxy clutch ability—if pitchers have clutch skill, past clutch performance should correlate with present clutch performance.
The table below lists the coefficients (reported as marginal effects) and robust z-statistics of regression estimates in seven performance areas. I used the probit method to estimate binary outcomes (outcome = 1 if an event occurred, and 0 otherwise) of individual plate appearances for hits, on-base (hits + walks + hbp), strikeouts, walks, non-intentional walks, and home runs. I used the negative binomial method to estimate the impact of the variables on the number of total bases resulting from a plate appearance.
            Hit       On Base   TB        K         BB        BB-IBB    HR
Overall     1.1801    0.8815    0.9702    0.8695    0.7657    0.6890    0.5808
            [8.61]    [7.20]    [8.06]    [12.34]   [8.78]    [8.66]    [6.31]
RISP        -0.0239   -0.0022   -0.1192   0.0845    0.1305    0.0797    -0.0383
            [0.29]    [0.03]    [-1.69]   [1.39]    [2.16]    [1.43]    [0.73]
Batter      1.0148    1.0737    0.9816    0.9242    1.1108    0.9558    0.5831
            [13.61]   [17.42]   [15.62]   [25.56]   [23.54]   [22.49]   [16.13]
Platoon     0.0165    0.0332    0.0428    -0.0246   0.0266    0.0039    0.0071
            [2.77]    [5.35]    [4.29]    [5.30]    [6.73]    [2.79]    [1.96]
Obs.        21,096    23,872    21,096    23,872    23,872    23,872    23,329
Method      Probit    Probit    Neg Bin   Probit    Probit    Probit    Probit
Past RISP performance is not a statistically significant predictor of 1992 RISP performance. Walk rate appears to be an exception, with pitchers consistently performing worse with runners on base (and having a z-stat > 2), but the higher probability of walks seems to be caused by the increase in intentional walks issued in the hope of turning a double play. When IBBs are removed, pitcher RISP walk performance loses its statistical significance.
The results do hide one thing: pitchers perform better in RISP than non-RISP situations, except when walks are involved. The table below shows the average of outcomes for all events. All differences are statistically significant.
            No RISP   RISP
Hit         0.255     0.249
TB          0.380     0.368
BB          0.075     0.089
On Base     0.315     0.321
K           0.150     0.157
HR          0.021     0.019
The numbers remove intentional walks; therefore, the worse performance in preventing walks, which also shows up in on-base probability, could represent “intentional unintentional-walks” or pitchers losing control a bit when runners are in scoring position. But if the latter were true, I would expect the numbers to be worse in the other areas. Also, because the numbers above are the percentages of all outcomes, the better numbers in RISP may also reflect better relievers entering the game for such situations.
The main story here is that the regression estimates indicate that, after controlling for several relevant factors, pitchers don’t appear to have any special skill over other pitchers in performing in RISP situations. A pitcher’s overall performance level does a fine job of predicting performance, and knowing past clutch performance doesn’t appear to add useful information.
Rebunking “Wins”
I happened to run across a BTF link to a post at Rays Index that I had to read because of its curious conclusion.
In fact, in the absence of other stats, Wins is a very good, if not great, indicator of a pitcher’s value. So next time you hear somebody say Wins is a crappy way to evaluate a pitcher, throw a drink in their face and then make them read this post.
As someone who would need a towel if readers followed this advice, I believe a response is in order. Now, the author Cork Gaines (“The Professor”) does acknowledge that Wins is not the best statistic to use for evaluating pitchers, but that’s not really news. When is there ever a situation in which anyone has to choose between using Wins and nothing to value a pitcher? After reading the post, I maintain that Wins is a poor statistic to use for valuing pitchers. In fact, the statistical evidence used in the article shows the opposite of what the author thinks it shows.
Gaines uses regression estimates of Wins and Win% on ERA+, finding R2 values of 0.51 and 0.54 to justify the usefulness of Wins.* Those values are indeed statistically significant and reveal a real positive correlation between Wins and run prevention. But more so, they reveal why Wins is such a bad statistic to use for valuing pitching quality. How is showing that good pitchers get more Wins than bad pitchers busting a myth? Greg Maddux didn’t luck his way to 355 Wins, and no one who pooh-poohs Wins thinks his Win total is a result of randomness, unrelated to his ability. It’s the magnitude of the correlation that is important here.
The R2 reveals the percentage of the change in the dependent variable (ERA+ in this example) explained by changes in the independent variable(s) (Wins or Win%). The remainder is due to explanatory factors not included in the model. Now, R2 can be tricky to interpret and it is sensitive to sample size; but, in general, the results indicate that 50% of the difference in ERA+ across pitchers can be explained by differences in Wins. That’s the problem, not evidence to the contrary. The main knock against Wins is that pitchers have control over only one half of the game: half the game is defense (50%) and the other half is offense (50%). An R2 of close to 0.50 confirms rather than debunks this notion.
When choosing performance metrics, it is important to use three criteria:
1) How well does it correlate with output? Wins doesn’t do so badly here: Wins are correlated with run prevention. Still, other metrics of pitcher performance are far superior, and the life-boat circumstances when someone might need Wins to value a pitcher don’t happen. Why bring this up? No one has suggested that Wins and ability are uncorrelated.
2) How well does it measure ability? It measures ability, but it is heavily polluted by outside factors (offense and fielding). This is the criterion used to justify using DIPS over ERA. If you want to know the statistic that most strongly correlates with run prevention for pitchers, it’s ERA by a long shot. It is almost a pure recording of the runs pitchers give up, so of course the correlation will be strong. The problem is that pitchers themselves don’t have much control over a major component of ERA: balls that are put into play. ERA fluctuates significantly from season to season for pitchers because it is so dependent on balls in play. DIPS measures are preferred over ERA because they more accurately capture actual pitcher contributions to run prevention, not because they correlate more strongly with run prevention. Similarly, Wins capture some aspects of pitcher ability, but a huge chunk of the contribution is determined by something beyond pitcher control. And the regression estimates of the explained variance of ERA+ are consistent with Wins reflecting half of what pitchers contribute to generating this metric.
3) How well does it match our intuition as to what matters? This criterion isn’t all that relevant in this case, and is reflected in the analysis in criterion 2. I use this rule in situations where correlations yield counterintuitive values. For example, strikeouts and home-run hitting are positively correlated; however, suggesting that a hitter should strike out more to increase his power would be wrong.
Gaines is right that Wins includes some useful information regarding pitchers, but the pollution from outside factors is so large that, in cases where we see Wins deviate from ERA or DIPS performance expectations, it is Wins that contains the misleading information. There is no reason to use Wins to evaluate pitcher ability. It is neither a very good nor great indicator of a pitcher’s value.
* A footnote to the article states that R2 ranges from -1 to 1 with greater positive (negative) values indicating a stronger correlation. This is incorrect. R2 ranges from 0 to 1. I was curious if the author was using a correlation coefficient R, which does range from -1 to 1 but has a different interpretation in terms of measuring explained variance. However, the graphs and intuition make it look as though the descriptive footnote is incorrect, not the main text of the analysis.
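To make the distinction between R and R2 concrete, here is a small Python sketch on made-up numbers (not Gaines’s data). It fits a one-variable least-squares line, computes R2 as the share of variance explained, and checks that it equals the square of the correlation coefficient even when the correlation is negative. It also backs out the correlation implied by the R2 values of 0.51 and 0.54, assuming single-variable regressions.

```python
import math


def pearson_r(xs, ys):
    """Correlation coefficient R, which ranges from -1 to 1."""
    mean_x, mean_y = sum(xs) / len(xs), sum(ys) / len(ys)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)


def r_squared(xs, ys):
    """R2 from a least-squares fit of y on x, which ranges from 0 to 1."""
    mean_x, mean_y = sum(xs) / len(xs), sum(ys) / len(ys)
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    ss_res = sum((y - (intercept + slope * x)) ** 2 for x, y in zip(xs, ys))
    ss_tot = sum((y - mean_y) ** 2 for y in ys)
    return 1 - ss_res / ss_tot


# Hypothetical, negatively related data (illustration only).
xs = [1.0, 2.0, 3.0, 4.0, 5.0]
ys = [9.8, 8.1, 6.2, 3.9, 2.1]

r = pearson_r(xs, ys)
r2 = r_squared(xs, ys)
print(f"R = {r:.3f}, R2 = {r2:.3f}, R squared by hand = {r * r:.3f}")

# Correlation magnitudes implied by the reported R2 values.
implied_r = [math.sqrt(value) for value in (0.51, 0.54)]
print([round(value, 2) for value in implied_r])
```

The reported R2 values of 0.51 and 0.54 correspond to correlations of roughly 0.71 and 0.73 in absolute value.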
This Is Getting Ridiculous
Kerry Wood was in Cleveland on Wednesday night to take the physical needed before he can finalize a two-year deal with the Indians worth about $20 million.
$10 million a year for Kerry Wood?! And I thought the K-Rod contract was excessive. Kerry Wood was a more valuable pitcher ($5.72 million) than Francisco Rodriguez ($4.62 million) in 2008, but he was limited by injuries during the four previous seasons. Assuming that Wood pitches exactly as he pitched in 2008, I estimate he will be worth approximately $6.5 million per season for the next two years. I reiterate: this assumes that one of the most injury-prone players in the league performs as he did last year. He made $4.2 million pitching for the Cubs last season, and the Cubs didn’t even offer him arbitration. Given his injury history, I think that was probably the right call.
I consider the Indians to be one of the smartest organizations in baseball—in my book I rate Cleveland to be the best-managed franchise in the American League—therefore, this move shocks me even more. What could be going on? I can think of only one explanation: Kerry Wood has been able to demonstrate such good health that teams think he can start. If Kerry Wood can get back to his 2002–2003 form he would be a $14 million/year pitcher over the next two seasons. Maybe his agent has been shopping his potential as a starter.





