SABR 40
I’ll be attending SABR 40 for the next few days. If you would like to talk, please approach me and introduce yourself. This is my first SABR convention.
I am giving two research presentations, one oral and one poster.
Resting the pitcher: How useful are pitch counts and days of rest? (with Sean Forman): Thursday, August 5, 2:30 – 2:55pm, Georgia 7,8,9
Many individuals believe that limiting pitch counts and increasing days of rest can improve performance and reduce injuries. Though the belief that overuse can hamper pitchers is widespread, there exists little evidence that adjusting pitch counts and rest has much effect on pitcher performance. In this study, Bradbury and Forman use newly available game-level pitch count data from 1988 to 2009 to evaluate the impact of pitch counts and rest days on future performance. They discuss their employment of linear and nonlinear multiple regression analysis techniques to estimate the impact of pitch counts (in recent games and cumulatively over a season) and days of rest on pitcher performances while controlling for the effects of other factors.
Here are the presentation slides (pdf). Here is a draft of the paper (revised 10/18/2010).
Putting a dollar sign on the muscle: What are baseball players worth?: I’ll be by the poster on Thursday from 4 – 6pm.
In the 1970s, using team revenue and player performance data, Gerald Scully employed the standard marginal revenue product framework frequently used by labor economists to estimate the financial contributions of players. Bradbury’s study employs new information about baseball’s economic structure and sabermetric performance metrics in an updated Scully framework to estimate the dollar value of current major league baseball players. He compares player salaries and estimated worth by service class, presents a method for projecting player worth over the duration of long-term contracts, identifies some of baseball’s best and worst deals, and ranks teams according to their abilities to manage their budgets.
For more see my forthcoming book Hot Stove Economics.
13 Responses to “SABR 40”
Trackbacks/Pingbacks

[...] JC did the oral presentation of the data he investigated with Sean Forman of baseball-reference.com. JC’s own site is sabernomics.com. He has put the slides and a draft of the paper up on his site if you’d like to see more detail than I can provide in this liveblogging recap. Here. [...]
Great stuff, JC. Wish I could get off work to go see your presentations. You need to correct the URL on the pitch count presentation link, though… I corrected the misnamed sabernomics and got it.
Thanks Paul,
Fixed.
Hi, JC (and Sean),
Wouldn’t expected ERA *have* to go up after a high-pitch start if you’re controlling for season ERA? A high-pitch start is probably above average for the season. Therefore, the rest of the starts MUST be *below* average for that season. So a high-pitch start would be followed by a slightly worse start (and vice versa) even if the number of pitches doesn’t do anything at all to the pitcher’s arm.
Taking it the other way: suppose a pitcher has an ERA of 3.00 over 99 innings (33 ER). In one start, he throws only 40 pitches and gives up 6 runs in 2 innings. His ERA over the rest of the season would be 27 ER in 97 innings, for an ERA of 2.51. So, you would definitely expect him to be better than his season 3.00 in his next start, even if the low number of pitches didn’t otherwise affect his arm.
If you replace the “ERA this season” term in the regression by “ERA this season in all other games than this one,” do the results still remain?
P.S. Optional but even better, replace “ERA this season” by “Average ERA by game, in all other games than this one.” That should work better since you’re measuring average game ERA, which is higher than regular ERA since it’s weighted by game and not IP (as evidenced by table 2 where the mean game ERA is 5.64).
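The arithmetic in the example above can be checked with a short sketch. The function names are hypothetical, and innings are assumed to be plain decimal numbers (not the "6.1 means 6⅓" box-score notation).

```python
def era(earned_runs, innings):
    """Earned run average: earned runs allowed per nine innings."""
    return 9.0 * earned_runs / innings


def era_excluding_game(season_er, season_ip, game_er, game_ip):
    """Season ERA recomputed with one start removed (hypothetical helper)."""
    return era(season_er - game_er, season_ip - game_ip)


# Numbers from the comment: 33 ER in 99 IP, one start of 6 ER in 2 IP.
season = era(33, 99)
rest_of_season = era_excluding_game(33, 99, 6, 2)  # 27 ER in 97 IP
print(f"{season:.2f} {rest_of_season:.2f}")  # prints "3.00 2.51"
```

Removing the one bad start lowers the reference ERA by about half a run, which is the gap the "all other games" version of the control would account for.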
No, it certainly doesn’t *have* to go up. I think it would be quite a stretch to expect your reasoning to drive the results. I ran lots of different models in lots of different ways, and the estimates were stable.
JC and Sean. I think that Phil is right in his assertion that if you compare the “next game” to the rest of the season, including the high or low pitch game(s), that it is a biased sample, assuming that the high or low pitch games are good or bad games (which they will be, on the average), and the “next game” will regress to a lower (if following a bad game) or higher (if following a good game) ERA.
The correct way, I think, is either not to include the biased (high or low pitch) game or to somehow control for the fact that that game is a high or low ERA game as well.
Now, how much of an effect it has, I don’t know. Since you found a relatively small effect from high or low one-game pitch counts (.007 in ERA per “extra” pitch thrown), I would think that it is possible that the bias could account for that entire effect.
In your response to Phil, are you saying that he is incorrect in his assertion or that the effect of the bias is too small to make any difference in your conclusions – IOW, the effect that Phil describes is substantially less than .007 runs per extra pitch?
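A toy simulation can illustrate the kind of bias being discussed here. This is only a sketch under invented assumptions: the talent distribution, the game-to-game noise, and the rule tying pitch counts to how well a start went are all hypothetical, the regression is a plain linear OLS rather than the fractional polynomial models used in the paper, and none of the numbers come from the study's data. By construction, pitch counts have no effect at all on the next start.

```python
import random


def partial_slope(y, x, control):
    """OLS slope on x in the regression y ~ 1 + x + control."""
    n = len(y)
    my, mx, mc = sum(y) / n, sum(x) / n, sum(control) / n
    sxx = sum((a - mx) ** 2 for a in x)
    scc = sum((c - mc) ** 2 for c in control)
    sxc = sum((a - mx) * (c - mc) for a, c in zip(x, control))
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    scy = sum((c - mc) * (b - my) for c, b in zip(control, y))
    return (sxy * scc - scy * sxc) / (sxx * scc - sxc ** 2)


def simulate(n_pitchers=5000, starts=10, seed=0):
    """Return (slope with full-season ERA, slope with ERA excluding the game)."""
    rng = random.Random(seed)
    next_era, pitches, era_all, era_loo = [], [], [], []
    for _ in range(n_pitchers):
        talent = rng.gauss(4.5, 1.0)  # assumed true ability, ERA scale
        games = [talent + rng.gauss(0.0, 3.0) for _ in range(starts)]
        # Assumption: a better start (lower game ERA) means more pitches thrown.
        counts = [100 - 5 * (g - talent) + rng.gauss(0.0, 10.0) for g in games]
        season = sum(games) / starts  # average game ERA, all starts
        for g in range(starts - 1):
            next_era.append(games[g + 1])  # outcome: the following start
            pitches.append(counts[g])
            era_all.append(season)
            # Same average with the game of analysis removed.
            era_loo.append((season * starts - games[g]) / (starts - 1))
    return (partial_slope(next_era, pitches, era_all),
            partial_slope(next_era, pitches, era_loo))


biased, loo = simulate()
print(f"full-season control: {biased:.4f}, excluding game: {loo:.4f}")
```

In this setup the slope on pitches is positive in expectation when the control includes the game of analysis and is zero in expectation when that game is excluded. How large the gap is depends entirely on the made-up parameters, so it says nothing about the size of any bias in the actual estimates.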
Estimating the effect using ERA excluding the game of analysis is doable but problematic. The average for the season is included to proxy for the ability of the pitcher in that season. Not holding it constant for a pitcher across all the games in a season introduces unwanted variance to the model: the estimation then accounts for fluctuation in a factor we want to remain constant, which may dilute the estimated effects of other factors. But, putting that aside, when I estimate the regression with a seasonal ERA taking the game of analysis out, the estimated effect is approximately 0.005. This is less than the model’s nonlinear approximation at the 101st pitch, which is 0.007. However, a difference between the two estimated models is that the new model is linear while the old is not. When I estimate the old as linear (that is, including seasonal ERA without taking out the game of analysis), the estimate is 0.005.
One method of avoiding the potential bias in the estimate is to estimate ERA outside of the model using an instrumental variable technique. Unfortunately, a method for instrumental variable multiple fractional polynomial estimation hasn’t been developed. One of the important estimation approaches to answering the questions is to not impose a shape on the estimated relationship with an MFP estimation, so that pitches and rest days can be measured nonlinearly. However, because the estimate is virtually linear for most models, I estimated an IVMFP model that proxied ERA in a season from past ERA and age. After doing this, the estimated effect of pitches thrown is 0.009.
We could throw variables in and out all day long, choose different estimators, parse the sample, etc. The point of this project is to generate a rough estimate of the impact of pitches thrown on performance. This impact will surely differ across pitchers and situations; whether the impact is 0.005 or 0.009 is of little practical relevance. It is more of a general guide that indicates that pitching less improves future performance and pitching more diminishes future performance.
I wasn’t expecting so little change in the results when you adjusted the ERA. When I adjusted the ERA in my own regression, I got the coefficient for pitches actually changing sign.
I’ll check my work again just to make sure.
To confirm: when you say you removed the “game of analysis” from the ERA, you mean the high-pitch game and not the subsequent game, right?
JC,
When I look at this from 2005-2009 (with ERA adjusted as Phil suggests) I cannot disprove the null hypothesis that pitch counts have no effect on subsequent performance. Is it possible for you to rerun the results with this smaller (but arguably most relevant for today’s game) dataset and see if your results still hold?
Is this material in the new book? If so, are you able to revise that content before publication?
It’s hard to comment without seeing your results. If you’d like to share, send me an email.
Following up … I verified a couple of things, and, as far as I can tell, everything checks out. Not sure why I got different results than you did.
Phil,
I will reassess what I have done. This is still an ongoing research project. I may not be able to satisfy you, but I want to be sure that the ERA issue is not clouding the estimates. So, for now, consider the above results preliminary. I’ll provide an updated version of the paper when it’s done.