I was curious about the peak age of performance for baseball players. A Google search revealed a few studies, but they didn’t handle the question the way I would, so I thought I would try it my way. I am not saying those studies are bad; I plan to read several over the weekend.
Using the Lahman database, I built a sample of every MLB player who had at least 300 at-bats in a season from 1980 to 2003. If a player failed to get 300 ABs in a season, he was dropped from the analysis for that season and the season that followed (because I am using lags), but he returned to the sample once he again had 300 ABs. I picked this time period because I am not interested, at the moment, in aging patterns from earlier eras. Using Stata, I estimated the following equation with the xtregar command (this is basically an OLS estimate with a correction for first-order autocorrelation). The unit of observation is a player in a season.
OPS+ = B1 (Age) + B2 (Age^2) + B3 (Lag of OPS+) + B4 (League OPS for that year) + V (player constants) + e
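The lagged OPS+ term is why a missed season costs a player two observations: the season itself and the one after it, which has no usable lag. Here is a minimal sketch of that sample rule in Python rather than Stata. The rows are made-up illustrations, not Lahman data, and the names build_sample and MIN_AB are hypothetical. It also assumes the lag must come from a qualifying season inside the data, so a player's first season in the data drops out too.

```python
# Hypothetical (player, season, at_bats, ops_plus) rows; NOT real Lahman data.
rows = [
    ("a", 1990, 450, 110.0),  # first season in the data: no lag, dropped
    ("a", 1991, 250, 95.0),   # under 300 AB: dropped
    ("a", 1992, 500, 120.0),  # qualifies, but prior season did not: dropped
    ("a", 1993, 520, 125.0),  # kept, lag = 120.0
    ("b", 1990, 600, 100.0),  # first season in the data: no lag, dropped
    ("b", 1991, 610, 105.0),  # kept, lag = 100.0
]

MIN_AB = 300  # at-bat threshold described in the post


def build_sample(rows, min_ab=MIN_AB):
    """Return (player, season, ops_plus, lag_ops_plus) for usable player-seasons."""
    # Keep only player-seasons that meet the at-bat threshold.
    qualified = {(p, y): ops for p, y, ab, ops in rows if ab >= min_ab}
    sample = []
    for (player, season), ops in sorted(qualified.items()):
        # The lag must come from a qualifying season one year earlier.
        lag = qualified.get((player, season - 1))
        if lag is not None:
            sample.append((player, season, ops, lag))
    return sample


sample = build_sample(rows)
for obs in sample:
    print(obs)
```

With these made-up rows, only player "a" in 1993 and player "b" in 1991 survive, which matches the rule that a sub-300 season removes both that season and the next.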
OPS+ is simply the OPS of a player relative to the average OPS of the league in that year. This measure is NOT park-adjusted. V is a vector of fixed effects to control for individual player attributes. I’ll spare you the numerical results for the moment, but I will tell you that the peak age of OPS+ for the sample is about 29. Plugging in the average numbers for the Lag and League OPS variables, the table below plots the estimated OPS+ by age.
Interesting. The general wisdom on this stuff is that the peak age is closer to 27. I will have to think more on it.
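For anyone wondering how a peak age falls out of an equation like this: holding the other terms fixed (as when plugging in averages), the age profile is B1 (Age) + B2 (Age^2), which tops out where its slope B1 + 2 B2 (Age) equals zero, that is, at Age = -B1 / (2 B2), provided B2 is negative. A small sketch of that arithmetic follows. The function name peak_age is hypothetical, and the coefficients are invented purely so the numbers come out cleanly; they are not the estimates from this analysis.

```python
def peak_age(b_age, b_age_sq):
    """Age that maximizes b_age * age + b_age_sq * age**2."""
    if b_age_sq >= 0:
        # With a non-negative Age^2 coefficient the curve has no maximum.
        raise ValueError("a peak requires a negative coefficient on Age^2")
    return -b_age / (2.0 * b_age_sq)


# Hypothetical coefficients for illustration only; NOT the post's estimates.
B1, B2 = 14.5, -0.25
print(peak_age(B1, B2))  # prints 29.0
```

The same formula applies to whatever B1 and B2 the regression actually returns; only the two age coefficients matter for locating the peak.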
Update: Here are the coefficient estimates. All are statistically significant at the 1% level except League OPS, which is significant at just about the 5% level. I also report a second specification with League OPS dropped.
There is still more to come.