## More on Player Aging

Phil Birnbaum has a new theory as to why I’m wrong (I suspect it won’t be his last), and he links to others who think I’ve made the same mistake.

This time, my sample is the problem. The claim is that by choosing players aged 24 to 35 with a minimum of 10 seasons played and 5,000 plate appearances, I have “biased” the estimates, because the sample excludes people with short careers. To demonstrate this, Birnbaum simulates a new world to show that sample choice can affect estimated average peaks. This is irrelevant to what I have done, and shows a serious lack of understanding of the technique I employed. I’m not taking the mean of the sample to calculate a peak; I’m estimating an aging function using a common-yet-sophisticated technique designed to see how changing factors of many units over time affect an outcome. Because of the way the technique works, the sample won’t bias a peak estimate as suggested.

This fact is easily seen in the graphs presented in the paper. Below, I post the aging functions for strikeouts per nine innings and walks per nine innings for pitchers on a single graph. The functions have their peaks (denoted by vertical dotted lines) at the opposite ends of the sample—9 years apart. This is a curious finding that I discuss in the paper.

Why didn’t they center around the middle of the sample, or why weren’t strikeouts biased upwards due to the career requirements? Just because the sample is pared down doesn’t mean that the technique will be biased one way or the other. The people in the sample still age like normal human beings. The technique captures how these individuals age in accordance with the aging process by looking at how players’ performances change. Strikeouts are shown to peak early because the players in the sample strike out fewer batters as they age—the function actually peaks outside the sample range at 23.56. All of this confusion could be cleared up with an introductory econometrics course.

In any case, it’s always possible that estimates might be sensitive to sample selection. These critiques tend to focus on what could be rather than what is. In the paper, I explain the reasoning for my sample choice by highlighting potential selection bias problems. As I said in the comments: “Including a list of players who played 10 years or more allows for the smoothing of random fluctuations over time, because we don’t have to worry about players being dropped in and out of the sample. More importantly, it allows for identifying a career baseline for each player from which we can observe how he progresses. It certainly shouldn’t perform worse than the average-yearly-change method.” I did not make this choice lightly. My cutoffs were chosen because they fit with cutoffs used by other researchers and I tested the models for sensitivity to cutoff decisions.

If you’re not convinced, why don’t we move from the hypothetical to reality? The graph below maps the aging function estimated with low thresholds of 1,000 career plate appearances and 300 plate appearances in a season. No more age range, no career-length limit, and a vastly reduced history of performance. And guess what? Peak age is 29.

JC – Forgive me if you’ve already been asked this question, but what problems do you have with the Delta method? Basically, taking the average change in performance for each player based off of year.

Even in your last graph, you are still introducing selection bias because even 300 plate appearances in each year is still going to weed out the bad hitters, and going to emphasize the guys who have a later decline period.

It’s been accepted that the Delta method is the best way to go, and you haven’t yet convinced me – or apparently, very smart guys like Phil or Tango – that your method is more correct.

Also, could you show the actual data points with the R^2 instead of just the curve?


The study is imperfect. The potential bias that you posit, which I do not think exists, must be balanced against having an adequate sample size for measuring performance. I defend the sample size choice in the paper. If you disagree, I’m sorry.

Sorry.

A scatter plot of many different player-seasons would just be a vast glob of dots with no discernible pattern. That’s why you have to hold constant for individual talent when estimating, and why I reported the function. R^2 is .59.

OK, I think I see part of the reason for our difference. You are assuming that every player ages with the same trajectory, and players vary only in skill and circumstances. If player A happens to peak in HR at 25 (looking empirically at his actual real-life HR trajectory), but player B happens to peak in HR at 35, then you are assuming that both players “really” peaked at 29, and that both varied from that because of random chance.

Am I correct in my interpretation of your model?

Sorry, in the previous post I should have said “peaked at 29.89,” [not 29]. 29.89 is what your study found as the peak age for HR.

It’s not clear to me what your interpretation is, so I’d have to say no. See the paper I sent you for a description of the empirical method employed.

Sorry … you’re right, I was a bit unclear.

I mean the model does not assume that every player has a random trajectory (and therefore peak age) chosen from some distribution. It assumes that there is one fixed trajectory and peak age (because beta1 and beta2 in equation 1 are not indexed by player).

I was mistakenly thinking that every player can have his own trajectory and therefore his own peak age, and you were estimating the mean of those peak ages. But it looks like your model assumes that peak age is the same for all players, and you’re finding an estimate of that single peak age.
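The structure described here, one shared age curve plus a separate baseline for each player, can be illustrated with a small simulation. The sketch below is a toy under stated assumptions, not the paper's estimator or data: it assumes a fixed-effects (within-player demeaning) regression, and every name and number in it (`simulate_panel`, `estimate_peak`, `true_peak=29.0`, the talent and noise spreads) is hypothetical.

```python
import numpy as np


def simulate_panel(n_players=300, ages=np.arange(24, 36), true_peak=29.0,
                   curvature=-0.5, noise_sd=2.0, seed=0):
    """Simulate players who share one quadratic age curve but differ in baseline talent.

    All parameter values are made up for illustration.
    """
    rng = np.random.default_rng(seed)
    talent = rng.normal(0.0, 10.0, size=n_players)       # player-specific baseline
    player = np.repeat(np.arange(n_players), len(ages))  # player id for each row
    age = np.tile(ages, n_players).astype(float)         # age for each row
    perf = (talent[player]
            + curvature * (age - true_peak) ** 2          # common aging curve
            + rng.normal(0.0, noise_sd, size=age.size))   # season-to-season noise
    return player, age, perf


def within_demean(x, player):
    """Subtract each player's own mean, which removes his fixed baseline."""
    means = np.bincount(player, weights=x) / np.bincount(player)
    return x - means[player]


def estimate_peak(player, age, perf):
    """Fit perf on age and age^2 after demeaning within player; return -b1 / (2 * b2)."""
    X = np.column_stack([within_demean(age, player),
                         within_demean(age ** 2, player)])
    y = within_demean(perf, player)
    (b1, b2), *_ = np.linalg.lstsq(X, y, rcond=None)
    return -b1 / (2.0 * b2)


player, age, perf = simulate_panel()
print(round(estimate_peak(player, age, perf), 2))
```

Because `b1` and `b2` are not indexed by player, the fit returns a single peak age for everyone; a model with player-specific trajectories would need player-specific slope terms, which this sketch deliberately leaves out.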

Actually, I think your estimate is perfectly consistent with a peak of 27.

I’ll illustrate for your LWTS estimate of 29.41 years (first column of Table III).

Holding the estimate for age (beta1) constant, and bumping the coefficient for age^2 (beta2) up by 2 standard errors (as obtained from Table III), you get a peak age of about 25. If you bump the coefficient *down* by 2 standard errors, your new estimate for peak age is about 36.

(Using -beta1/(2 * beta2) as stated in the text.)

So not even considering a confidence interval for beta1, using the 95% confidence interval for beta2 gives an interval of (25, 36).

That doesn’t seem like it contradicts the other studies putting peak age at 27.
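For reference, the peak formula quoted above follows from setting the slope of a quadratic age curve to zero. The block below assumes the age terms enter as $\beta_1\,\text{age} + \beta_2\,\text{age}^2$ and uses $\beta_0$ as a stand-in for everything in the model that does not involve age; that shorthand is an assumption, not the paper's notation.

$$
\begin{aligned}
y &= \beta_0 + \beta_1\,\text{age} + \beta_2\,\text{age}^2 \\
\frac{dy}{d\,\text{age}} &= \beta_1 + 2\beta_2\,\text{age} = 0
\quad\Longrightarrow\quad
\text{age}^{*} = -\frac{\beta_1}{2\beta_2}, \qquad \beta_2 < 0 .
\end{aligned}
$$

Since $\text{age}^{*}$ depends on the ratio of the two coefficients, its uncertainty depends on how the two estimates move together, not only on each standard error taken separately.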

That is not an appropriate way to manipulate the estimates. I suggest consulting an econometrics textbook to gain a better understanding of multiple regression analysis. A Guide to Econometrics by Kennedy is a good cheap option that is not a traditional textbook. Mostly Harmless Econometrics looks to be another good introduction. I haven’t read it, but I have heard good things about it.

Hi, JC,

Fair enough … those confidence intervals look too wide. Is there a specific section of either (or any) textbook that talks about calculating the standard error of a function of two of the coefficients?

Were you able to calculate a standard error for peak age yourself? It seems like a wild goose chase hunting down textbooks if there’s no obvious way to answer the question.

> “That is not an appropriate way to manipulate the estimates.”

You’re right. Not even close. I should have realized that.

If you do know a way to calculate the SE of the peak age, and you have a reference, I’ll look it up.
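One standard approximation for the standard error of a function of two coefficients is the statistical delta method (unrelated to the year-to-year "Delta method" of measuring aging mentioned earlier in the thread). The sketch below is illustrative only: the coefficients, standard errors, and correlations are hypothetical values, not numbers from Table III, and the function name `peak_age_and_se` is made up. It needs the covariance between the two estimates, which a table of coefficients and standard errors does not provide on its own.

```python
import numpy as np


def peak_age_and_se(b1, b2, cov):
    """Peak of b1*age + b2*age^2 and its first-order (delta-method) standard error.

    cov is the 2x2 covariance matrix of the estimates (b1, b2).
    """
    peak = -b1 / (2.0 * b2)
    # Gradient of -b1 / (2 * b2) with respect to (b1, b2).
    grad = np.array([-1.0 / (2.0 * b2), b1 / (2.0 * b2 ** 2)])
    variance = grad @ np.asarray(cov, dtype=float) @ grad
    return peak, float(np.sqrt(variance))


# Hypothetical estimates and standard errors, chosen only for illustration.
b1, b2 = 5.9, -0.1
se1, se2 = 1.0, 0.017

for rho in (0.0, -0.99):  # assumed correlation between the two estimates
    cov = [[se1 ** 2, rho * se1 * se2],
           [rho * se1 * se2, se2 ** 2]]
    peak, se = peak_age_and_se(b1, b2, cov)
    print(f"rho={rho:+.2f}  peak={peak:.2f}  se={se:.2f}")
```

In regressions that include both age and age^2, the two coefficient estimates are typically strongly negatively correlated, so treating them as independent, or moving one while holding the other fixed, can greatly overstate how far the peak might shift.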

The coefficient estimates for age and age^2 each have their own standard errors and confidence intervals. The problem for finding the range of the peak is that you can’t move one without moving the other; it’s like hitting a baseball while standing on a skateboard. The estimation technique has provided the “best” estimates to minimize prediction error, which I use to estimate the peak of the function. The confidence interval bounds will lie above and below the best-fit curve, and will be similarly shaped. Thus, the peak will be similar. Think up-and-down with the confidence intervals, not left-to-right.

Calculating the exact confidence interval around the maximum or minimum from quadratic regression estimates is difficult. I don’t know of a text that discusses this specific issue. If you would like to take a crack at it, here’s a paper that discusses two methods. If you want a rough measure of how the peak might differ, you could take the extreme bounds of each variable’s confidence interval (positive for age and negative for age^2, and vice versa); the peak age estimates then range from 29.28 to 29.5. But you can’t hold one constant and push the other to its extreme, because that would generate a shape far different from the best estimate.

Makes sense, thanks!

Well, if it’s been accepted, then clearly it must be superior. No explanation, reasoning or facts necessary once a group of “very smart guys” has been designated to think for the rest of us.

The last curve in your post, the one with lower minimums … is that LWTS? And are you saying the inclusion of the new players caused the peak to drop from 29.4 to 29.0?

LWTS. No drop, I am just rounding.