Was Jose Canseco the Johnny Appleseed of Steroids?

Last week, I became aware of a study by economists Eric Gould and Todd Kaplan that evaluates the impact of Jose Canseco on his teammates. They examine the belief that Canseco distributed his knowledge about steroids throughout baseball by introducing many of his teammates to performance-enhancing drugs. If this was the case, the authors hypothesize that he ought to have left a trail of improved performance among teammates in his wake.

The authors look at the careers of Canseco’s teammates to investigate this claim. Their method is to examine how well each player performs while a teammate of Canseco and afterward, relative to the years preceding his involvement with Canseco. The idea is somewhat similar to what I did with my analysis of Leo Mazzone’s impact on pitchers (see Chapter 5 of my book).
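To make the setup concrete, here is a minimal sketch of how the two key indicator variables might be built from a player-season panel. The file name, column names, and the abbreviated list of Canseco’s stints are my own illustration, not the authors’ actual data or code.

```python
import pandas as pd

# Hypothetical player-season panel: one row per player per season.
seasons = pd.read_csv("player_seasons.csv")  # player_id, year, team, AB, HR, ...

# (team, year) pairs in which Canseco appeared -- abbreviated here.
canseco_stints = {("OAK", 1985), ("OAK", 1986), ("TEX", 1992), ("CHW", 2001)}

# 1 if the player shared a team with Canseco that season.
seasons["with_canseco"] = [
    (t, y) in canseco_stints
    for t, y in zip(seasons["team"], seasons["year"])
]

# Last season each player spent with Canseco (NaN if never a teammate).
last_with = seasons.loc[seasons["with_canseco"]].groupby("player_id")["year"].max()
seasons["last_with"] = seasons["player_id"].map(last_with)

# 1 for every season after a player's final year with Canseco;
# comparisons against NaN are False, so non-teammates get 0.
seasons["after_canseco"] = (seasons["year"] > seasons["last_with"]).astype(int)
seasons["with_canseco"] = seasons["with_canseco"].astype(int)
```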

After reading the study, I am not convinced by the authors’ conclusions. It’s not just one thing, but a collection of issues that form my opinion. I have problems with both the study’s design and the interpretation of the reported results. My disagreement does not mean that the effect does not exist, only that I do not see a pattern consistent with Canseco spreading steroids to his teammates.

First, I want to start with the sample. The authors look at players from 1970–2003. I find this an odd range of seasons to select. Canseco’s career spans 1985–2001. Why start a decade and a half before Canseco enters the league and stop two years after he exits? The asymmetry bothers me largely because the run-scoring environment preceding Canseco was much lower than it was during the latter part of his career. But even without this, it is a strange choice to make. I can only guess that there is some teammate of Canseco’s whose career extends back this far, but I still don’t agree with the choice. And why not extend the sample to the present?

Next, the authors set the cutoffs for examining player performance at 50 at-bats for hitters and 10 games for pitchers. These minimums are far too low even when stats are normalized for playing time, but the impact is much worse when looking at absolute statistics like total home runs, which the authors do. For pitchers, whom I will not examine here, a 10-game minimum can include pitchers who threw only a handful of innings.

The authors also make a strange choice to break hitters into two classes: power and skilled players. The idea is that we might see different effects on the different styles of play. I don’t agree with this, but that is not the weird part. The way they differentiate power and skilled players is by position played, which is weird but moderately defensible. The power positions are first base, designated hitter, outfield, and catcher. The skilled positions are second base, shortstop, and third base. And it becomes clear that the authors are not all that familiar with baseball. Catcher is a “power” position? Third base is a skill position? I suspect that the catcher and shortstop positions produce the least offense of all the positions. Sure, you can point to a power-hitting catcher like Mike Piazza, and you can also point to a punchless first baseman like Doug Mientkiewicz; but in general, catcher and first base are at opposite ends of defensive skill with very different offensive expectations. Center field is also a defensive position that should not be lumped in with the corner positions. This highlights the problem of assigning power potential by position. And it’s not so much the way the sample is split, which I don’t like, but the fact that it is being split at all that makes me suspicious.

The choice of dependent variables is also a bit strange. While the authors are mainly looking for changes in power, they pick only a few metrics that measure power: HR, SLG, and HR/AB. The other statistics include AVG, RBI, K, BB, IBB, at-bats, fielding percentage, errors, and steals. I have no problem with AVG. RBI is completely useless, since it is largely dependent on teammates. K, BB, and IBB are chosen because they correlate with home run hitting; but performance in these areas is also correlated with other things, such as plate discipline, and the authors are already looking at home runs. This just adds columns to the regression table; the space would have been better used on robustness checks of the sample and control variables. I would have liked to have seen isolated power (SLG–AVG), HR/H, OBP, and OPS.
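For concreteness, here is how those measures are computed from standard counting stats; the column names (H, 2B, 3B, HR, BB, HBP, SF, AB) are assumptions about the data layout, not variables from the study.

```python
# Standard derived power measures from counting stats.
def add_power_metrics(df):
    df = df.copy()
    df["AVG"] = df["H"] / df["AB"]
    total_bases = df["H"] + df["2B"] + 2 * df["3B"] + 3 * df["HR"]
    df["SLG"] = total_bases / df["AB"]
    df["ISO"] = df["SLG"] - df["AVG"]   # isolated power (SLG - AVG)
    df["HR_per_H"] = df["HR"] / df["H"]
    df["OBP"] = (df["H"] + df["BB"] + df["HBP"]) / (
        df["AB"] + df["BB"] + df["HBP"] + df["SF"]
    )
    df["OPS"] = df["OBP"] + df["SLG"]
    return df  # real data needs guards against zero denominators
```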

As for the control variables, many of the choices are not intuitive. The controls include the batting average of the division (subtracting out own-team performance), the manager’s lifetime winning percentage, the batting park factor, years of experience (listed as a continuous variable in the text, but reported as a matrix of dummies in the regression tables), year effects, and dummies for each division. The equation is also estimated with fixed effects to control for individual player attributes.

I wouldn’t have chosen some of these same variables, but I don’t think they make much difference. However, I am perplexed by the inclusion of the manager’s winning percentage and the division dummies. I don’t see any obvious potential bias from the quality of the manager. Managers with players who perform better will have higher winning percentages, so a positive correlation is to be expected, but the causality is difficult to determine. In any event, managerial dummies are probably the better choice. Still, this isn’t a huge issue.

The division dummies make no sense. The divisions changed their compositions at several points during the sample—the most extreme change occurs when a Central Division was added to both leagues in 1994—and there are no common rules or styles of play that are really unique to any division. If there were such an effect, the batting average of the division and the year effects should catch it. It would have made more sense to include league dummies, because of the significant differences in play between the leagues after the introduction of the DH in 1973. In any event, the authors state that the control variables do not alter the results. I would have liked to see some results with different controls.
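For readers who want the mechanics, the specification amounts to OLS on the player-season panel with player fixed effects. Here is a rough sketch in Python with statsmodels, continuing the earlier example and assuming the control columns have already been merged in; every variable name is a placeholder, and this is my rendering of the setup, not the authors’ code.

```python
import statsmodels.formula.api as smf

# Player fixed effects enter through C(player_id); with thousands of
# players this is slow but workable.
sub = seasons[seasons["AB"] >= 50].dropna(
    subset=["HR", "with_canseco", "after_canseco"]
)
formula = (
    "HR ~ with_canseco + after_canseco + div_avg + mgr_win_pct"
    " + park_factor + C(experience) + C(year) + C(division) + C(player_id)"
)
fe_model = smf.ols(formula, data=sub)
print(fe_model.fit().summary())
```

Swapping C(division) for C(league) gives the variant I estimate below.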

Now, to the variable(s) of interest. When I initially looked at the study, I flipped to the regression tables first and noticed that there did not appear to be a “Canseco effect,” because the estimate on playing with Canseco was not statistically significant. But that is not what the authors use to quantify Canseco’s impact; we are supposed to look at a second variable that identifies the seasons after playing with Canseco. The intuition is that “even if he did learn steroids from Canseco, we do not know when he learned about it during his time with Canseco, but we can be sure that he already acquired the knowledge after playing with Canseco” (p. 10). I just don’t buy this. First, I understand that it might take a while for the effect to kick in, but this should still manifest itself in the “played with” variable, especially because many players played with Canseco for multiple seasons. At best this story makes sense only for guys who might have played one season with Canseco (more on this below). Second, anabolic steroids work quickly, so it’s unlikely that there would be a delayed effect.

After reading the paper, I came to the conclusion that the results are probably fragile. So, I designed a similar, but not identical, dataset. I did almost everything the authors did, except I did not break the sample into power and skilled players, and I included league dummies instead of division dummies, because I feel this is a superior choice. I also kicked out some partial seasons when guys switched teams to make life easier in developing the dataset. Thus, what I am doing is “replication” in the sense of looking for a similar result in the data, rather than trying to recreate the previous estimates. If the result is real, then I should find something similar. Here is what I found looking at raw home run totals (control variable estimates not reported).

                   HR         HR        HR/AB       HR/AB      AR(1)
                   50 AB      200 AB    50 AB       200 AB     corrected

With Canseco      -0.297     -0.199    -1.28E-03   -9.39E-04  -0.449
                  [0.66]     [0.35]    [1.41]      [0.93]     [0.87]
After Canseco      0.667      0.737     3.49E-04    6.28E-04  -0.204
                  [1.58]     [1.34]    [0.41]      [0.65]     [0.34]

Observations      15,644     9,234     15,644      9,234      12,759
Players           2,885      1,717     2,885       1,717      2,265
R-squared         0.13       0.14      0.09        0.13       0.08

Absolute value of t statistics in brackets

The coefficient for playing with Canseco is negative and insignificant, and the after-Canseco coefficient is positive with a p-value of 0.12, which is above the standard (0.05) and lenient (0.1) thresholds for statistical significance. That is the best that I could get. When I raise the at-bat minimum to the more appropriate 200, normalize home runs by at-bats, or do both, “played with” is negative and never significant, and “after’s” p-value is never as low as it was in the specification that most closely resembles the study. Another potential problem that I encountered was serial correlation in the data. This is sometimes difficult to detect, and it is possible that it is a problem unique to my sample. However, when I correct for the problem, both Canseco variables consistently have high, insignificant p-values. So, though the authors find some evidence of an effect in the after variable in their sample, the finding appears not to be all that robust.
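For anyone checking this on their own data, one simple guard, continuing the sketch above, is to cluster the standard errors by player. This is not the AR(1) estimator behind the last column of the table, only a related remedy for within-player correlation, and it adjusts the inference rather than the coefficients.

```python
# Cluster-robust standard errors by player; assumes no missing values
# remain in the regression columns so the groups align with the rows used.
clustered = fe_model.fit(
    cov_type="cluster", cov_kwds={"groups": sub["player_id"]}
)
print(clustered.summary())
```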

The one thing that bothers me most about this study is that we have to interpret why the “after Canseco” variable is important while the “during” variable is not. And I think the authors’ story really only applies to players who were with Canseco for one season. So, I ran some regressions using players who played with Canseco for only one year.

                  One-year   One-year,
                             10+ career

With Canseco      -2.656     -3.450
                  [3.02]**   [3.17]**
After Canseco     -2.562     -3.027
                  [2.84]**   [2.95]**

Observations      1,200      940
Players           186        100
R-squared         0.18       0.23

Absolute value of t statistics in brackets
* significant at 5%; ** significant at 1%

The effects of during and after playing with Canseco are strongly negative, about 2.5 fewer home runs. However, if a player spent only one year with Canseco, it could reflect that these players were not very good and were on their way out of the league. So, I limited the sample to players with careers of 10 or more seasons; the result is a decline of about 3 home runs both during and after.
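Constructing these subsamples is straightforward; a sketch, again with my placeholder names:

```python
# Players whose overlap with Canseco lasted exactly one season.
overlap = seasons.groupby("player_id")["with_canseco"].sum()
one_year_ids = overlap[overlap == 1].index

# Of those, players with careers of ten or more seasons.
career = seasons.groupby("player_id")["year"].nunique()
long_ids = career[career >= 10].index

one_year = seasons[seasons["player_id"].isin(one_year_ids)]
one_year_long = one_year[one_year["player_id"].isin(long_ids)]
```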

My point in offering this “replication” isn’t so much to say that my specifications are superior. I just want to show that the findings do not appear to be robust. To concur with the conclusions presented in the study, you have to interpret the findings in a way that I do not believe is correct. Upon further examination, I believe the significant effect on home runs after playing with Canseco identified in the Gould and Kaplan study is a product of spurious correlation, and thus it tells us little about Canseco’s impact on disseminating steroids throughout baseball.

One Response to “Was Jose Canseco the Johnny Appleseed of Steroids?”

  1. Zach says:

    Let me preface my comment by admitting that I have not read the study. While I agree with your critique of the study (based on your comments alone), I think that your replication conclusions may provide evidence that steroids, in fact, don’t improve performance rather than showing that Canseco may not have been spreading steroids around MLB.

    If Canseco had been the Johnny Appleseed of steroids but those drugs don’t improve performance (as many experts will argue, especially with regards to HGH), then your results would look as they do. So, the authors’ conclusions provide evidence for both hypotheses: 1) steroids improve performance, and 2) Canseco likely provided them. Another possibility is that there is a “Canseco effect” but not with respect to steroids. Of course, I don’t find this likely. Your results indicate that one or both of the above hypotheses are incorrect, but we can’t say for sure that it’s the latter.

    Having said all that, I agree with you and find it hard to believe that we would find significant aggregate results on player performance during and after Canseco. But I do believe that he had some effect on a subset of teammates and those teammates potentially had small effects on their future teammates – essentially a ripple effect.