Ever since my original post on the relationship between temperature and home runs, I have wanted to follow up with game-specific data. I was slowed by my inability to get good 2008 data and by a trip out of town. After banging my head against the wall trying to use Perl to parse MLB Gameday data, I did what I should have done in the first place: I asked my buddy Doug Drinen for help. Though he doesn’t like intentional walks, he is a Perl grand master, and he parsed the 2008 data in a matter of minutes. I want to offer him a huge thanks for doing this.
I left off last time looking at average US temperatures in April and home-run rates. What I really needed was game temperatures. With the help of Retrosheet, Gameday, and Doug, I was able to look at home runs and temperature by game. The table below lists mean home runs per game and mean game temperatures in April by year. The temperature data exclude games played indoors.
MLB
Year    HR/G    Outdoor Temp (°F)
2000    2.59    63.12
2001    2.34    63.19
2002    1.92    62.73
2003    2.10    60.87
2004    2.16    63.97
2005    1.91    63.46
2006    2.30    64.45
2007    1.85    60.59
2008    1.78    62.76
While temperatures were down in 2008, they were higher than they were in 2007. The 2006 season, which had a high home-run rate, was hotter than both 2007 and 2008. Some commentators have compared this season’s decline in homers to 2006 and have concluded that the decline in steroid use is a big contributor. Here is a sample from Thomas Boswell:
“This spring, for the second straight year, home run totals, like the game’s conspicuous muscles, have shrunk dramatically. Last season’s 8 percent drop in home runs was welcomed, but with caution. Would the tater barrage simply resume? But now, in the wake of the Mitchell report, home runs have fallen this spring by another 10.4 percent.
“Suddenly, a sport that produced 5,386 home runs in 2006 is on pace for 4,442 this year — a 17.5 percent drop, or a loss of almost 1,000 home runs in just two seasons.”
2006 is an odd year for comparison, because serious testing really began in 2005, with suspensions for one failed test. (Check out MLB’s drug policy timeline.) The lost 1,000 homers make a good headline, but the number of homers in 2006 actually hurts the case that testing has lowered steroid use, because that season came after testing began.
But even though 2006 was hot, and 2007 and 2008 were relatively cool, the change in temperature isn’t enough to explain the difference. Comparing the averages from 2000-2004 (pre-testing) to 2005-2008 (testing), the decline in temperature explains only a small portion of the change in home runs.
Years    HR/G    Outdoor Temp (°F)
00-04    2.22    62.77
05-08    1.81    61.67
Diff.    0.41    1.10

Impact of temp on ΔHR/G: -0.0165
%Δ explained by temp: 4.06%
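The last two lines of the table can be reproduced, roughly, with a few lines of arithmetic. This is only a sketch: it assumes a per-degree effect of 0.015 HR/G (the rounded regression estimate, which is also what -0.0165 divided by 1.10 implies), and the function name is mine. With rounded inputs the share comes out near 4 percent, a hair below the 4.06% in the table, which was presumably computed from unrounded figures.

```python
# Back-of-the-envelope check of the table's last two lines.
# HR_PER_DEGREE is the rounded estimate; the function name is made up for this sketch.
HR_PER_DEGREE = 0.015  # home runs per game per degree Fahrenheit (rounded)


def share_explained(temp_drop, hr_drop, hr_per_degree=HR_PER_DEGREE):
    """Return (HR/G change attributable to temperature, share of the total HR/G drop)."""
    impact = hr_per_degree * temp_drop
    return impact, impact / hr_drop


# Differences between the 00-04 and 05-08 rows of the table.
impact, share = share_explained(temp_drop=1.10, hr_drop=0.41)
print(f"HR/G lost to cooler weather: {impact:.4f}")
print(f"Share of the decline explained: {share:.1%}")
```

The same function can be pointed at any pair of years or periods by swapping in their temperature and HR/G differences.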
I used a negative binomial regression to estimate the number of home runs in a game as a function of temperature, league, and park (using park dummies) from 2000–2007 for outdoor games. (The first person to say “why don’t you control for X, Y, or Z” gets a maple bat shoved up his/her rear end. It’s a simple model, but it gets the job done. If you don’t like it, estimate your own damn regression.) The model estimates that each one-degree increase in temperature adds approximately 0.015 home runs per game. The magnitude of the effect is meaningful: the difference between average April and July temperatures adds about one home run per four games played. However, the 1.10-degree drop in average temperature between the pre-testing and testing eras explains only about four percent of the 0.41 HR/G decline in homers. Even looking at the extreme April temperature difference between 2006 and 2007, the temperature change explains only about 6% of the decline.
Does this mean that steroid testing is the cause of the fall in home runs? No. These numbers bounce around quite a bit, and it’s just too soon to say whether we are seeing a real change or just a product of random fluctuation or other factors. Just look at 2002 and 2006. What we can say is that while low temperatures are contributing to the drop in homers, they don’t explain much of the change.
I’ve got more coming on this, but I’m short on time. I may not get to it until next week, but I have also looked into the impact of temperature on the difference in home runs between leagues.