Archive for March, 2005
Thanks for all of the e-mails with questions and suggestions for the SSPS. The fact that the system projected a reasonable bounceback for Chipper is very interesting. So, I went and took a closer look.
Chipper was really unlucky on balls in play last year. The effect of the batting average on balls in play on a player’s overall batting average is huge. It explains close to two-thirds of the variance in batting average; however, it explains next to nothing about the player’s batting average the following year. For example, if Chipper had batted .300 on balls in play last year (instead of .246) his projected batting average would rise by only 5 points. Chipper’s walk-rate, homer-rate, and extra-base-hit-rate were 1.75, 1.6, and 1 standard deviations above the 2004 average for players in the sample. His K-rate was average. His BABIP was 1.4 SDs below the sample average (also well below his career average of .311) but it just didn’t matter in the projection. This explains why his batting average was so bad last year but projects to be good this year. All of his other stats are on par with his career averages. His homers and extra-base hits were up a bit, but not by much. It’s just that guys who strike out, walk, and homer like Chipper did last year usually have much higher batting averages.
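To make the link between BABIP and batting average concrete, here is a minimal sketch using the commonly cited BABIP definition, (H - HR) / (AB - K - HR + SF). The season totals below are hypothetical and are not Chipper’s actual numbers, the function names are made up for illustration, and the formula may differ slightly from the one used in the SSPS.

```python
def babip(h, hr, ab, k, sf):
    """Batting average on balls in play: non-homer hits per ball put in play."""
    return (h - hr) / (ab - k - hr + sf)


def avg_at_babip(target_babip, hr, ab, k, sf):
    """Batting average the same hitter would post if his BABIP were target_babip."""
    balls_in_play = ab - k - hr + sf
    hits = target_babip * balls_in_play + hr  # hits in play plus home runs
    return hits / ab


# Hypothetical season line (an assumption for illustration, not a real player).
season = dict(hr=25, ab=500, k=90, sf=5)

actual_babip = babip(h=130, **season)
actual_avg = 130 / season["ab"]
lucky_avg = avg_at_babip(0.300, **season)

print(f"BABIP {actual_babip:.3f}, AVG {actual_avg:.3f}, "
      f"AVG at .300 BABIP {lucky_avg:.3f}")
```

With these made-up totals, moving BABIP from about .269 to .300 lifts the same-season batting average by roughly 24 points, which is the sense in which BABIP drives a single year’s average even though it says little about the next year.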
One very odd prediction that a reader noticed is that Ichiro, a career .340 hitter coming off a career high .372 AVG, projects to hit .292. So, let’s look at his projected (actual) batting averages for the past 3 years: .292 (.321), .292 (.312), and .285 (.372). The SSPS is not doing so hot, but that’s not surprising given the uniqueness of Ichiro. His speed had to be an asset in terms of reaching base on balls in play at a higher rate than most players, which my system does not reward. This is because the year-to-year correlation of BABIP is tiny, so it is not assigned much weight in the regression. Ichiro averaged a BABIP of about .360 (with a whopping .400 in 2004) compared to a league average of about .300. So, this is one of the things I need to work on.
In other news, I’ve about outgrown Blogger. It served me well in the beginning, but I need something a little more powerful now. All of my band-aids are about to fall off. I’m considering a switch to WordPress. Anyone have any pros/cons to pass along?
Well, I didn’t start out trying to do this, but I have developed a projection system. Last week, lost in an attempt to solve some question that I have since forgotten, I accidentally came close to generating projections for both hitters and pitchers based solely on player statistics from the previous season. I decided to put a little more effort in to complete it for hitters. Pitchers will come later. While my method is simple, I was surprised at the predictive power that it appears to have employing data only from the previous seasons of hitters. Here are the factors I used to generate the predictions:
Home run rate
Extra-base-hits per hit
Batting average on balls in play
I also controlled for the age, league, and park of the player. I used these factors in some form or another to estimate the batting average, on-base percentage, and isolated power for each player. From these I generated SLG and OPS. Now, this system is very simple.
I have only projected seasons for players with a minimum of 150 plate appearances in 2004.
I assume that every player is playing for the same team as last year (adjusting for new teams is a pain, I may do this later).
I only project players who stayed on the same team for the entire season, because it’s a pain to look at split seasons (sorry, no Beltran projection).
I do not look at minor league stats.
I only project the big-4 stats: AVG, OBP, SLG, and OPS.
I would like to fix all of these problems, but time is scarce right now. Hopefully, I can make changes at a later time.
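The step of generating SLG and OPS from the three estimated stats follows from the standard definitions: isolated power is SLG minus AVG, and OPS is OBP plus SLG. A minimal sketch, where the function name is hypothetical and the ISO value is back-computed from the posted 2005 Chipper Jones line (.543 SLG minus .289 AVG) rather than taken from the actual model output:

```python
def derive_slg_ops(avg, obp, iso):
    """Build SLG and OPS from projected AVG, OBP, and isolated power (ISO)."""
    slg = avg + iso   # ISO is defined as SLG - AVG
    ops = obp + slg   # OPS is on-base percentage plus slugging
    return slg, ops


# ISO of .254 is an assumption, backed out of the posted projection line.
slg, ops = derive_slg_ops(avg=0.289, obp=0.386, iso=0.254)
print(f"SLG {slg:.3f}, OPS {ops:.3f}")
```

Adding the rounded components can land a point away from a posted OPS, which is presumably just rounding in the published figures.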
So, how does the model predict? Well, using player seasons from 1998-2004 the predicted OPS explains about 50% of the variance of the actual OPS. The root-mean-squared-error (RMSE) is 0.086, which means roughly two-thirds of the observations lie within 86 OPS “points” of the prediction. In the 2004 Baseball Prospectus, Nate Silver compares six projection systems for 2003. The systems ranged from explaining 42-50% of the variance of OPS with RMSEs from 0.085-0.098. When I apply my system to 2003 it explains 53% of the variance with an RMSE of 0.086. In fairness to these other systems, I am only looking at players with more than 150 PAs in the major leagues, not minor leaguers, foreign players, or previously injured players. But, I think that it is interesting that the system seems to be projecting similar numbers.
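For readers who want to run the same kind of check on their own projections, here is a rough sketch of the two accuracy measures mentioned above. It assumes “variance explained” means the squared correlation between projected and actual OPS, which may not match exactly how the figures above or in Baseball Prospectus were computed. The function names and the five player-seasons are made up for illustration.

```python
import math


def rmse(actual, predicted):
    """Root-mean-squared error between actual and predicted values."""
    n = len(actual)
    return math.sqrt(sum((a - p) ** 2 for a, p in zip(actual, predicted)) / n)


def variance_explained(actual, predicted):
    """Squared Pearson correlation between actual and predicted values."""
    n = len(actual)
    mean_a = sum(actual) / n
    mean_p = sum(predicted) / n
    cov = sum((a - mean_a) * (p - mean_p) for a, p in zip(actual, predicted))
    var_a = sum((a - mean_a) ** 2 for a in actual)
    var_p = sum((p - mean_p) ** 2 for p in predicted)
    return cov ** 2 / (var_a * var_p)


# Hypothetical OPS values for five player-seasons (not real projections).
actual_ops = [0.720, 0.815, 0.690, 0.905, 0.760]
projected_ops = [0.750, 0.790, 0.730, 0.850, 0.770]

print(f"RMSE: {rmse(actual_ops, projected_ops):.3f}")
print(f"Variance explained: {variance_explained(actual_ops, projected_ops):.2f}")
```

A lower RMSE and a higher share of variance explained both indicate projections that track actual performance more closely, though the two can disagree when a system is consistently biased high or low.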
One of the most obvious weaknesses of the system looks to actually be a strength, which I did not expect to find. Looking only at one previous season seems to tell us a lot about a player, even a player who deviates from his career norm. Look at Chipper Jones’s 2004 actual, 2005 projected, and career lines:
Year    AVG   OBP   SLG   OPS
2004    .248  .362  .485  .847
2005    .289  .386  .543  .928
Career  .304  .401  .537  .937
As Braves fans know, Chipper had a horrible year last year based on his career stats, which we assumed was a product of his injury. But, now I’m not so sure he didn’t just have some bad luck. The system I employed has no way of knowing that Chipper was a career .937 OPS guy. But, from the information that I included from last year, and knowing his age and park, it concluded Chipper would do 80 points better in 2005. In fact, PECOTA (which I will not post here) projects Chipper’s numbers to be worse than mine, even though it includes Chipper’s past good performances. Now, that doesn’t mean PECOTA is wrong, but it is interesting that the system I developed has projected such a huge jump for a 33-year-old player that is consistent with his career output.
Anyway, here are the 2005 SSPS projections. If you have any thoughts or suggestions, please feel free to pass them along to me. I may or may not make updates, but I hope to post pitcher projections shortly.
The Sons of Sam Horn recently posted a very interesting interview with Bill James. You should definitely read the whole thing. The interviewer, James T., asked James about the influence of his undergraduate economics major on his sabermetrics research.
My economics training was very useful, yes. It had tremendous impact on me, but I have difficulty explaining how. Economics is fundamentally concerned with value…. And my work is fundamentally concerned with value…. So the ways of thinking about problems are often very much the same.
He then elaborates on the contribution of his college education to his problem-solving skills.
Of those other 100 ways to think about the problem, maybe 20 were shown to me by statistics or math professors, and maybe 15 were shown to me by psychologists, and maybe 15 were shown to me by historians, but probably 50 were explained to me by economists. So. . .yes, my way of thinking about the problems was very, very different after I finished school than before I started it, point a, and, point b, the economics classes had a great deal to do with that.
I was kind of surprised to see James give so much credit to economics. Certainly, I am a strong believer in the power of economic thinking, and I am delighted that James agrees somewhat. Although, I have to say that James’s own intellect and inquisitive nature are far more responsible for his success than any economics course. There have been millions of students who’ve taken college courses in economics, but we only have one Bill James.
It was a year ago today that I introduced Sabernomics. I’ve had an exciting year blogging about economics and baseball, and I look forward to another year. According to my site meter, I’ve had over 75,000 visits to the site since I started. I want to thank all of you who visit frequently and comment on what I have to say.
Freakonomics: A Rogue Economist Explores the Hidden Side of Everything
by Steven Levitt and Stephen Dubner.
Thank goodness Stephen Dubner and Steven Levitt met. As the preface to the book explains, Levitt first met Dubner, a New York based writer, for an article in the New York Times Magazine. Levitt had just received the prestigious John Bates Clark Medal, which the American Economic Association awards every other year to the best economist under 40, for his innovative and unique contributions to the discipline. The article, which appeared in August of 2003, introduced the mainstream to an economist who was doing things that were interesting and new to both economists and laymen. There was clearly an interest in what Levitt had to say, and the book publishers were calling; however, Levitt’s opportunity cost was quite high. With Levitt already editing the Journal of Political Economy in addition to doing his academic research, it just wasn’t worth his time to write the book that needed to be written. But luckily, Levitt’s friendship with Dubner led to Freakonomics.
I had some high expectations when I first cracked the spine, which is always a bad thing for me. I rarely see movies, because any movie that interests me enough to see it has already raised my expectations so high that I am almost always disappointed. But, Freakonomics exceeded my expectations. This book is more than a dumbed down version of Levitt’s academic work. Even economists familiar with Levitt’s work will learn new things and stay interested in the book. And though Levitt’s unique approach normally involves statistical work, readers can easily grasp the gist of his empirical methods and comprehend the results presented. The writing is good enough for beach reading with a beer. It’s simply a good read no matter what your knowledge of economics.
So what’s in the book? Well, the reader will find several very frank discussions about crime, corruption, inside information, poverty, parenting, and race. As a new parent, I found the parenting chapters the most interesting. This book is politically incorrect, but not in the sense that its topics and insinuations will offend modern liberals; it’s offensive to all facets of the conventional wisdom that get in the way of the truth. If you’re a high society type who gets offended when sensitive topics enter public conversations, stay away. This book is not for intellectual snobs, but for social scientists in search of truth, no matter how ugly the truth may be. Here’s a list of some of the book’s assertions:
- Teachers sometimes cheat to promote lagging students in order to save their own hides.
- Doctors and real estate agents don’t necessarily have the best interests of their patients and clients in mind, and possess a power of intimidation similar to the KKK in its heyday.
- We get to see the business structure of an urban street gang, where we learn street thugs have a lot in common with aspiring athletes and actors, and we get to meet the “Johnny Appleseed” of crack cocaine.
- The abortion of “unwanted” children following Roe v. Wade explains much of the drop in violent crime in the 1990s.
- The educational success of a child is largely determined by the education, wealth, and health of parents; while divorce, having a stay-at-home-mom, engaging in “enlightened” activities, spanking, and watching TV have almost no impact.
- Poor parents imitate the child names given by rich parents to gain cachet, and the rich just as quickly choose other names to avoid the loss of cachet. But, luckily names don’t seem to impact success in life.
I’ve been a Levitt fan for some time due to his taking economic thinking to the extreme (eXtreme-Economics would have been a good alternate title). Levitt seems to have held onto the things in economics that encouraged every PhD economist to go to graduate school, but are only in the back of our minds by the time we leave. Sure, we know enough fun examples to inspire a few young minds to keep our dissertation advisors employed, but our research is typically as bland as toasted white bread. In his Principles of Economics the great neo-classical economist Alfred Marshall wrote:
The economist needs the three great intellectual faculties, perception, imagination and reason: and most of all he needs imagination, to put him on the track of those causes of visible events which are remote or lie below the surface, and of those effects of visible causes which are remote or lie below the surface.
No economist has taken Marshall’s advice so literally. Levitt’s imagination, the first faculty beaten out of most economics graduate students, has been the key to his becoming the most innovative economist since…well, maybe ever. The message of Freakonomics is twofold: the economic method is a powerful intellectual tool and economics is fun. This book is a small step towards shedding economics’ inappropriate nickname as “the dismal science.” Economists are rock stars, not Ben Stein’s Ferris Bueller stereotype.
Though my review is certainly a positive one, no review is complete without some criticisms, which are minor. The snippets from Dubner’s original article on Levitt that start each chapter are unnecessary. Since Levitt is a coauthor of the book, it’s distracting to be reminded of how brilliant this guy is. I would have preferred to have the entire article included in the book as an opening chapter. And, where is the discussion of Levitt’s work on sports? Levitt has written on hit-batters in baseball, penalty kicks in soccer, and referees in hockey; topics that would clearly have fit in this book. Lastly, I think the book is too short, and it just ends when I’m ready for more. I guess that’s the price we pay for getting a book from Levitt so soon in his career. So, maybe this is less of a criticism and speaks more to the excellent quality of the material presented. I’d rather be left wanting more than struggling through to the end. In any event, the purchase of this book will yield sufficient consumer surplus and no buyer’s remorse.
I am today’s guest columnist over at Baseball Analysts. In the post I return to my evaluation of Leo Mazzone as a pitching coach. This time I look at Leo’s impact on starters and relievers. Feel free to add your thoughts and suggestions in the comments section at BA, or send me an e-mail.
Thanks to Rich and Bryan for asking me to write for them this week.
I just stumbled across Sean Forman’s excellent webpage with his 2005 projections for the NCAA tournament. Get a jump on everyone at the office who thinks Clark Kellogg has the inside scoop. Good stuff.
Sorry to be so distant lately. I’ve been involved in quite a few projects, and the blog has suffered. However, one of the projects I have been working on ought to be of great interest to Sabernomics readers. It should be up by the end of the week.
Here are some random items that might be of interest to you.
David Pinto has launched Baseball Musings Day By Day Database. Have you ever wondered how your favorite player performed in any game since 1972? Well, Dave Pinto has created a tool to help you in your quest. Stop by for a visit. Also, if you enjoy Pinto’s Musings as much as I do, drop a small donation in the tip jar. I have never hit an online tip-jar in my life except for David’s site. Baseball Musings is the best general interest baseball blog on the web, no doubt about it. David’s opportunity cost for running the blog is quite high, so give him a little to help keep his great site going. I suspect that David may start selling ads on player pages like Baseball-Reference. If so, I’ll buy one.
Ben Jacobs ranks the Braves 5th in Major League offseason moves over at The Hardball Times.
Overall, the Braves lost a lot of players, but once again gave themselves a great chance to still be playing in October. And although they traded away some top prospects, they also picked up a couple draft picks by offering arbitration to Wright.
Just today I received an advance review copy of Freakonomics: A Rogue Economist Explores the Hidden Side of Everything by Steven Levitt and Stephen Dubner. Steven Levitt is probably one of the most interesting people on the planet. I plan to devour the book and post a review shortly. Here’s a jacket-flap plug from Kurt Andersen.
Freakonomics is politically incorrect in the best, most essential way. Levitt and Dubner suss out all kinds of surprising truths — sometimes important ones, sometimes merely fascinating ones — by means of smart, deep, rigorous, open-minded consideration of the facts, with a fearless disregard for whom they might be upsetting. This is bracing fun of the highest order.
With a lot of big names posting their favorite players growing up over at Baseball Analysts I thought I would get in on the action. Without a doubt, my favorite player was Dale Murphy. To a Braves fan born in the early 1970s, Murphy was about the only thing good about the Braves for so many years. You had to like the guy. While the Braves may have been 20 games out of the race, I could always follow Murphy in the home run race in the newspaper. My family didn’t have cable television growing up, so whenever I was in Atlanta visiting grandparents I glued myself to the TV to watch the Braves on WTBS. (I followed much of the 1991 and 1992 seasons on the radio.) Both of my grandfathers were big Braves fans, which is where I first got the itch. My uncle Andy gave me a Braves poster that hung over my bed for most of my childhood. It featured Murphy, Bob Horner, and Glenn Hubbard. Horner was to baseball as Chris Farley was to comedy: fat guy goes boom. Hubbard was, well I just remember his being featured on that poster turning a double play, clearly valued for his defense and his constant presence in the Braves lineup.
I think I liked Murphy so much not just because he was good, but because the Braves sucked so bad while he played for them. He just looked so much better than everyone else. I mean what was the point of having this guy on the team if you were going to be so bad? And just as Don Mattingly left the Yankees right before things got better, the same happened to Murphy. Bravesnation feels a little guilt over this. I think he should be in the Hall, though I’m not convinced that it’s a bias-free belief. Maybe I’ll take some time this year to make the case.
The other player of whom I have fond memories is Jim Thome. Thome played for the AAA Charlotte Knights in 1993, so I got to watch him play when I was home from college. This was a monster team, which featured Sam Horn batting behind Thome. Manny Ramirez even made a brief appearance on the team. I think Thome hit a home run at every game I went to. It became a familiar experience to hear the crack of the bat, see the ball sail over the double-wall into a Fort Mill pasture, and hear Kenny Loggins’s “Danger Zone” blaring as he circled the bases. At the same time I was enjoying a very good Knights season, I got to see the Richmond Braves come to town with all of their future stars. In one of the games I saw Chipper Jones, Javy Lopez, Ryan Klesko, and Tony Tarasco all hit home runs.
Anyway, those are just some baseball memories of mine that I felt like sharing.