Archive for Sabermetrics
Athletes were involved as customers in an illicit steroid distribution network that led authorities to raid two Orlando pharmacies and arrest four company officials, a New York prosecutor said.
Albany County (N.Y.) District Attorney P. David Soares refused to identify any steroid recipients, saying prosecutors were focused on producers and distributors.
Customers include Los Angeles Angels outfielder Gary Matthews Jr., according to the Times Union of Albany, which first disclosed the investigation, citing unidentified sources. Matthews would not answer specific questions about the story Wednesday.
Gary Matthews, Jr. makes a perfect villain. He is a major league player who had a career year in 2006, which allowed the former bench player to nab a 5-year $50 million contract with the LA Angels of Anaheim.
This just makes too much sense not to be true, right? Finally we have a reason to explain sudden excellence from a mediocre player—so mediocre that the Braves once cut him to keep Dewayne Wise. The career .755 OPS player put up an .866 OPS in 2006. Aha, now it all makes sense!
To all this I say, “can’t an OK player enjoy his success in peace?” Gary Matthews’s 2006 season was mostly a product of luck. Here are his OPS for the past five seasons.
Yes, 2006 seems like an outlier, but so does 2003. If you throw out the good and bad years his average OPS was .778. But wait, the guy hit .866 and is linked with performance-enhancing drugs! Here is where the role of chance comes in. Gary Matthews was one of the luckiest hitters in baseball last year.
According to PrOPS, which predicts a player’s OPS based on the way he hit the ball without relying on in-play outcomes, his OPS for 2006 should have been .792. This is hardly different from his .778 average, and certainly not out of step with what he had achieved with the Rangers during the previous two seasons. In fact, only four other players in the American League were luckier than Matthews. If his improvement were the result of performance-enhancing drugs, his PrOPS should have improved as well. PrOPS has no knowledge of Matthews’s light-hitting history. It generates a prediction from past hitting performances of players who hit the ball similarly.
The role of luck can be seen in some other stats, too. He hit 2 more homers in 2006 than in 2005, but in 145 more plate appearances, so his homer rate actually declined. His isolated power (SLG-AVG) in 2005 was .181, compared to .182 in 2006. He didn’t get a PED power boost. Where he did improve was in his batting average on balls in play, where luck can accumulate. His BABIP was up to .349 compared to the .283 of 2005.
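For anyone who wants to run these checks on another player, here is a minimal Python sketch of the two rate stats used above. Isolated power is SLG minus AVG, as stated. The BABIP formula, (H - HR) / (AB - K - HR + SF), is the commonly used definition and is not spelled out in this post, so treat it as an assumption. The stat line at the bottom is made up for illustration and is not Matthews's actual line.

```python
def isolated_power(slg: float, avg: float) -> float:
    """Isolated power: slugging percentage minus batting average."""
    return slg - avg


def babip(hits: int, home_runs: int, at_bats: int,
          strikeouts: int, sac_flies: int = 0) -> float:
    """Batting average on balls in play, using the common formula
    (H - HR) / (AB - K - HR + SF). Home runs and strikeouts are removed
    because neither gives the fielders a ball to play."""
    balls_in_play = at_bats - strikeouts - home_runs + sac_flies
    if balls_in_play <= 0:
        raise ValueError("no balls in play")
    return (hits - home_runs) / balls_in_play


# Hypothetical stat line (NOT Matthews's real numbers), for illustration only.
example_iso = isolated_power(slg=0.450, avg=0.270)
example_babip = babip(hits=150, home_runs=20, at_bats=500,
                      strikeouts=100, sac_flies=5)
print(f"ISO   = {example_iso:.3f}")
print(f"BABIP = {example_babip:.3f}")
```

A power boost would show up in the first number; a run of balls finding holes shows up in the second.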
This does not clear Gary Matthews of using PEDs, but it doesn’t look like he got much of a boost if he did use them. Who knows, maybe he would have had a .700 OPS season without them. But, I don’t want to see a rush to judgment on the player just because he had one fluke year. The important point is that the stats don’t convict him at all. If anything, they exonerate him. However, if I am an Angels fan, I am concerned for other reasons.
I often get e-mails from readers who are interested in working in an MLB front office. Well, here’s your chance to break in. Farhan Zaidi, an economist who works in the Oakland A’s front office, e-mails me this exciting opportunity.
We just posted an internship listing on our website for someone to help us during draft season (April to June). I think it’s an excellent opportunity. I’m emailing a few bloggers and site administrators about the listing, hoping they can put in a quick plug for it and encourage people to apply. The link and full listing are below. Any mention you could make of it on your site would be helpful.
Baseball Operations Intern: 1 position
April – June
The Baseball Operations department is seeking an intern for the Spring 2007 season. The Intern will report to the Assistant General Manager and Baseball Operations Analyst. Primary duties will include but are not limited to the following:
– Assisting with data collection and analysis projects
– Research and report on all potential player personnel decisions, including Amateur Draft
– Game-charting and report generation from game-charting programs
– Prepare staff for organizational meetings
– Complete specialized projects as assigned
– Qualified applicants must be motivated, well organized, detail-oriented, and be able to work independently and on a deadline.
– Candidates must be proficient in all Microsoft Office programs (especially Excel).
– Proficiency in statistical packages, such as Stata, SAS, and SPSS is a plus.
– Knowledge of scripting languages (Perl, Python) for screen-scraping programming is a major plus.
– Background in math and statistics is preferred.
– Candidates must be within commutable distance of our offices in Oakland for the duration of the internship.
Interested applicants should send a cover letter and resume to Intern Coordinator, 7000 Coliseum Way, Oakland CA 94621, or via fax 510-563-2397, or email email@example.com* by March 1, 2007. Please clearly detail your technical and programming skills in your resume. No phone calls, please.
I think “excellent opportunity” is an understatement. Even if this job isn’t a good fit for you now, please take a close look at the qualifications: Excel, Stata, SAS, SPSS, Perl, Python, math, and statistics. That should speak volumes about what you need to get a leg up on the competition if you want to work in MLB.
Steve Treder has a nice piece on the hit batter explosion of the modern era.
Thus today’s situation is fascinating in several regards. The incidence of hit batsmen in major league baseball has dramatically increased in the past couple of decades; a significant transformation has taken place in the very nature of the game. Yet this transformation has caught little notice, engaging neither broad contemplation nor comprehensive understanding.
Regular readers of Sabernomics know that I am fascinated by the topic. I will just list one link to a recent post on the subject. It would probably be easier to Google search for hit batters on this blog.
If you want to read more, a paper of mine on the relationship between hit batters and the designated hitter (co-authored with Doug Drinen), Crime and Punishment in Major League Baseball: The Case of the Designated Hitter and Hit Batters, has just been published by Economic Inquiry. Also, Chapter 1 of my book, “Accidents Happen…but More So in American League,” summarizes much of our research. And Chapter 8, “The Evolution of Baseball Talent,” discusses the impact of the distribution of talent on hit batters.
Update: David Pinto provides some support for the talent dilution hypothesis.
About a year ago, I published an article on my PrOPS system in The Hardball Times Baseball Annual 2006. While the article did get some positive media attention (here, here), I occasionally run across skeptical comments from baseball fans. Some people feel the system hasn’t been tested, but that’s incorrect. The fact is, the PrOPS formula is derived from observed correlations in past data. And I reported the results of how well the formula captures over/under performance in the article.
There is a highly statistically significant relationship…between a player’s over/under performance and his decline/improvement. And the greater the deviation between PrOPS and OPS, the larger the reversion is the following season. For every 0.01 increase/decrease in a player’s over/under performance, his OPS is likely to fall/rise by 0.008 the following season. For example, a player with an OPS 10 “points” above his PrOPS can expect his OPS to fall by eight points in the following season. That is quite a reversion.
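The arithmetic in that passage can be sketched in a few lines of Python. This is a rough linear reading of the quoted slope only; it assumes no intercept or other regression terms matter, which the excerpt does not confirm. The Matthews inputs are the 2006 OPS (.866) and PrOPS (.792) cited elsewhere on this page.

```python
# From the quoted result: 0.008 of OPS per 0.01 of over/under performance.
REVERSION_PER_POINT = 0.8


def expected_ops_change(ops: float, props: float) -> float:
    """Expected next-season change in OPS implied by the quoted slope.
    A positive gap (OPS above PrOPS) implies a decline, a negative gap a rise."""
    return -REVERSION_PER_POINT * (ops - props)


# The example from the text: 10 points over PrOPS, so about 8 points lower next year.
# The .810/.800 pair is arbitrary; only the 10-point gap matters.
ten_point_case = expected_ops_change(ops=0.810, props=0.800)

# Gary Matthews Jr., 2006, using the figures cited on this page.
matthews_change = expected_ops_change(ops=0.866, props=0.792)
matthews_next = 0.866 + matthews_change

print(f"10-point over performer: {ten_point_case:+.3f}")
print(f"Matthews expected change: {matthews_change:+.3f}")
print(f"Matthews implied next-season OPS: {matthews_next:.3f}")
```

This is a back-of-the-envelope number, not a projection; age, injuries, and recent history still matter.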
I also generated lists of the top-25 over and under performing seasons from 2002-2004. And what happened to them?
Of the top 25 over performers, 20 players had lower OPS in the following season.
Of the top 25 under performers, 21 improved their OPS in the following season.
The article also lists the top-25 over and under performers for 2005. What happened to those players in 2006?
Of the over performers, 12 players declined, 7 improved, and 6 did not deviate more than 20 OPS-points from the previous season. Of the under performers, 11 players improved, 7 declined, 3 had no change, and 5 didn’t garner serious playing time. It’s not an air-tight projection system, but there seems to be some information there.
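One rough way to put these counts in context is to ask how often they would turn up if declining or improving were a coin flip. The sketch below is an illustration, not a test reported in the article. It assumes players are independent, sets aside the players who barely changed or did not play, and uses a 50/50 baseline that is admittedly crude.

```python
from math import comb


def upper_tail(successes: int, trials: int, p: float = 0.5) -> float:
    """P(X >= successes) for X ~ Binomial(trials, p)."""
    return sum(comb(trials, k) * p**k * (1 - p)**(trials - k)
               for k in range(successes, trials + 1))


# 2002-2004 lists: 20 of 25 over performers declined,
# 21 of 25 under performers improved.
p_over_2002_04 = upper_tail(20, 25)
p_under_2002_04 = upper_tail(21, 25)

# 2005 over performers: 12 declined vs. 7 improved (the 6 near-ties set aside).
p_over_2005 = upper_tail(12, 19)
# 2005 under performers: 11 improved vs. 7 declined
# (no-change and no-playing-time cases set aside).
p_under_2005 = upper_tail(11, 18)

for label, p in [("2002-04 over", p_over_2002_04),
                 ("2002-04 under", p_under_2002_04),
                 ("2005 over", p_over_2005),
                 ("2005 under", p_under_2005)]:
    print(f"{label:>14}: {p:.4f}")
```

Under those assumptions the 2002-2004 counts come out well under 1%, while the 2005 lists on their own land at roughly 18% and 24%, which fits the "not air-tight" description.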
PrOPS is not a stand-alone projection tool. You should not look only at a player’s PrOPS and assume it’s exactly what the player should be doing. When I look at it, I also consider the player’s recent hitting history, injuries, aging, and all that other stuff we sometimes use to evaluate hitters. But when I see a player have a career year, and his PrOPS don’t show it, I start to get suspicious.
If you’re curious about the over/under performers of 2006, see The Hardball Times.
You ought to take a course in econometrics. You’ll get theory and empirical tools, which are necessary. But, that’s not practical for most, so why not teach yourself by reading one or more of the following.
Studenmund: Using Econometrics – A mostly verbal introduction to econometrics by a baseball fan. Well-written and not intimidating.
Kennedy: A Guide to Econometrics – A quick and practical guide that is affordable.
Wooldridge: Introductory Econometrics – My favorite intro book.
Stock and Watson: Introduction to Econometrics – I like this book a good bit.
A college buddy of mine used to always be the butt of the joke whenever having his picture taken. “Stand up, Rob” someone would always say. And you see that was supposed to be funny because he wasn’t sitting down, just shorter than everyone else. (You’re right, that’s not very funny.) I’m sure Marcus Giles has heard it during every team picture he’s posed for, because it’s a really old joke. Though Marcus is short in stature, he has a reputation for carrying a big bat, that is, until this year. For the previous three seasons (all as a full-time player) he’s posted OPS of .921, .821, and .826; not bad for a second baseman.
His 2006 season hasn’t gone so well. After getting off to a slow start, he’s posted an OPS of .756. Early on, some Braves fans attributed his drop-off to going off steroids (jerks). Others suggested the premature birth of his second child during spring training slowed him down. I have my own theory: Marcus is exactly the same. Here are Marcus’s last four seasons in OPS and PrOPS.
Year    OPS     PrOPS
2003    .911    .825
2004    .819    .774
2005    .831    .750
2006    .756    .776
Mean    .829    .781
Marcus has been a bit hit-unlucky in 2006, but his OPS is closer to his PrOPS lines of the previous seasons than his previous OPS. Also interesting is that prior to the 2006 season, I projected OPS for all MLB players. The model predicted Giles would hit .776. It’s partly an eerie coincidence that his PrOPS is exactly what the model predicted, but the point is that though Marcus Giles is a decent offensive second baseman, he hasn’t been as good as his numbers. Furthermore, he’s hitting the ball this year the same as he always has, and the on-field results reflect this even though he’s hit a little better than his numbers indicate.
What does this mean for PrOPS? Not much. I’m just screwing around with the numbers of one individual. The model isn’t nailing every player. However, when I tested the model on past seasons, it predicted reasonably well.
What does this mean for Marcus Giles? Don’t ask him to stand up, he already is.
Addendum: Jeff at SwingTraining notes that Marcus is literally standing up more now than he used to.
My first attempt to look at the compensation of lefties in the big leagues focused solely on hitters. I found, contrary to findings in “the real world,” that lefty hitters earned about a quarter of a million dollars less than equally skilled right-handed batters. However, there are a few problems with the analysis; the biggest one being that I only used hitting stats and lefties don’t play a few positions of high defensive importance. I can think of some ways to control for this, but all of them are a royal pain. Instead, let’s look at pitchers.
Just like in the analysis of hitters, I include only pure left and right-handed players—no switch hitters or players who throw and bat with opposite hands. I estimate the impact of pitching performance (estimated through strikeouts, walks, home runs, and innings pitched), age, and handedness on yearly salary for free agent eligible pitchers. I used two samples: more than 100 innings pitched and less than 100 but with a minimum of 30 innings pitched. This should roughly capture starters and relievers. I care less about that starter/reliever designation than I do about getting an adequate sample size.
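For readers unfamiliar with this kind of regression, here is a stripped-down Python sketch of the idea: salary regressed on a few performance measures plus a 0/1 lefty dummy, whose coefficient is the handedness premium. Everything in it is an assumption made for illustration. The data are synthetic and noise-free, the "true" coefficients are invented, and HR9, age, age squared, and year effects are left out to keep it short, so it is not the model behind the results reported here.

```python
import random


def ols(X, y):
    """Ordinary least squares via the normal equations (X'X) b = X'y."""
    k = len(X[0])
    # Augmented matrix [X'X | X'y].
    A = [[sum(row[i] * row[j] for row in X) for j in range(k)]
         + [sum(row[i] * yi for row, yi in zip(X, y))]
         for i in range(k)]
    # Gauss-Jordan elimination with partial pivoting.
    for col in range(k):
        pivot = max(range(col, k), key=lambda r: abs(A[r][col]))
        A[col], A[pivot] = A[pivot], A[col]
        for r in range(k):
            if r != col:
                factor = A[r][col] / A[col][col]
                A[r] = [a - factor * b for a, b in zip(A[r], A[col])]
    return [A[i][k] / A[i][i] for i in range(k)]


# Invented "true" effects used only to generate the synthetic salaries.
TRUE = {"const": 500_000.0, "k9": 400_000.0, "bb9": -300_000.0,
        "ip": 8_000.0, "lefty": 200_000.0}

rng = random.Random(0)
rows, salaries = [], []
for _ in range(60):  # 60 hypothetical starters
    k9 = rng.uniform(4, 11)
    bb9 = rng.uniform(1.5, 5)
    ip = rng.uniform(100, 230)
    lefty = float(rng.random() < 0.3)  # 1.0 for a left-hander, else 0.0
    x = [1.0, k9, bb9, ip, lefty]
    rows.append(x)
    salaries.append(sum(b * v for b, v in zip(TRUE.values(), x)))

coefs = ols(rows, salaries)
lefty_premium = coefs[4]  # coefficient on the lefty dummy
print(f"Estimated lefty premium: ${lefty_premium:,.0f}")
```

Because the synthetic salaries contain no noise, the regression recovers the invented premium; with real salaries the estimate comes with a standard error, which is where the t-statistics come from.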
The results are interesting. For the starters sample, lefties earn about $233,000 more than equally skilled right-handed pitchers. This fits with the findings from the general work force. Again, the relationship is not statistically significant, but it’s close, with a t-statistic of 1.55 (p-value of 0.12). This is about 7.5%, which is about half as large as the effect found in the general labor force. I find it interesting that you often hear left-handed starters described as “crafty.” Maybe there is something to it. These guys are deceptive and smart, and have higher opportunity costs outside of baseball. I’d be curious to see the handedness of pitchers turned TV commentators, scouts, instructors, etc.
For relievers, the findings are nearly the mirror image, and the estimate is statistically significant. Southpaw relievers earn about $209,000 less than equally skilled right-handed pitchers, which is similar to what I found for position players. Hmm, maybe this has to do with LOOGY duties of lefty relievers. Because many left-handed relievers are used only against other lefties, there are a lot of marginal pitchers who hang around. This may reduce each pitcher’s bargaining strength because teams can just say, “look, if you don’t sign this contract I’m just going to bring up some random kid from triple-A or sign Mike Remlinger.” And because there are very few ROOGYs, marginal righties are more likely to find a real job. OK folks, this is what is called a stretch, but it seems somewhat plausible.
I post the results below. Feel free to add comments.
Starters
Var.     Coef.        T-stat
K9       $408,358     7.29
BB9      -$294,948    -3.78
HR9      -$865,822    -3.65
Age      $371,375     1.59
Age2     -$4,417      -1.27
IP       $8,410       5.37
Lefty    $230,654     1.57
R2       0.47

Relievers
Var.     Coef.        T-stat
K9       $219,657     6.92
BB9      -$186,071    -3.74
HR9      $269,649     1.81
Age      $645,741     2.52
Age2     -$9,295      -2.43
IP       $8,468       2.41
Lefty    -$209,337    -1.98
R2       0.25

(Constant and year effects not reported)
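To see how the Lefty t-statistics above map to significance, here is a short Python sketch that converts a t-statistic into a two-sided p-value. It uses a standard normal approximation, which assumes the sample is large enough that the t distribution is close to normal; the sample sizes are not reported here, so treat the results as approximate.

```python
from math import erfc, sqrt


def two_sided_p(t_stat: float) -> float:
    """Two-sided p-value under a standard normal approximation:
    P(|Z| >= |t|) = erfc(|t| / sqrt(2))."""
    return erfc(abs(t_stat) / sqrt(2))


p_starters = two_sided_p(1.57)    # Lefty t-stat, starters table
p_relievers = two_sided_p(-1.98)  # Lefty t-stat, relievers table

print(f"Starters:  p = {p_starters:.3f}")
print(f"Relievers: p = {p_relievers:.3f}")
```

The starters' estimate lands near 0.12, short of the usual 0.05 cutoff, while the relievers' estimate falls just under it.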
There have been a few blog posts recently on a new NBER article Handedness and Earnings (Everyday Economics, Marginal Revolution, and Greg Mankiw). The general results are reported in The Washington Post.
“Among the college-educated men in our sample, those who report being left-handed earn 13 percent more than those who report being right-handed,” said economist Christopher S. Ruebeck of Lafayette College. Ruebeck and his research partners, Joseph E. Harrington Jr. and Robert Moffitt of Johns Hopkins University, reported the findings in a new working paper published by the National Bureau of Economic Research.
However, it’s interesting that there is no concrete theory as to why this gap exists, and why it occurs in men but not women.
While evidence of a wage gap was unequivocal, explanations for the disparity proved more elusive. Differences in biology and brain function are two possibilities. Nor do the researchers know why they didn’t see a similar effect among women.
Tyler Cowen hypothesizes that there is something special about being left-handed, which I think fits with the conventional wisdom.
Left-handers have idiosyncrasies, obsessions, and downright insanities which lift some of them into productivity heaven.
But I’m curious. Left-handedness is not something traditionally viewed positively by society. The Latin term for left is sinister. Are modern day lefties overcoming a past bias and now surpassing righties?
Anyway, I wanted to see how left-handedness plays out in baseball. Because handedness plays a role in the success of players (through the platoon effect), I had to be careful to control for other characteristics.
So, I went to the trusty Lahman Baseball Database and looked at all batters from 1985-2005. I estimated the impact of hitting performance, age, and handedness on yearly earnings of free agent eligible players using multiple linear regression. Also, I used a sample of only pure left-handed and right-handed players, taking out all switch hitters and players who throw and bat with opposite hands. Here is what I found.
Left-handers earn about $225,000 less than right-handers, but the difference is not statistically significant, meaning the estimate is within the expected variation. However, that’s nearly 6% less than the average player in this sample, which is nothing to sneeze at. But, more importantly, there does not seem to be evidence of lefties earning more than righties among hitters. This certainly isn’t perfect, and I can think of a few problems. The main weakness is that lefties are barred from playing four defensive positions: catcher, third, shortstop, and second. Because I’m using only offensive stats, equally good-hitting lefties may not be as valuable as righties. I’ll post the results below if you’re interested. I’m not sure what it means about why left-handers earn more in the general workforce, but this doesn’t seem to hold for baseball hitters. Maybe I’ll do pitchers next.
Variable       Coefficient    T-stat
AVG            $10,800,000    4.21
Walk Rate      $13,300,000    6.17
Iso-Power      $11,600,000    9.07
Left-Handed    -$225,676      -1.35
Age            -$297,196      -0.81
Age2           $4,337         0.79
R2             .51
(year effects and constant not reported)
Addendum: See Part 2 for analysis of pitchers.
I’ve been looking forward to Friday for a long time: the O’s come to Atlanta for the first time since Leo Mazzone took his magic bag to Baltimore. I’d planned to go to at least one game, but it looks like I won’t make it. Since I published my study on Leo’s effectiveness last year at The Baseball Analysts, I was curious to see how he and Braves pitchers would do away from each other. The interesting development of Russ Ortiz making his first start under Mazzone against the Braves on Saturday is going to put Mazzone in the spotlight even more.
It’s very hard to get a sense of what has really happened since Mazzone left. For one, there’s the sample size issue. There are so many factors involved in ERA differences across teams, and ERA is a statistic that varies widely; this requires a larger sample than half-a-season. Second, the two teams play in different leagues with very different pitchers. Straight comparisons of ERAs would tell us very little even if we had an adequate sample.
One way to look at this is to compare the pitchers Leo coached last year on the Braves to their performances this year. I looked at this in this post, and found that those pitchers were doing considerably worse this year. I didn’t look at the Orioles, though.
Another way to look at the issue is to compare how the overall staffs of the O’s and Braves are doing this year versus last year. While both teams have experienced some turnover, there are many constants on both teams. The table shows the differences in ERAs between this season and last season, with a few corrections.
Season        BAL     ATL     Difference
2006 (Raw)    5.18    4.67    0.51
2006 (RC)     4.96    4.37    0.59
2006 (LC)     4.76    4.37    0.39
2005 (Raw)    4.57    3.99    0.58
2005 (LC)     4.37    3.99    0.38
Difference    0.37    0.38    -0.01
The first row is the raw ERA of both teams in 2006. The second subtracts the difference in runs between this year and last year, as both leagues are yielding more earned runs than last year. The third row corrects for the differences in ERA between leagues, by subtracting the average difference in ERAs between leagues from the Orioles (I could have added it to the Braves). The fourth row lists the 2005 ERAs of both clubs, and the fifth corrects for the difference between leagues. The last row reports the difference between the roughly-corrected 2006 ERAs and the 2005 (LC) ERAs.
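The league correction and the year-over-year comparison can be sketched in Python using the ERAs printed in the table above. The 0.20 league gap is inferred from the printed rows (Baltimore's 2005 ERA moving from 4.57 to 4.37) rather than stated in the post, so treat it as an assumption. With the rounded table entries, Atlanta's change comes out to 0.38 as printed, while Baltimore's comes out to 0.39 rather than the printed 0.37, presumably because the table was built from unrounded ERAs.

```python
# ERAs as printed in the table (rounded to two decimals).
era_2006_rc = {"BAL": 4.96, "ATL": 4.37}   # 2006, after the run correction
era_2005_raw = {"BAL": 4.57, "ATL": 3.99}  # 2005, raw

# AL-NL gap implied by the printed rows; inferred, not stated in the post.
LEAGUE_GAP = 0.20


def league_corrected(era_by_team: dict) -> dict:
    """Subtract the league gap from the Orioles only; the Braves are the baseline."""
    return {team: era - LEAGUE_GAP if team == "BAL" else era
            for team, era in era_by_team.items()}


lc_2006 = league_corrected(era_2006_rc)
lc_2005 = league_corrected(era_2005_raw)

# Year-over-year change in league-corrected ERA for each club.
change = {team: round(lc_2006[team] - lc_2005[team], 2) for team in lc_2006}
print(change)
```

Either way the two clubs' corrected ERAs rose by nearly the same amount.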
As everybody knows, both clubs’ pitching staffs have not done well this year, and their fortunes have been quite similar. What does this show? I have no idea, probably not much at all. It’s likely that both clubs are struggling to adapt to new pitching coaches.
It’s also interesting to look at the ERAs over time. As of late the O’s and Braves have been moving in opposite directions.
Month    O's     Braves
April    5.54    4.56
May      5.54    4.48
June     4.43    4.98
The O’s have done much better in June, but is this a random fluctuation or a sign of things to come? It’s interesting to note that back in April, Orioles manager Sam Perlozzo suggested June as the time Mazzone’s influence would show.
All along Orioles Manager Sam Perlozzo has tried to temper people’s expectations about Mazzone. He wouldn’t work miracles right away, warned Perlozzo, who believes that by June people will see tangible evidence of why Mazzone is considered the best pitching coach in baseball.
“I don’t think that’s unreasonable,” Perlozzo said. “Right now with many of the guys being away at the [World Baseball] Classic they’re learning on the job right now. You have certain habits you’re used to. And it takes practice. I think you’ll see [Mazzone’s effect] later on.”
Obviously, it’s too early to tell what is going on, but I watch with great anticipation.
Addendum: I’m going to link to stories discussing Mazzone’s return here. If you see others, I’d appreciate it if you would please pass them along.
Russell Adams discusses extracting luck from baseball statistics in this weekend’s Wall Street Journal.
By tallying minute details about every hit ball, statistics gurus say they can compute how much of a player’s accomplishments stem from random factors. The results could affect which free agents should get top dollar after a great season and suggest which teams are likely to hold up in the pennant race.
For J.C. Bradbury, economics professor at the University of the South in Sewanee, Tenn., the desire to understand the role of chance in baseball started with a slump. In 2004, Atlanta Braves star Chipper Jones’s hitting numbers were suffering. Even though Mr. Jones appeared to be hitting the ball well, he wasn’t getting on base or hitting for power at his normal rate.
So Prof. Bradbury looked for statistics that would isolate the hitter’s role in each at-bat — and exclude the performance of the pitcher and fielders. The starting point was a mound of data on the characteristics of each ball put into play by Mr. Jones. Based on historical data on balls hit in similar ways, Prof. Bradbury estimated what should have happened, statistically speaking, in Mr. Jones’s at bats.
In this approach, a ball hit on a trajectory that would typically send it past the fielder for a base hit counted as a hit for Mr. Jones regardless of whether he was called out in real life. Turns out, there were many such instances for Mr. Jones. Prof. Bradbury concluded the Braves star was suffering from a run of bad luck — an indication he was likely to perform significantly better the next season. Indeed, in 2005, Mr. Jones’s batting average bounced back to .296 from .248. “I was quite surprised at how well it predicted player performance,” says Prof. Bradbury.
This topic interests me very much, and I think Russell does an excellent job of covering it. I’d like to thank him for including me in the article.