Rebunking “Wins”

I happened to run across a BTF link to a post at Rays Index that I had to read because of its curious conclusion.

In fact, in the absence of other stats, Wins is a very good, if not great, indicator of a pitcher’s value. So next time you hear somebody say Wins is a crappy way to evaluate a pitcher, throw a drink in their face and then make them read this post.

As someone who would need a towel if readers followed this advice, I believe a response is in order. Now, the author Cork Gaines (“The Professor”) does acknowledge that Wins is not the best statistic to use for evaluating pitchers, but that’s not really news. When is anyone ever going to have to choose between using Wins and using nothing to value a pitcher? After reading the post, I maintain that Wins is a poor statistic to use for valuing pitchers. In fact, the statistical evidence used in the article shows the opposite of what the author thinks it shows.

Gaines uses regression estimates of Wins and Win% on ERA+, finding R2 values of 0.51 and 0.54 to justify the usefulness of Wins.* Those values are indeed statistically significant and reveal a real positive correlation between Wins and run prevention. But more than that, they reveal why Wins is such a bad statistic to use for valuing pitching quality. How is showing that good pitchers get more Wins than bad pitchers busting a myth? Greg Maddux didn’t luck his way to 355 Wins, and no one who pooh-poohs Wins thinks his Win total is a result of randomness, unrelated to his ability. It’s the magnitude of the correlation that is important here.

The R2 reveals the percentage of the change in the dependent variable (ERA+ in this example) explained by changes in the independent variable(s) (Wins or Win%). The remainder is due to explanatory factors not included in the model. Now, R2 can be tricky to interpret and it is sensitive to sample size; but, in general, the results indicate that about 50% of the difference in ERA+ across pitchers can be explained by differences in Wins. That’s the problem, not evidence to the contrary. The main knock against Wins is that pitchers have control over only one half of the game: half the game is defense (50%) and the other half is offense (50%). An R2 of close to 0.50 confirms rather than debunks this notion.
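To make the "half the game" point concrete, here is a rough simulation sketch. It is not Gaines's data or model: the two standard-normal inputs (`pitching` and `support`) and the additive `outcome` are purely hypothetical stand-ins, chosen so that the pitcher and everything outside his control contribute equal variance to the result.

```python
import random

def r_squared(x, y):
    """R^2 from a one-variable least-squares fit (the squared Pearson correlation)."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy * sxy / (sxx * syy)

rng = random.Random(0)  # fixed seed so the sketch is repeatable
n_pitchers = 20_000     # hypothetical sample size

# Assumption: pitcher quality and outside help (offense, fielding) are
# independent and contribute equal variance to the Wins-like outcome.
pitching = [rng.gauss(0, 1) for _ in range(n_pitchers)]
support = [rng.gauss(0, 1) for _ in range(n_pitchers)]
outcome = [p + s for p, s in zip(pitching, support)]

r2 = r_squared(pitching, outcome)
r2_reversed = r_squared(outcome, pitching)  # same value: R^2 is symmetric here
print(f"share of outcome variance explained by pitching: {r2:.3f}")
```

Under those assumptions the explained share settles near one half, which is about what an R2 in the neighborhood of 0.5 would look like if the pitcher controlled half of what goes into a Win.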

When choosing performance metrics, it is important to use three criteria:
1) How well does it correlate with output? — Wins doesn’t do so badly here: Wins are correlated with run prevention. Still, other metrics of pitcher performance are far superior, and the lifeboat circumstances when someone might need Wins to value a pitcher don’t happen. Why bring this up? No one has suggested that Wins and ability are uncorrelated.

2) How well does it measure ability? — It measures ability, but it is heavily polluted by outside factors (offense and fielding). This is the criterion used to justify using DIPS over ERA. If you want to know the statistic that most strongly correlates with run prevention for pitchers, it’s ERA by a long shot. It is almost a pure recording of the runs pitchers give up, so of course the correlation will be strong. The problem is that pitchers themselves don’t have much control over a major component of ERA: balls that are put into play. ERA fluctuates significantly from season to season for pitchers because it is so dependent on balls in play. DIPS measures are preferred over ERA because they more accurately capture actual pitcher contributions to run prevention, not because they correlate more strongly with run prevention. Similarly, Wins capture some aspects of pitcher ability, but a huge chunk of the contribution is determined by something beyond pitcher control. And the regression estimates of the explained variance of ERA+ are consistent with Wins reflecting about half of what pitchers contribute to generating this metric.

3) How well does it match our intuition as to what matters? — This criterion isn’t all that relevant in this case, and is reflected in the analysis in criterion 2. I use this rule in situations where correlations yield counterintuitive values. For example, strikeouts and home-run hitting are positively correlated; however, suggesting that a hitter should strike out more to increase his power would be wrong.

Gaines is right that Wins includes some useful information regarding pitchers, but the pollution from outside factors is so large that, in cases where we see Wins deviate from ERA or DIPS performance expectations, it is Wins that contains the misleading information. There is no reason to use Wins to evaluate pitcher ability. It is neither a very good nor great indicator of a pitcher’s value.

* A footnote to the article states that R2 ranges from -1 to 1 with greater positive (negative) values indicating a stronger correlation. This is incorrect. R2 ranges from 0 to 1. I was curious if the author was using a correlation coefficient R, which does range from -1 to 1 but has a different interpretation in terms of measuring explained variance. However, the graphs and intuition make it look as though the descriptive footnote is incorrect, not the main text of the analysis.
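For reference, the two quantities can be written side by side. This is the standard textbook form for a least-squares fit with an intercept and a single explanatory variable, not something taken from Gaines's post:

```latex
R^2 = 1 - \frac{\sum_i (y_i - \hat{y}_i)^2}{\sum_i (y_i - \bar{y})^2},
\qquad 0 \le R^2 \le 1

r = \frac{\sum_i (x_i - \bar{x})(y_i - \bar{y})}
         {\sqrt{\sum_i (x_i - \bar{x})^2 \, \sum_i (y_i - \bar{y})^2}},
\qquad -1 \le r \le 1

R^2 = r^2
```

So if the reported values are R2, an R2 of 0.51 corresponds to a correlation coefficient of roughly 0.71 in absolute value, and the sign of the relationship has to be read from the slope or the graph.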

Question of the Day

This was originally a Facebook/Twitter post, but I think it’s an appropriate blog post as well:

If the Braves make the playoffs, does Kelly Johnson make the post-season roster?

Yesterday, Bobby Cox used Greg Norton, Brooks Conrad, and Omar Infante as pinch hitters. Reid Gorecki and Ryan Church (who I assumed was too hurt to play when he didn’t pinch hit, but I guess not) got in the game on defense (Gorecki also pinch ran). And despite the addition of Clint Sammons as the third catcher, David Ross also sat out. Matt Diaz and Adam LaRoche are having their at-bats reduced by hitting at the bottom of the order, while Garret Anderson hits in the five-hole. I don’t get this use of roster resources.

My conclusion: I think it’s unlikely that Kelly Johnson makes a post-season roster.

Minor League Market Power

It turns out that you don’t have to meet owner demands for a new stadium to host a minor-league baseball team.

The sign outside The Diamond still proclaims “Home of the Richmond Braves,” but Richmond and its 24-year-old stadium will have a new professional baseball team beginning next spring.

After months of delays, the long-anticipated relocation of the Class AA Connecticut Defenders was announced yesterday by regional and team officials who gathered at The Diamond in front of a banner that read, “Play Ball!”

“We’re here to say baseball is back,” Richmond Mayor Dwight C. Jones said.

Richmond has been without pro baseball since the Class AAA Richmond Braves moved to a new \$64 million stadium in Gwinnett County, Ga., after the 2008 season.

The Atlanta Braves severed their 43-year relationship with Richmond after growing frustrated by the outdated condition of The Diamond and the lack of progress on a long-term stadium plan.

Though there’s still no plan or even a timetable to get one, owners of the Defenders franchise said they’re ecstatic to come to Richmond and plan to spend at least \$1.5 million to upgrade The Diamond by opening day in April.

Explaining Frenchy’s Attitude

From Technology Review.

The puzzle about overconfidence is its ubiquity. Many studies have shown that most people have an exaggerated sense of their own capabilities, an illusion of control over events and an invulnerability to risk. Most people, for example, believe they are above average drivers, a statistical impossibility. We are all overconfident in one way or another.

But how can such a condition have evolved when the consequences of overconfidence can lead to the destruction of communities and the catastrophic loss of life?

That’s a mystery that many experimental psychologists have wrestled with but now Johnson and Fowler say they have the answer. By creating a mathematical model of the way overconfident individuals compete against ordinary individuals, they show that there is a clear advantage in overconfidence.

In fact, if the potential reward is at least twice as great as the cost of competing, then overconfidence is the best strategy. In fact, overconfidence is actually advantageous on average, because it boosts ambition, resolve, morale and persistence. In other words, overconfidence is the best way to maximise benefits over costs when risks are uncertain.

That’s an interesting insight. Experimental psychologists have long known of the role of overconfidence in conflict situations and yet have been unable to explain its origin.

But it is Johnson and Fowler’s predictions that are most worrying. Their model implies that optimal overconfidence increases with the magnitude of uncertainty. So the greater the risk, the more overconfident individuals should become.

Thanks to Tyler’s Assorted Links.

Braves Attendance Recovers

Earlier in the season, Braves attendance was down about 15% relative to 2008, which was significantly worse than the league’s average attendance decline of about 5%. Now, it appears that the Braves have righted the ship. Braves attendance is down 5%, which is actually better than the league-average decline of almost 7%. See Baseball-Reference’s 2008 vs. 2009 Attendance tracker.

The Expected Value of Being a Baseball Player

From my graduate school buddy Mark Steckbeck:

Doing the math, and discounting \$400,000 per year for three years, beginning at age 17 and entering the big leagues at age 21 (not likely), the expected value of a career in baseball is about \$86. Who’s likely to pursue that?

Now, certainly it’s not the average high school athlete who considers himself pro material, but it’s still predominantly those with low opportunity costs of their time who pursue the professional athlete track. Even if we changed it so that a high school athlete was ten times more likely to make it to the professional level, the expected value is only about \$860. That is total, not per year.
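The jump from \$86 to \$860 is just the linearity of expected value: multiply the chance of making it by ten and the expected payoff goes up tenfold. Here is a small sketch of that arithmetic. The quote does not state the probability or the discount rate used, so `p_make_it` and `rate` below are placeholder assumptions, and the output is not meant to reproduce the \$86 figure.

```python
def present_value(payment, years, rate, delay):
    """Value today of `payment` paid once a year for `years` years,
    with the first payment arriving `delay` years from now."""
    return sum(payment / (1 + rate) ** (delay + t) for t in range(years))

def expected_value(p_make_it, payment=400_000, years=3, rate=0.05, delay=4):
    """Probability-weighted present value of the big-league salary.

    payment, years, and delay (age 17 to age 21) follow the quote;
    rate is a hypothetical discount rate, not one given in the source.
    """
    return p_make_it * present_value(payment, years, rate, delay)

p = 1e-4  # hypothetical chance of reaching the majors; not from the source
base = expected_value(p)
ten_times_likelier = expected_value(10 * p)
print(f"base: ${base:,.2f}  ten times likelier: ${ten_times_likelier:,.2f}")
```

Whatever probability and discount rate go in, the second number is ten times the first, which is all the \$86 to \$860 comparison relies on.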

Thanks to Tyler Cowen for the pointer.

SGT: Sabermetric Groupthink

In the comments to my previous post regarding Ken Rosenthal’s criticism of sabermetric groupthink (SGT), a thoughtful reader posted Rob Neyer’s response to Rosenthal (not me). Though I responded in the comments, I think it’s worthy of a post of its own.

Neyer:

“In fact, in sabermetrics there’s really no such thing as groupthink. If you’ve spent any real time with sabermetricians, you know exactly what I mean.

Is there a consensus among sabermetricians that Joe Mauer deserves the MVP? Yeah, probably. But “consensus” is not the same as “groupthink.”

Not nearly the same. Groupthink (according to The Big W) is “a type of thought exhibited by group members who try to minimize conflict and reach consensus without critically testing, analyzing, and evaluating ideas.”

That’s the exact opposite of sabermetrics, which at its very heart is nothing but critically testing, analyzing, and evaluating ideas.”

I like Neyer, but “in sabermetrics there’s really no such thing as groupthink”? What sabermetrics is and what it strives to be are two different things. All groups suffer from groupthink, and sabermetrics is no different from other groups. Rosenthal isn’t denying advances made by sabermetrics (he seems to agree that Mauer is the choice for MVP, as do I); he is taking on the unnecessarily arrogant tone with regard to the correctness of certain tenets that are pushed by its club members. Flooding the inboxes of sports writers with VORP-laden snarky commentary doesn’t help the movement. Sabermetrics includes some science, but it is not all objective analysis immune from clubbish behavior motivated by social aspects.

I think Rosenthal’s message was a polite and important statement that explains why many members of the mainstream media are hostile to sabermetrics. I don’t follow Rosenthal closely, but I have found him to be one of baseball’s more knowledgeable writers. He may not agree with every tenet of sabermetrics, but he acknowledges the community and ideas; certainly he has not summarily eschewed sabermetric ideas. I read a lot of dumb things by established baseball writers who deserve to be called out. But when you start inundating people who have lived and breathed baseball for much of their lives, no less than active sabermetricians, with new acronyms that are not in their lexicon, don’t be surprised when they are confused. And getting snooty about it doesn’t help. Baseball already has a language, and there is nothing too complex in sabermetrics that cannot be explained through terms and statistics understood by little-leaguers.

A Contract Incentive for Adam LaRoche

Adam LaRoche’s career indicates an odd performance pattern in one area. The table below reports his seasonal tOPS+, which measures how well LaRoche did in each half relative to his performance that year.

```
        tOPS+
Year    1st Half    2nd Half
2004    85          113
2005    109         71
2006    77          126
2007    90          113
2008    84          128
2009    83          126
```
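A quick tally of the table, for anyone who wants to check the count. The numbers are copied straight from the table above; the variable names are mine.

```python
# (year, first-half tOPS+, second-half tOPS+), copied from the table
splits = [
    (2004, 85, 113),
    (2005, 109, 71),
    (2006, 77, 126),
    (2007, 90, 113),
    (2008, 84, 128),
    (2009, 83, 126),
]

# Seasons in which each half was the stronger one
second_half_better = [year for year, first, second in splits if second > first]
first_half_better = [year for year, first, second in splits if first > second]

print("second half better:", second_half_better)
print("first half better:", first_half_better)
```

The second half comes out ahead in every season except 2005.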

In five of the six years that he has played in the majors, he’s hit better in the second half than he did in the first half. This fact has not gone unnoticed by broadcasters.

I do not think that LaRoche can be counted on to repeat this pattern. In Curve Ball, statisticians Albert and Bennett look at the ability of players to repeat half-season splits and they conclude: “Players don’t generally hit any better or worse in the last half of the season than the first half of the season.” Now, this doesn’t exclude the possibility of a few players having this ability, but I think it is unlikely. I suspect that Adam’s performance represents a run in a small sample that is likely noise.
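A back-of-the-envelope check on the noise explanation: suppose, purely as an assumption, that a player with no real half-season tendency is equally likely to hit better in either half, independently each year. The chance of seeing at least five of six seasons lean the same way is then a simple binomial calculation.

```python
from math import comb

def prob_at_least(k, n, p=0.5):
    """Probability of k or more successes in n independent trials."""
    return sum(comb(n, i) * p**i * (1 - p) ** (n - i) for i in range(k, n + 1))

# Assumption: each season is an independent coin flip between the two halves.
favoring_second_half = prob_at_least(5, 6)     # 7/64
favoring_either_half = 2 * prob_at_least(5, 6)  # 14/64, lopsided in either direction

print(f"5+ of 6 seasons favoring the second half: {favoring_second_half:.3f}")
print(f"5+ of 6 seasons favoring either half:     {favoring_either_half:.3f}")
```

At roughly one chance in nine for any given player, a league full of hitters should produce a handful of six-year runs like this one by luck alone. The coin-flip model ignores the size of the splits, so treat it as a rough sanity check rather than a test.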

But what if Adam is a second-half player, and a team wants him to play more like second-half Adam in the first half? How might a team structure a contract to give LaRoche the incentive to do the things he needs to do (e.g., get in shape, practice, take his medication regularly, etc.) to generate higher production? I have a simple solution: offer a big All-Star bonus. Players with strong first halves have an advantage at making the All-Star team over second-half players. Many players have All-Star bonuses in their contracts in small amounts, a few thousand dollars or so. If a full year of second-half LaRoche is worth an additional \$2 million, offer him a \$2 million bonus for making the team. If he can fix the problem, it will likely be fixed. If not, you get the same LaRoche as always without having to pay the bonus.

The Best Few Paragraphs I’ve Read Today

The honor goes to Ken Rosenthal.

Don’t get me wrong. Sabermetricians have significantly broadened our understanding of baseball — and by “our,” I mean fans, media and club personnel, virtually everyone in the game. Advanced statistics reveal not only tendencies, but also greater truths. Smart teams effectively combine sabermetric principles with scouting orthodoxy. Very few, if any, disregard the numbers entirely.

Here’s the problem: Sabermetricians were ignored for so long, they had to shout to be heard. Now they are getting heard — properly heard in the highest levels of baseball media and front offices. But some continue to shout, dismissing those who disagree as ignorant dolts….

Baseball sparks the liveliest discussions of any sport, invites a myriad of perspectives. Slavishly adhering to sabermetric dogma reduces the level of discourse. We’re talking about an MVP race, not geopolitics. We’re supposed to debate. Good, old-fashioned quarrels are part of what makes the game fun.

Early Free Agent Salary Projections

A few moments ago I happened to run across an article by Stan McNeal on the next free agent class. I thought I’d take a moment to project the salaries for his “Best bets for big bucks” in the upcoming offseason. I’m slowly breaking out my new marginal revenue product projection system, so this year’s performances are incorporated roughly into the projection. I may provide updates after the season is over.

1. Matt Holliday, OF, Cardinals. He would have been in line for a nine-figure deal in the old economy but might have to “settle” for something closer to \$80 million over four years. Holliday still has plenty in his favor: He has had a strong second half with St. Louis, he is only 29 and his agent is Scott Boras. St. Louis fans shouldn’t get too enamored with him.

Four years, \$68 million (\$17 million per year). I don’t know about \$80 million. If he joins a good team, he might get it (my estimates assume the player is added to an average team).

2. John Lackey, SP, Angels. The big righthander, who turns 30 in October, has pitched well enough lately to cement his status as the market’s best available starting pitcher. The chances of Lackey re-upping with the Angels are no better than 50-50.

McNeal doesn’t suggest a length, so I’ll guess four years, which projects to \$56 million (\$14 million per year). The recent injury may scare some teams away, but it looks like he has returned to healthy form.

3. Jason Bay, OF, Red Sox. He enhanced Red Sox general manager Theo Epstein’s already considerable reputation by productively and professionally succeeding Manny Ramirez. But there is little room for sentiment in Boston’s front office. Given a choice, the Red Sox would take Holliday.

I’ll go with four years again (why not?): \$58 million (\$14.5 million per year).

4. Chone Figgins, 3B, Angels. Improved discipline has improved his on-base percentage to .400-plus and made him the game’s top leadoff hitter this season, guaranteeing him a significant raise from the \$5.775 million he is making this season. A prototypical Angel, Figgins says he wants to stay. Just don’t bring that hometown-discount talk his way; he has heard there will be interest from the big-money teams.

Four years, \$38 million (\$9.5 million per year). Significant raise is a go.

5. Jason Marquis, SP, Rockies. His numbers are similar to another sinkerballer, St. Louis’ Joel Pineiro, but Marquis makes this list because he has posted his numbers at Coors Field. Marquis should get a slight bump from his current three-year, \$21 million deal, but is he a \$10 million a year pitcher? Don’t think so.

The most-frustrating pitcher in the world. Four years, \$29 million (\$7.25 million per year). Nope, not a \$10-million man.