Overestimating the Fog

This morning, I ran across an article by Allen Barra in the WSJ that reminded me of a blog post that I have been meaning to write for several years. Barra discusses the ability of players to perform in “clutch” moments. In closing, Barra describes Bill James as an agnostic regarding clutch ability, citing the last line of James’s article Underestimating the Fog: “Let’s not be too sure that we haven’t been missing something important.”

James’s article caused a bit of a stir when it was first published. Here was James arguing that several common notions among sabermetricians (including that clutch ability is a myth) were not necessarily so. In James’s metaphor, we are like a sentry on the lookout for approaching forces: in a thick fog, the enemy may be invisible despite existing in strong numbers. The fog that obscures the sentry’s view is like the randomness in baseball that makes it difficult for us to distinguish ability from chance. While we have methods for disentangling luck from ability, there exists the possibility that clutch ability is real and we just haven’t found a way to see through the fog properly. Therefore, we shouldn’t be so quick to believe an idea to be true, even when the bulk of the evidence we have indicates that it is true. Maybe the truth is just lost in the fog.

No one can deny this. Of course it is possible that clutch ability exists and we just haven’t found a way to measure it properly. But we dismiss lots of other possible events as unlikely every day, with good reason. And the cost of acting on too little evidence must be balanced against the cost of failing to act on sufficient evidence. It’s a dilemma familiar to all scientists, and it is captured by the distinction between type I and type II errors.

Let’s begin with the null hypothesis that player performance in clutch situations is identical to performance in non-clutch situations. A type I error occurs when we reject a correct null hypothesis. Studies of clutch hitting find that performance differences in these situations are small and often not statistically significant. The null stands, and clutch-hitting skill is seen as a myth. A type II error occurs when we fail to reject an incorrect null hypothesis. When James advocates agnosticism toward clutch hitting as a skill, it is because, despite the studies showing little evidence of clutch hitting, he wants to avoid committing a type II error. The problem is that this choice between type I and type II errors isn’t free. With the evidence held fixed, loosening the decision criterion to avoid type II error necessarily increases the chance of committing type I error.
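
To make the tradeoff concrete, here is a rough sketch in Python. Everything in it is an assumption chosen for illustration, not an estimate from real data: a hypothetical hitter with a .270 baseline average, 600 clutch at-bats, 4,000 other at-bats, and a true clutch boost of 20 points. It uses a one-sided test on the difference in batting averages with the usual normal approximation, and the function name error_rates is my own.

```python
from math import sqrt
from statistics import NormalDist


def error_rates(alpha, true_diff, p_base=0.270, n_clutch=600, n_other=4000):
    """Return (type I rate, type II rate) for a one-sided z-test.

    The null hypothesis is "clutch average equals non-clutch average".
    All default values are hypothetical and only serve the illustration.
    """
    std_normal = NormalDist()
    # Approximate standard error of the difference between the two averages.
    # For simplicity the baseline average is used for both samples.
    se = sqrt(p_base * (1 - p_base) * (1 / n_clutch + 1 / n_other))
    # Reject the null when the z statistic exceeds this critical value.
    z_crit = std_normal.inv_cdf(1 - alpha)
    # Type I: rejecting a true null happens with probability alpha by design.
    type_1 = alpha
    # Type II: failing to reject when the true difference is true_diff.
    type_2 = std_normal.cdf(z_crit - true_diff / se)
    return type_1, type_2


if __name__ == "__main__":
    for alpha in (0.01, 0.05, 0.10, 0.20):
        t1, t2 = error_rates(alpha, true_diff=0.020)
        print(f"alpha={alpha:.2f}  type I rate={t1:.2f}  type II rate={t2:.2f}")
```

Running it with different values of alpha shows the two error rates moving in opposite directions: with the sample sizes fixed, the only way to lower one is to accept more of the other.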

Identifying clutch hitting is a practical problem that requires a decision involving real costs. Should a team factor in clutch ability when choosing between free agents? Should it matter for the manager choosing among pinch hitters? Should a historically big-game pitcher start the playoff series over your regular-season ace? Based on the available evidence, if I had to decide between Jeter and A-Rod, it’s not even close: Alex Rodriguez is a far superior player to Derek Jeter, and that’s what is relevant. And in cases where the players’ performances are more similar, I wouldn’t consider clutch performance for even a moment. If clutch ability existed, it would show up in bunches in the empirical methods already employed by researchers studying the question.

In my view, the fog is a distraction: something to bring up to keep the argument going. But arguing takes time, which is valuable. Let’s stop it with the fog, already. Of course it’s possible that something exists that just hasn’t been discovered yet (e.g., the Loch Ness Monster, Sasquatch, ergogenic effects of HGH); but the evidence we have says these things don’t exist, and hanging hopes on the merely possible isn’t a very persuasive argument.

9 Responses to “Overestimating the Fog”

  1. Ken Houghton says:

    Eric M. Van, who worked for the Red Sox recently, told me a few years ago that he has analyzed and can prove clutch hitting exists. But I haven’t seen or heard anything from him since about it.

  2. Rick says:

    EV says a lot of things Ken. One thing he tends to not do is tell you how he reached whatever conclusion he came up with. I’m on a board with him and I’ll see if I can find one of the threads about clutch hitting that he posted in. Check back here in a day or two to see if I was successful.

  3. J.C.,

    In Bill James’ “Mr. Clutch,” as it appeared in The Hardball Times Baseball Annual 2008, he said that…

    “Clutch” is a complicated concept, containing at least seven elements:

    1. The score,
    2. The runners on base,
    3. The outs,
    4. The inning,
    5. The opposition,
    6. The standings,
    7. The calendar.

    Here’s a link to that feature, as it was reprinted in SI:


  4. Joe says:

    It depends on how you define clutch. Is hitting with RISP clutch no matter when or what the score is? I think so.

    If you compare a batter’s stats with nobody on base vs RISP, some hitters have consistent numbers, far above the ML average difference. Mike Sweeney for example.

  5. DC Stack says:

    This is an interesting piece, but you have neglected to complete the scientific approach to hypothesis testing. In hypothesis testing, when we do not find evidence of an effect, we never say there is no effect. The proper language is that we “fail to reject the null.” We never accept the null, only fail to reject it. The language is used in this manner because we continue to accept the possibility that an effect exists but that we just haven’t been able to observe it with the data we have. This is exactly what James is saying about clutch hitting.

  6. JC says:

    Clutch is indeed a complicated concept, and I don’t mean to drag anyone into a debate over it. I haven’t seen much evidence that it exists. But, my point isn’t so much about clutch as it is epistemology. “The fog” isn’t some new concept; and, in my mind, the fact that something is possible isn’t good reason to remain agnostic.

  7. Jim Glass says:

    If clutch hitting exists, if it exists at all, only as an effect so small that it might perhaps exist in some fog beyond the horizon, but never visibly before our eyes, then it’s not worth shelling out a visible portion of a team’s payroll to obtain. Nor worth visibly changing the team’s lineup. I think that’s the practical point.

    There are lots of things I can’t prove don’t exist regarding which it would be foolish of me to bet money on the assumption they do.

  8. Millsy says:

    I think the problem is that to find evidence for clutch hitting (even if there is such a thing), there would have to be some sort of inefficiency in managing strategies between the two clubs. If we assume there is a clutch hitting skill, then why wouldn’t there be a clutch pitching skill? (I think you allude to something of that sort in your book, JC.)

    If that’s the case, then both managers should be optimizing their clutch from the defensive, as well as the offensive POV. Under this assumption, the results (or statistical data) would be a wash and there shouldn’t be significant evidence in the data for clutch hitting. If a manager doesn’t put in their so-called ‘clutchy’ guy when the other manager has in their ‘clutchy’ pitcher, then he isn’t optimizing his strategy. Unfortunately, for those trying to discern a ‘clutch’ skill, the only thing that would happen by putting in the ‘clutchy’ hitter is arriving back at the expectation that we originally had for the event.

    But that doesn’t necessarily mean it’s not a repeatable skill. If, for some reason, there isn’t clutch pitching, then perhaps the idea of clutch hitting would be more manageable. Even if it were the case that it exists, I can only imagine it is quite tiny and probably not of interest in payroll as Jim states above. I think this ‘fog’ is just too thick for us to really find anything, and what could be found probably isn’t all that worth finding.

  9. Jim Glass says:

    Perhaps we should consider the opinions of other “experts”. For instance, on a notable recent case history…
    NY Post, today:

    Alex Rodriguez’s newfound playoff prowess after years of choking in the post-season is a product of his steamy — and surprisingly honest — romance with sexy screen siren Kate Hudson, a team source and a top sports shrink said yesterday….

    A team insider said A-Rod has ditched his philandering ways and is making a big effort to inject honesty and openness into his relationship with the actress.

    The healthy off-field relationship with Hudson is translating into October success on the baseball diamond, experts said.

    “If he’s becoming a little more honest . . . he would have less anxiety,” said Palm Beach sports psychologist Dr. John Murray. “He would sleep better at night and be more relaxed. More focused. That is key.”

    While racking up a paltry .212 lifetime batting average in the playoffs, he carried on “extramarital affairs and other marital misconduct,” according to papers filed by his ex-wife, Cynthia. In postseason play from 2005 to 2007, A-Rod had a grand total of one RBI…

    But this year, A-Rod has “looked really relaxed, really great,” Murray said.

    He has hit .500 over two games and smacked five RBIs, and his game-tying, ninth-inning homer Friday night set up a Yankee win.

    “If you get somebody like a gorgeous woman, someone who you admire, somebody who’s behind you, [athletes] know it,” Murray said….
    Joe DiMaggio had Marilyn Monroe. But nobody told Ted Williams.