Much Ado About Nothing? The Save Statistic and Reliever Usage

The save statistic has come a long way from its humble origins in The Sporting News’ books: over the past several years, it has become a lightning rod for controversy in the analytic baseball community. To hear Michael Lewis tell it, several general managers have been fleeced by Billy Beane because of the importance they assign to pitchers with lots of saves. Many productive work-hours have been lost to internet flame wars between the camp of folks who allege that the “closer mentality” is a fiction and those who point to Arthur Rhodes and Octavio Dotel while shouting, “Behold!” The save itself has come under attack: Joe Morgan even wrote a column assailing the statistic, noting the hypocrisy in giving equal credit to the pitcher who enters the inning with a three-run lead and allows two to score before recording the third out and to the pitcher who saves a win from a bases-loaded, no-out jam. A final criticism, headlined by a series of columns by Steve Treder at the Hardball Times, alleges that ace relief pitchers are being misused by managers who focus on maximizing saves rather than team wins.

To better understand the save, I find it instructive to remember whence it came. Jerome Holtzman, a Chicago sportswriter, created the statistic in 1960 to solve what he saw as a growing problem: the proper valuation of relief pitching. As he tells it in this very instructive article:

I invented the first formula for saves in 1960, in my fourth season as a baseball beat writer. At that time there were only two stats to measure the effectiveness of a reliever: earned run average and the win-loss record. Neither was an appropriate measure of a reliever’s effectiveness.

The ERA wasn’t a good index because many of the runs scored off a reliever are charged to the previous pitcher; the reliever’s ERA should be at least one run less than a starter. The W-L record was equally meaningless; the reliever, particularly the closer, is supposed to protect a lead, not win the game.

Holtzman discusses how Elroy Face was acclaimed as a good relief pitcher for his 18 wins in relief despite the fact that he allowed the other team to tie or lead the game in 10 of those victories. Thus, the save statistic was born out of a desire to attribute credit where credit was due.

So, you might ask, what’s the problem? In two words: Tony LaRussa. Holtzman explains succinctly what happened:

… Instead of bringing in their best reliever when the game was on the line, in the seventh or eighth inning, which had been the practice in the past, [managers] saved him for the ninth. The late Dick Howser and Tony LaRussa were mostly responsible for this change in strategy.

In other words, managers would refuse to use their best reliever before the ninth inning, even when circumstances called for doing so.

In John’s enlightening article about advanced reliever statistics, he introduces the concept of leverage: roughly speaking, the relative importance of any given at-bat to the overall outcome of the game. That is, the outcome of an at-bat in the third inning of a 9-0 game is not nearly as important in terms of altering the probability of victory for either team as is the outcome of a bases-loaded at-bat in the bottom of the ninth of a tie game. To think of it another way, leverage is the amount of “pressure” put on a reliever: when he enters the game with a big lead (or facing a big deficit), there is little pressure on him to perform. In a tie game, he faces lots of pressure. New statistics have been invented by smart people like TangoTiger and Dave Studeman to measure in exact terms the amount of pressure each reliever faces (leverage or “p”) and the success he has in those situations (win probability added or WPA).
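To make the idea a little more concrete, here is a rough sketch of leverage as a ratio: how far one run moves the win probability in a given situation, relative to how far it moves it in a baseline situation. The function name and every number below are made up for illustration; they are not taken from any published leverage table.

```python
def leverage(wp_swing_now: float, wp_swing_baseline: float) -> float:
    """Ratio of the win-probability swing from one run in the current
    situation to the swing from one run in a baseline situation."""
    if wp_swing_baseline <= 0:
        raise ValueError("baseline swing must be positive")
    return wp_swing_now / wp_swing_baseline


# Hypothetical swings (assumptions, not real win-expectancy data):
BASELINE_SWING = 0.10  # one run at the baseline situation moves win probability 10 points

blowout = leverage(0.01, BASELINE_SWING)    # third inning of a 9-0 game
tie_ninth = leverage(0.30, BASELINE_SWING)  # bases loaded, bottom of the ninth, tie game

print(f"blowout: {blowout:.2f}, tie game in the ninth: {tie_ninth:.2f}")
```

A value near 1 means the situation matters about as much as the baseline; values well above 1 are the “pressure” spots, and values well below 1 are garbage time.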

Treder and others argue that ace relief pitchers are not used properly because their managers use other, inferior pitchers in “pressure” situations, when instead they should ignore the save and send in the closer. In essence, this argument says that closers are not used in enough high-leverage situations. Thanks to the good people at Baseball Prospectus, this argument is easy to test. They publish a statistic (aptly titled “Leverage”, or LEV) which measures “the change in the probability of winning the game from scoring (or allowing) one additional run in the current game situation divided by the change in probability from scoring (or allowing) one run at the start of the game.” See their site for further explanation. Below is the 2005 leaderboard in leverage, with 15 IP minimum:

Rank	NAME            TEAM	LG	YEAR	G	IP	LEV	Closer?
1	F. Rodriguez	ANA	AL	2005	39	42.7	2.33	Yes
2	Bob Wickman	CLE	AL	2005	41	40	2.22	Yes
3	Ugueth Urbina	DET	AL	2005	25	27.3	2.19	Yes
4	Joe Nathan	MIN	AL	2005	44	43.7	2.18	Yes
5	T. Hoffman	SDN	NL	2005	38	36	2.09	Yes
6	Scot Shields	ANA	AL	2005	53	65.3	2.01	No [1]
7	Chad Cordero	WAS	NL	2005	54	57.7	2.01	Yes
8	Jose Mesa	PIT	NL	2005	37	39	1.99	Yes
9	D. Hermanson	CHA	AL	2005	38	40	1.97	Yes
10	Y. Brazoban	LAN	NL	2005	47	44.3	1.97	Yes
11	Akinori Otsuka	SDN	NL	2005	44	45	1.92	No
12	Juan Rincon	MIN	AL	2005	46	47.7	1.88	No
13	Octavio Dotel	OAK	AL	2005	15	15.3	1.87	Yes
14	J. Isringhausen	SLN	NL	2005	40	37.3	1.86	Yes
15	Eddie Guardado	SEA	AL	2005	36	36	1.8	Yes
16	Glendon Rusch	CHN	NL	2005	24	29	1.79	No
17	Brandon Lyon	ARI	NL	2005	18	18.3	1.76	Yes
18	Troy Percival	DET	AL	2005	26	25	1.74	No [2]
19	Latroy Hawkins	CHN	NL	2005	21	19	1.69	No [3]
20	Luis Ayala	WAS	NL	2005	56	59	1.66	No

[1] Shields was used as Anaheim’s closer for a few weeks when Francisco Rodriguez was injured.
[2] Urbina was used as Detroit’s closer when Troy Percival was injured; he recorded 9 saves before being traded.
[3] Hawkins was briefly used as the Cubs’ closer before being moved to the setup role and eventually being traded.

As you can see, a majority of the names on this list (13) were their team’s primary closer while healthy. However, several teams have two pitchers on this list: Anaheim (K-Rod and Scot Shields), San Diego (Hoffman and Otsuka), Washington (Cordero and Ayala), Minnesota (Nathan and Rincon), and Detroit (Urbina and Percival). In every instance except Detroit (which is an exception because of Percival’s injury problems), the team’s closer was used in higher-leverage situations. Let’s look at the 2004 list (minimum 30 IP):

Rank	NAME            TEAM	LG	YEAR	G	IP	LEV	Closer?
1	Trevor Hoffman	SDN	NL	2004	55	54.7	2.17	Yes
2	Eric Gagne	LAN	NL	2004	70	82.3	2.11	Yes
3	Joe Nathan	MIN	AL	2004	73	72.3	2.06	Yes
4	Danny Kolb	MIL	NL	2004	64	57.3	1.99	Yes
5	Jose Jimenez	CLE	AL	2004	31	36.3	1.96	No [1]
6	F. Cordero	TEX	AL	2004	67	71.7	1.95	Yes
7	Eddie Guardado	SEA	AL	2004	41	45.3	1.89	Yes
8	Greg Aquino	ARI	NL	2004	34	35.3	1.86	Yes
9	Octavio Dotel	OAK	AL	2004	45	50.7	1.84	Yes
10	Mariano Rivera	NYA	AL	2004	74	78.7	1.83	Yes
11	Todd Jones	CIN	NL	2004	51	57	1.82	No
12	Arthur Rhodes	OAK	AL	2004	37	38.7	1.79	No [2]
13	Rodrigo Lopez	BAL	AL	2004	14	31.7	1.74	No
14	Tim Worrell	PHI	NL	2004	77	78.3	1.7	No
15	R. Betancourt	CLE	AL	2004	68	66.7	1.69	No
16	Troy Percival	ANA	AL	2004	52	49.7	1.69	Yes
17	A. Benitez	FLO	NL	2004	64	69.7	1.68	Yes
18	John Smoltz	ATL	NL	2004	73	81.7	1.68	Yes
19	Tom Gordon	NYA	AL	2004	80	89.7	1.67	No
20	Jorge Julio	BAL	AL	2004	65	69	1.66	Yes

[1] Jimenez spent part of the year as a closer as the Indians desperately tried to find someone who didn’t suck in their bullpen.
[2] Rhodes began the season as closer, but was moved out of the role when Oakland traded for Octavio Dotel.

Again, thirteen players on this list were their team’s primary closer, two others were used in that role for part of the season, and Tom Gordon was used in lower-leverage situations than Mariano Rivera. In general, closers seem to have been used in more high-leverage situations this year than last year: four pitchers on the 2004 list had a higher Leverage value than their team’s closer, while no one on the 2005 list meets that criterion.
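The within-team comparison for 2005 can be checked mechanically. The sketch below copies the LEV values and closer flags from the 2005 table for the four teams that have both a closer and a second reliever on the list; Detroit is left out because the Percival injury makes it a special case. The function and variable names are just illustrative.

```python
from collections import defaultdict

# (team, pitcher, LEV, primary closer?) copied from the 2005 table above.
ROWS_2005 = [
    ("ANA", "F. Rodriguez", 2.33, True),
    ("ANA", "Scot Shields", 2.01, False),
    ("MIN", "Joe Nathan", 2.18, True),
    ("MIN", "Juan Rincon", 1.88, False),
    ("SDN", "T. Hoffman", 2.09, True),
    ("SDN", "Akinori Otsuka", 1.92, False),
    ("WAS", "Chad Cordero", 2.01, True),
    ("WAS", "Luis Ayala", 1.66, False),
]


def relievers_above_closer(rows):
    """Return (team, pitcher, LEV) for every non-closer whose LEV
    exceeds that of his own team's closer."""
    by_team = defaultdict(list)
    for team, name, lev, is_closer in rows:
        by_team[team].append((name, lev, is_closer))

    flagged = []
    for team, pitchers in by_team.items():
        closer_levs = [lev for _, lev, is_closer in pitchers if is_closer]
        if not closer_levs:
            continue  # no closer from this team in the data, nothing to compare
        closer_lev = max(closer_levs)
        for name, lev, is_closer in pitchers:
            if not is_closer and lev > closer_lev:
                flagged.append((team, name, lev))
    return flagged


print(relievers_above_closer(ROWS_2005))  # prints [] for these rows
```

The same function cannot settle the 2004 claim from the table alone, since several of the relevant closers fell outside the top 20 and their LEV values are not listed.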

There are a few important caveats to take away from this brief look at leverage. First of all, the above lists do not measure success; just because a pitcher was used in high-leverage situations doesn’t mean he was any good (and indeed, Jose Jimenez wasn’t). Second, just because a closer shows up on this list doesn’t mean he was used optimally. Further research could examine the situations in which a closer could have been used and compare his “ideal” leverage with his actual leverage. Finally, managers do not always successfully identify their best relievers, so using the closer in high-leverage situations might be sub-optimal (like Jorge Julio and BJ Ryan last year).

For all the grief the save statistic receives, it still does what it was designed to do: measure an aspect of reliever performance that W/L record and ERA do not capture. And despite what may have happened in the past, managers seem to be successful at deploying the pitchers they perceive as their best relievers in crucial situations.

As a closing note, I’d like to thank JC for giving me the opportunity to fill in for him while he’s away. It’s an honor and a privilege. If I can come up with anything a tenth as interesting as what he writes daily, I’ll be lucky.

10 Responses “Much Ado About Nothing? The Save Statistic and Reliever Usage”

  1. urlhix says:

    Kyle, nice backstory on the origin of saves. Interesting stuff, fo sho. Weird to see F-rod at the top this year though. I definitely would have guessed he would be behind Izzy. And Todd Jones should be on the 2005 list somewhere. I guess that is the problem with looking at a “full season” and not accounting for things on a shorter time frame. Is there a way the opportunities gained/lost due to injury can be taken into account?

  2. urlhix says:

    I guess what I mean, for example, is that Brazoban wouldn’t be on the list if it weren’t for Gagne’s injury. Likewise for T. Jones. And Dotel shouldn’t be there for the same reason.

  3. Kyle says:

    This is somewhat a function of the low IP threshold I used (15 IP in 2005, 30 IP in 2004). But I think that your point just makes my conclusion stronger. Gagne’s injury has forced Brazoban to assume the closer role in LA. However, his appearance on the list means that he has been highly leveraged while in the closer role. This suggests that if Tracy reserves the closer role for his best reliever (which he obviously does, since he uses Gagne there), the closer is used in the highest-leverage situations of any reliever on the Dodgers.

    As I said above, this usage might not be optimal (is he leveraged “enough?” is he even their best reliever?) but it’s not like Dodger closers are only trotting out to the mound with 3-run, bases-empty leads to protect.

  4. John W says:

    Excellent piece. I remember telling myself not to go off on saves while writing my article, since that could be another article by itself. I’m working on something else right now, but I am planning to write about a stat I’m calling “Usage Score.” It’s an attempt to measure whether or not a manager is making the most of his relief corps, but I only have data for the Braves. Of course it would be interesting to see how all the managers stack up, if only I had the data.

  5. urlhix says:

    Gotcha. I think I can wrap my non-stat oriented brain around it now. I love reading about this kind of stuff even if it takes me a bit to get it. Thanks again for the great article.

  6. studes says:

    I agree that Leverage is correlated with Saves in aggregate. And that’s a useful thing to remember. But there are a lot of flaws with this approach. And I’m not sure it follows that “managers seem to be successful at deploying who they perceive as their best relievers in crucial situations.”

    I would think that assertion would require a game-by-game review of reliever usage, such as the one I used in this article (see the graph at the end for an example of how managers DON’T use their relievers optimally).

    Or, you could at least repeat this same list by looking at the median, or mode, instead of average. I think this would tell you a lot more about managers’ usage than averages do.

    That’s because middle relievers are more often used during garbage time, which brings their average leverage down, and closers aren’t. Just because a pitcher is used in garbage time doesn’t mean his other appearances haven’t added as much as a closer’s.

    Another way saves are misleading is that they all tend to go to the closer, if healthy for the entire year. Yet the top middleman could be right behind the closer in terms of leverage yet receive no saves.

    Finally, why did you list the top fifteen in Leverage? Wouldn’t it have been most useful to list each team’s leaders in Leverage, compared to its saves leader? Or, more precisely, since leveraged opportunities differ by team, shouldn’t you be looking at the distribution of leveraged opportunities by team?

    Just thought I’d share my thoughts. This is a complicated subject, and your post is helpful.

  7. Kyle says:

    Thanks for your comments, studes. I agree with most of them. Obviously, my “study” (not my term, surely) is not scientific at all. Hence, it lacks much of the precision and detail that your and Steve’s articles feature in spades. I’ll lay at least some of the blame for that on a lack of data; I do not have per-game WPA or leverage statistics by player or team. For example, I would love to look at median leverage or leverage distribution (as you did), but I simply don’t have that information. My sources are the Lahman database and what I can at least somewhat quickly glean from the internet.

    The impetus for my article came from John’s leaderboards in the previous post. I noticed that most of the pitchers there are assigned the closer role. I next remembered that in one of Tango’s old LI leaderboards from a few years back, I was surprised to find that wasn’t the case then. I decided to take a very brief look at the most-leveraged pitchers so far this year and last year, and see if they were in fact mostly being used as the team’s closer.

    There are certainly better ways of proving or disproving the statement, “managers seem to be successful at deploying who they perceive as their best relievers in crucial situations” than what I have done. The approaches you list are all superior to what I have provided here.

    Ultimately, my article isn’t meant to compete with the really hardcore stuff you guys at THT do (perhaps the forum of Sabernomics is a poor one for what I’ve written, but I agreed to write something for JC, and this is what came out). I don’t have pitch-by-pitch or play-by-play data at my disposal. I just wanted to show that the highest-leveraged pitchers on average tend to be closers nowadays. Certainly, managers can do a better job by sending their closers into more tie-game situations. And I could do a lot more research :)

    Thanks again for your comments, though, as they are well-thought-out and well-taken.

  8. JF says:

    In addition, it’s really hard to figure out whether a manager is optimizing his leverage situations, since when he puts a pitcher in he doesn’t know what the leverage situations for the rest of the game will be. The obvious exception, and the one which managers all seem to get, is, let’s say, a one-run lead with runners on second and third in the top of the eighth. Even dimwitted managers understand that if reliever A fails, there’s nothing for the closer to close, so we get the closer an out early.

    The hard situations occur when the high-leverage situations come early. Using your closer is an irreversible decision, so real options theory (nice to put this in an economics blog) tells you to downweight early leverage.
