What’s Wrong with Replacement-Level Valuing of Players

In my previous post on 15 tenets of sabermetrics, my response to one tenet received the bulk of the comments. I showed that the talent of the league is distributed normally, and that the assumption of an abundance of similarly skilled players on the margin of the majors (the underlying assumption of replacement level) is wrong. Replacement-level metrics themselves do just fine at measuring players’ skill relative to one another; it’s the implication that these metrics tell us something about the objective value of players that bothers me.

I’ve written quite a bit about my dislike of replacement level over the years. You can see posts that I have written on the subject here, here, and here. But, I thought I’d briefly summarize my reasons for rejecting replacement-level theory in a single post.

— If there was an abundance of talent at the bottom of the league, then the frequency of poor play on the inferior side of the talent distribution should be evident. But as the distributions below indicate, the distribution of talent is bell-shaped.

[Histogram: distribution of OPS]
[Histogram: distribution of ERA]

(UPDATE: Some people don’t like the 100 PA/BFP cutoff, so I’ve replaced the histograms with 30 PA/BFP cutoffs. As I’ve said, it doesn’t make much difference. These distributions have the basic bell-shape, but are negatively skewed, not positively skewed as would be expected if there was an abundance of near-equal talent around replacement level.)
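
The skew claim can be checked numerically. Here is a minimal sketch, assuming made-up OPS-like numbers rather than the actual data behind the histograms: it computes the standard moment-based skewness for a symmetric bell and for a hypothetical "pooled" league in which most players sit just above a floor with a thin tail of stars. The 0.750 mean, 0.080 spread, and 0.600 floor are illustrative assumptions only.

```python
import random

def sample_skewness(values):
    """Moment-based skewness g1 = m3 / m2**1.5 (zero for a symmetric sample)."""
    n = len(values)
    mean = sum(values) / n
    m2 = sum((v - mean) ** 2 for v in values) / n
    m3 = sum((v - mean) ** 3 for v in values) / n
    return m3 / m2 ** 1.5

rng = random.Random(0)  # fixed seed so the sketch is repeatable

# Synthetic, NOT real player data: a bell-shaped league centered on .750.
bell = [rng.gauss(0.750, 0.080) for _ in range(5000)]

# Synthetic, NOT real player data: talent piled up just above a .600 floor,
# thinning out toward the stars (floor plus an exponential tail).
pooled = [0.600 + rng.expovariate(1 / 0.080) for _ in range(5000)]

skew_bell = sample_skewness(bell)
skew_pooled = sample_skewness(pooled)
print(f"bell-shaped sample skewness: {skew_bell:.2f}")
print(f"pooled-at-the-floor skewness: {skew_pooled:.2f}")
```

For a higher-is-better stat like OPS, a pile-up at the bottom shows up as clearly positive skewness, while a bell shape lands near zero. For ERA, where higher is worse, the direction of the test flips.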

There is no grouping of poor talent on the edge of the league, which means that when seeking replacements who are not in the league, teams must select from a scarce talent pool where some players are better than others. This means there will be price differences between players, with teams willing to pay more for better players and less for worse players according to their marginal revenue products. Now, this doesn’t mean that inferior players don’t serve as competition for superior players by offering their inferior services for a lower price, but they are not simply worth the league minimum.

— Even if there was an abundance of talent in this range, it is not freely available. Talent outside of 25-man major-league rosters is not a collection of free agents. Each team has a farm system with five or six minor-league clubs, full of players whose playing rights are owned by other clubs. That means that the next-best few thousand non-major-leaguers are off limits for clubs to choose from freely.

So, let’s say a bench player goes down and the team needs a replacement to serve in his place. The team has a few options. It can bring up the most-talented player in its minor-league system who can play the vacated position. This player will be paid a pro-rated share of the league minimum while he is in the league. The value of having that player on the roster will be equivalent to his marginal revenue product, which is almost certainly worth more than the league minimum. But just playing the player isn’t free, because there is an opportunity cost to playing him. If it starts his major-league service-time clock prematurely, the team will lose future value from the player. If another team is also seeking a similar player (and given the scarcity of talent, there almost always is one), the team forgoes the value it could get from selling his surplus value (MRP – wage) to another team. And if you want to acquire a player from another team’s minor-league roster, you have to pay that team for the opportunity cost of the player.

— The next step is to put a dollar-value on players based on their contribution beyond replacement level. (I should also add that the linear dollar-per-win estimates often used are biased by estimating a non-linear function with a linear estimator and neglecting to include a y-intercept in the estimate.) If we assume that replacement-level players are worth only league minimum, then players who play worse than replacement level are costing their teams money, because supposedly there is a superior player available for the league minimum. Sounds good in theory, but in practice, about 1/3 of the league—eight players per team—are identified as below replacement-level by the popular sabermetric website Fangraphs.

And this isn’t just some random error involved here: 16% of players cost their teams $1 million or more in 2009. This assumes quite a bit of ignorance on the part of general managers, which is doubly problematic because when free-agent salaries are used to measure the value of players, there is an implicit assumption made that GMs are making rational decisions. Paying $4 million per team for below-replacement players (16% * 25-man roster = 4 players; 4 players * $1 million = $4 million) that should be available for under $1 million is irrational.
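
The back-of-the-envelope arithmetic above can be written out as a short sketch. The 16% share, the 25-man roster, the one-third share, and the $1 million figure all come from the text; the variable names are my own, and the $1 million per player is a floor, so the team total is a lower bound.

```python
ROSTER_SIZE = 25

# Share of players said to have cost their teams $1 million or more in 2009.
share_costing_1m = 0.16
players_per_team = round(share_costing_1m * ROSTER_SIZE)

# Each such player costs at least $1 million, so this is a lower bound ($M).
min_cost_per_team_millions = players_per_team * 1.0

# About one third of the league is rated below replacement level.
below_replacement_per_team = round(ROSTER_SIZE / 3)

print(players_per_team, min_cost_per_team_millions, below_replacement_per_team)
```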

Of course, we could respond by saying that replacement level is being set too low. But why bother saving a metric that has too many other problems?
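
The parenthetical point about dollar-per-win estimates can be illustrated with a small sketch. It covers only the missing-intercept half of the complaint, not the non-linearity, and the salary numbers are invented for illustration: a $1 million base plus $4 million per win. Fitting a line forced through the origin to such data overstates the price of a win, while ordinary least squares with an intercept recovers it.

```python
def fit_through_origin(xs, ys):
    """Least-squares slope for y = b*x, with no intercept term."""
    return sum(x * y for x, y in zip(xs, ys)) / sum(x * x for x in xs)

def fit_with_intercept(xs, ys):
    """Ordinary least squares for y = a + b*x; returns (a, b)."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    slope = sxy / sxx
    return mean_y - slope * mean_x, slope

# Hypothetical, noise-free data: salary ($M) = 1.0 + 4.0 * wins.
wins = [0, 1, 2, 3, 4, 5]
salaries = [1.0 + 4.0 * w for w in wins]

origin_slope = fit_through_origin(wins, salaries)
intercept, slope = fit_with_intercept(wins, salaries)

print(f"no intercept:   ${origin_slope:.2f}M per win")
print(f"with intercept: ${intercept:.2f}M base + ${slope:.2f}M per win")
```

In this made-up case the no-intercept fit has to absorb the base salary into the slope, so it prices a win above the true $4 million. If the true intercept were negative, the bias would run the other way.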

— Replacement-level terminology is unnecessarily complicated. If the sabermetric community wants to continue spreading influence beyond its small sphere of influence it can express all its important concepts in terms familiar to all baseball fans. When I played little league, we talked about on-base percentage and slugging percentage. It’s easy to explain DIPS through discussing strikeouts, walks, and homers (“see, the pitcher does these things all on his own, without fielding help”). But replacement level, why not just bring up quantum physics? This is baseball, it’s supposed to be fun. Replacement-level language is complicated and it adds no useful additional information to concepts that can be expressed more simply, especially in reference to league average. Replacement-level terminology should be rejected for parsimony alone. As I previously stated,

I read a lot of dumb things by established baseball writers who deserve to be called out. But when you start inundating people who have lived and breathed baseball for much of their lives, no less than active sabermetricians, with new acronyms that are not in their lexicon, don’t be surprised when they are confused. And getting snooty about it doesn’t help. Baseball already has a language, and there is nothing too complex in sabermetrics that cannot be explained through terms and statistics understood by little-leaguers.

14 Responses to “What’s Wrong with Replacement-Level Valuing of Players”

  1. Tucker says:

    I disagree generally, although you are right to point out the relative scarcity of replacement level players. It seems that the problem though is that the median player at AAA is actually a -2 or -3 WAR player when the value assumptions are made that there are enough 0 WAR players in AAA to make them cheaply available. Actually if you look at this off-season the Royals just signed two outfielders- Melky Cabrera and Jeff Francoeur who appear to have true talent levels of just about 0 WAR and they are getting paid 1.25M and 2.5M. Rick Ankiel, another player worth between 0 and .5 WAR is getting paid 1.5M. In addition teams routinely pay split contracts or minor league contracts to minor league free agents that pay players who are replacement level or slightly below. These contracts usually pay between the major league minimum and 750k in total compensation. So it looks to me like the function that best fits (particularly after all of the recent RP deals) is something like 1.5M+(WAR)*4M. So yes, 0 WAR players have value and teams are aware of that value, but adding a y intercept at $1M or so doesn’t change things that much.

    There also is the randomness in performance and evaluation that makes it possible for teams to field players who perform at below replacement level. I mean Melky Cabrera again performed at below -1 WAR last year according to Fangraphs but he is probably a 0 WAR player in true talent level. It seems perfectly reasonable that two or three relief pitchers, the 6th starter who makes 10-15 starts, maybe one or two everyday players and all of the bench players actually have a true talent level of 0 WAR, get paid around $1M and half of them perform below their true talent level in any given season.

  2. Mitch says:

    Fortunately, Tom Tango has done all the heavy lifting already:
    http://www.tangotiger.net/talent.html

  3. Why would a graph of players playing in the league and not accounting for playing time give you an idea of a freely available player who is not playing right now? Your assumption here is saying that any player I add to the roster for league minimum is going to produce on average 4.25 ERA or hit .775 OPS. Of course not…so what is the talent level “on average” of that player you add?

    Your talent level distribution of major league players does not give us any information about players currently in the minors. If we used minor league equivalents and graphed them along with what you have you would see a pooling of talent on the edge.

  4. J.C., you have identified all of the market distortions in MLB valuation, which clearly perturbs efforts to model what “correct” valuation should be.

    But I think your definition of “talent” is too narrow. For pitchers, ERA measures skill level (and as you’ve plotted, it’s normal-ish with a bit of a skew to the right where a slightly higher proportion of below-average performers dwell).

    But a player’s longevity in the elite ranks of MLB also needs to be considered when we are talking about “talent”. If you use longevity in MLB as a measure of talent (since longevity is a function of both high skill and sustained performance), you see a very different distribution curve.

    You can find my analysis here:
    http://bayesball.blogspot.com/2010/12/agreeing-with-bill-james.html

    My conclusion is that “talent” is not normally distributed — there are far more players at the lower margin than at the average, and there are even fewer in the elite “future Hall of Famer” level. But as you point out, the farm system likely ensures that this labour pool is not freely available to other clubs.

  5. Ben says:

    Looking at Martin’s charts, it looks like the “talent level” is based on an exponential curve distribution, not anything resembling a normal distribution. In that case, the 30-way division of the minor league players should play very little factor in available talent. There’s just tons of people right off the fringes of the league who can almost play in MLB. If you think about it, that makes sense: if there weren’t, having 30 AAA teams wouldn’t make any economic sense.

    While I agree that replacement level logic can be convoluted, it does model something that is true about baseball. There are very few people who can play to the level of MLB, but there are a ton of people who are just off the fringe.

    From my perspective, the black magic is not what replacement level means, it’s how you decide where to put replacement level as a basis of comparison, but that’s a discussion for another day.

  6. JC says:

    Martin,

    I address why cumulative distributions are not appropriate for measuring available replacement-level talent in any one season in the comments of my previous post.

  7. Ryan M says:

    Have you read the seminal articles on the topic of dollar valuation and replacement level? I’m not talking the stuff from five years ago. The earlier things by Pappas and Silver basically anticipate everything you say.

    -First, the distribution you show means literally nothing; it’s a biased sample. You’re showing the value of all players who get playing time in the MLB, not the ability of all available players. BASIC imagination of what AAA players would look like is all it takes to make that distribution not look normal anymore.

    -“Freely available” has always and everywhere been purely metaphorical. The point in the methodology was that transaction cost is so low relative to salaries that it’s acceptable to round it down to zero as a simplification. This is a pretty standard economic assumption in economic models. For you as an economist to object to it is strange.

    -On the marginality part of that- while fangraphs currently uses a weird, Sraffian account of value, Pappas’s original way of looking at this was marginal $/marginal win in rating teams, and then he went back and found out what the over was. The concept of replacement player in that construct was, what quality of a team can a team field spending the league minimum on performance. There is nothing inconsistent between that paradigm and economic theory. Or just look at Silver’s essay on valuing Alex Rodriguez in Baseball Between the Numbers, where the economics is all the more obvious. I’m okay with fangraphs’ published statistic because the results are close enough to what they were before, even if the methodology is no longer perfect.

    -The older methodology was not biased in the way you say it is- it had an intercept term set at the league minimum salary, not $0.

    -As an economist, you should know that business owners maximize EXPECTED value, not value itself. The dollars lost are ex post, not ex ante. General Managers aren’t that stupid. Baseball is just hard to predict. Again, if you were more familiar with the history of sabermetrics here, you’d know that BP’s WARP once had a much lower baseline, but that incorrect baseline caused more problems than it would “fix” today by making it look like GMs are less dumb.

    -The concept of “replacement level” in sabermetrics is, when done correctly, the precise analogue of marginalism in economics. Marginalism itself is also unintuitive and complicated. Do you suggest we throw out marginalism because it’s difficult to teach in an introduction to economics?

    I started reading about replacement level in sabermetrics when I was 16 years old. It introduced me to the economic way of thinking. I’m currently a second year PhD student in economics (you flash your degree for credibility, so will I). All of these criticisms, without exception either 1) are a criticism of the current method of translating marginal wins into marginal dollars, which is a criticism of fangraphs/MGL/tangotiger and not replacement level or 2) inconsistently criticize assumptions of these models when they are something economists do all the time (zero transaction cost, the use of expected value instead of ex post value), or 3) are just bad statistical reasoning (looking at the current distribution of talent instead of what the distribution of talent would look like if everyone in AAA played against current major leaguers).

  8. Colin Wyers says:

    I think this misses the mark on two points (well, I will offer a rebuttal on two points – I probably could offer disagreements on more).

    As for why we need replacement level, you did a pretty good job of spelling out the point yourself:

    “The $6.8 million estimate says [Francoeur is] worth well more than that, but the estimate is misleading. The MRP estimates give credit for the playing time, and Francoeur’s managers have played him far too much. … For his career, Francoeur has an OPS of .743, his career OPS+ of 92 is equal to his overall 2009 performance. He’s no Natural, but he is a useful player who can serve in a platoon/reserve role. But, that’s not how the Braves or the Mets used him. Since his first full season in the majors he’s averaged 666 plate appearance a year. While this number may be appropriate for the anti-Christ, it’s not the number of PAs that any manager should be giving Jeff Francoeur. Jeff Francoeur should probably play 50% of what he has played, cutting his MRP estimate in half.”

    Neither an absolute nor an average baseline is appropriate for determining the value of playing time; replacement level is proposed to provide the value of playing time. It may not do so correctly, but outside of your estimate of Francoeur’s excess playing time (I would be interested to hear how you came to that figure) I don’t know that you’ve proposed an alternative.

    As to the distribution of players – what you’re showing us is that the opportunities for replacement-level play are limited at MLB. That does not disprove the concept of a fungible replacement level at all, and I can’t fathom why you think that it does. The concept is predicated on the notion that there are more players capable of playing at replacement level than there are opportunities for players of that talent level in the major leagues. What you’ve shown us is that there are relatively few opportunities for replacement-level players in the majors – left unanswered is the quantity of players who, if given a chance at playing in MLB, could perform at that level of talent. Again, you say:

    “If Bill James was referring to baseball talent in the world as a whole or all baseball talent, then this is an odd statement to include in a primer of basic tenets of sabermetrics. Really, there are a lot more people in the world who aren’t as good as the average of people who play baseball? I’m not sure how that knowledge is new or useful.”

    Essentially, what we expect is a normal distribution, where you and me and all the other commenters are somewhere within 1 SD of the average (I’m presuming that the bulk of your readership is male and between the age of 20 and 35 or so, which roughly approximates the major league player population as well) and all of MLB is somewhere like 5-6 SDs to the right side of the distribution. (In other words, if any of us was pitching, it would be nearly impossible to tell the difference between Albert Pujols and Yuniesky Betancourt, statistically.) And looking at it from that point of view, we should expect the behavior that James describes, where you have more players 10% below the MLB average than you do at 10% above it.

    Now if ALL (or even a lot of them) of those players were in MLB, that would in fact be evidence against replacement level, because it would indicate that the playing time available for these players was in fact equal to their numbers. But that’s not what we see. Now, you point out that teams don’t have access to the entire talent pool of sub-marginal players, because of the minor leagues. But the counterpoint is – they have a readily available supply of these players, already under contract, because of the minor leagues. Each team has a 40-man roster, only 25 of which are active at any time. That’s not FREE, in the sense that MLB teams are paying substantial sums in the draft, minor league payroll, coaching staff, etc. But to promote one of those players to the 25-man roster in-season is a negligible expense on top of that. And that’s your “replacement pool” that teams can draw from.

  9. J.C., you said “cumulative distributions are not appropriate for measuring available replacement-level talent in any one season”.

    Point taken — but rhetorically, what is the appropriate length of time? As you noted in one of your responses to the earlier posting, some of those coffee drinkers in 2009 will still be in the minors and available in 2010, while others will be out of pro ball altogether. At the same time, though, players like Greg Maddux and Jamie Moyer have demonstrated that a talented player can have a successful 25+ year career. Yes, players like that are a freak of nature, but that’s what happens out at the extreme end of the long tail.

    Thinking about it further, I wonder if we’re not defining “talent” in two very different ways. My usage of “talent” combines skill & longevity, which then looks like my chart. (I’ve been operating under the assumption that this is what Bill James meant.) The way you’ve defined “talent” is as (and please correct me if I’m wrong) “skilled labour available now”.

    If that’s the case, then there’s no wonder our charts look different.

  10. LarryM says:

    As I said in the last thread, I have some sympathy with what you are trying to do here, but I think your dogmatism on the point is misplaced.

    Part of the problem is the difference between economic value and “performance” value – different but related concepts. It may be – probably is – possible to assign economic values without considering “replacement value.”

    But for answering the age old question “who is better” – or “who is more valuable” for players who don’t have identical playing time – the concept of replacement value is quite necessary. There may be methodological problems with figuring out just what that value is, but conceptually the options are problematic.

  11. I thought I’d take a look at the ERA distribution of all active pitchers in 2009, not just the 100+ BFP. My summary is that the <100 BFP group are worse (and the difference is statistically significant).
    http://bayesball.blogspot.com/2010/12/era-distribution-curve.html

  12. LarryM says:

    Another point: unless I’m misreading you, you seem to advocate a method of player valuation comparing players to average players. Yet you also argue that there is a scarcity of major league players at the lower end of the major league talent spectrum.

    It should be obvious why those positions contradict each other – the scarcer talent is, the more distorted valuations based upon comparisons to an average player become.

    The logical conclusion from an extreme belief in the scarcity of talent at the lower end is that player values should be calculated on a scale starting from zero. Obviously this produces a distortion as well, but at least it would make sense in terms of your beliefs about talent scarcity.

Trackbacks/Pingbacks

  1. […] J.C. Bradbury explains why he doesn’t like replacement level valuation. I find his last argument less than compelling: Replacement-level terminology is unnecessarily complicated. If the sabermetric community wants to continue spreading influence beyond its small sphere of influence it can express all its important concepts in terms familiar to all baseball fans. When I played little league, we talked about on-base percentage and slugging percentage. It’s easy to explain DIPS through discussing strikeouts, walks, and homers (“see, the pitcher does these things all on his own, without fielding help”). But replacement level, why not just bring up quantum physics? This is baseball, it’s supposed to be fun. Replacement-level language is complicated and it adds no useful additional information to concepts that can be expressed more simply, especially in reference to league average. Replacement-level terminology should be rejected for parsimony alone. […]

  2. […] This post was mentioned on Twitter by J.C. Bradbury and David Napoli. David Napoli said: Replacement level=Quantum physics? Strawman RT @jc_bradbury What’s Wrong with Replacement-Level Valuing of Players http://bit.ly/fT1WmM […]