Archive for January, 2005

Squares for squares

UPDATE: You can now take that information with you to your Super Bowl party and pick the squares most likely to win. Get the Pro-Football Reference Super Bowl Squares App (for iPhone and Droid).

My understanding is that a Super Bowl Squares gambling event is intended to be a completely random lottery type situation where casual gamblers have as good a chance as sharks. Usually, the squares are assigned to people before numbers are assigned to squares, which means that every participant has an equal chance of getting the plum squares. But J.C. claims he has participated in squares pools where the numbers were pre-assigned and people were actually able to choose their squares on a first-come-first-serve basis.

This post is for you if (a) you watch Super Bowls with the unscrupulous and the naive, as J.C. apparently used to do, or (b) you auction off the squares. And if there’s any group of people that fall into the latter category, it’s got to consist of economists. What follows is a list of which squares are most likely to house the winning name at night’s end. It was generated by looking at the final scores of all regular season NFL games since the 2-point conversion rule was instituted in 1994. Some details follow.


Square(s)   PCT
---------------
70/07      3.80
74/47      3.71
03/30      3.21
41/14      2.23
04/40      2.04
71/17      1.93
77         1.93
73/37      1.93
00         1.71
63/36      1.63
44         1.59
60/06      1.54
01/10      1.48
34/43      1.45
18/81      1.35
33         1.19
31/13      1.00
80/08      0.95
16/61      0.95
76/67      0.89
11         0.85
78/87      0.83
64/46      0.82
97/79      0.80
57/75      0.76
90/09      0.74
84/48      0.72
93/39      0.72
50/05      0.72
58/85      0.69
69/96      0.69
49/94      0.67
24/42      0.65
83/38      0.61
02/20      0.56
91/19      0.56
66         0.56
32/23      0.54
35/53      0.52
45/54      0.50
86/68      0.48
72/27      0.46
88         0.41
26/62      0.35
21/12      0.33
52/25      0.33
92/29      0.32
89/98      0.30
95/59      0.24
15/51      0.24
65/56      0.22
82/28      0.22
99         0.19
55         0.19
22         0.04

What that means is that, for example, 7.6% of the time one team has a 7 and one team has a 0. We’ve got to cut that in half to meaningfully compare it with, e.g., the 00 square, so that’s why the 3.80% appears next to 70/07. It’s also why the numbers in the righthand column don’t add up to 100.
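For anyone who wants to rebuild a table like this from their own list of final scores, here is a minimal sketch of the tally, including the halving step just described. The function name `square_percentages` and the sample scores are made up for illustration; this is not the query that produced the table above.

```python
from collections import Counter

def square_percentages(final_scores):
    """Return {(high_digit, low_digit): pct} for each unordered digit pair.

    A pair such as 7/0 covers two squares (70 and 07), so its share is
    halved to make it comparable with a single square such as 00.
    """
    counts = Counter()
    for score_a, score_b in final_scores:
        d1, d2 = score_a % 10, score_b % 10   # last digit of each score
        counts[(max(d1, d2), min(d1, d2))] += 1

    total = sum(counts.values())
    result = {}
    for (hi, lo), n in counts.items():
        share = 100.0 * n / total
        result[(hi, lo)] = share if hi == lo else share / 2
    return result

# Made-up final scores, for illustration only (not real NFL results).
sample = [(27, 20), (17, 10), (24, 14), (20, 10)]
pcts = square_percentages(sample)
```

Because the off-diagonal shares are halved, the values will not sum to 100, just as in the table.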

So, to recap, here is the official Doug Drinen plan for making friends at your Super Bowl party.

1. As soon as the squares are drawn, find the people with the 70 and 07 squares. Offer them $3 for their square. If they accept, tell them that they just traded an expected payoff of $3.80 for a mere $3.00. Call them suckers.

2. Find the guy with 22 and let him know that since the merger in 1970 there has only been one single 22 game (it was the Dolphins and Bills in week 13 of this year). Call him a sucker too.

The Super Bowl Experience

With all the media analysis of the Super Bowl, there are a few important factors that I’m surprised no one has picked up on:

  • 1. The Eagles have a receiver by the name of Terrell Owens who
    is pretty good. The thing is, he had ankle surgery a few weeks ago and it’s not clear whether he’ll be healthy
    enough to play in the big game. This is an overlooked factor that is worth keeping an eye on.

  • 2. I did some extensive research and unearthed the following fact: most of the Patriot players have a lot of Super Bowl experience, while most of the Eagles players don’t have
    any at all. Here’s the way I see it. The Super Bowl is not like a typical game; you might even call it a circus
    atmosphere. Now try and stay with me here. The players are under extra scrutiny from the media and there are a
    whole host of distractions to deal with. It seems to me that the team that has more experience dealing with
    these distractions would have a significant advantage in the game. I find it odd that no one else is talking
    about this.

Alright, time to get serious. Let’s talk about that second one.

We’ll go through the last 20 or so Super Bowls and find the ones that pitted Super Bowl experience vs. a lack
thereof. As a rough first cut, I’ll define a team to have Super Bowl experience if they played in one of the two
most recent Super Bowls.

  • 2003 – New England vs. Carolina: the team with Super Bowl experience wins.
  • 2001 – St. Louis vs. New England: the team without Super Bowl experience wins.
  • 1998 – Denver vs. Atlanta: the team with Super Bowl experience wins.
  • 1997 – Green Bay vs. Denver: the team without Super Bowl experience wins.
  • 1995 – Dallas vs. Pittsburgh: the team with Super Bowl experience wins.
  • 1992 – Buffalo vs. Dallas: the team without Super Bowl experience wins.
  • 1991 – Buffalo vs. Washington: the team without Super Bowl experience wins.
  • 1987 – Denver vs. Washington: the team without Super Bowl experience wins.
  • 1984 – Miami vs. San Francisco: the team without Super Bowl experience wins.
  • 1983 – Washington vs. Oakland: the team without Super Bowl experience wins.

I realize that Super Bowl experience is not a binary concept and that my two-year rule is arbitrary. In particular,
you can throw out 1984 and maybe 1987 and 1991 if you like. The rest of them are clear cut. What is far from clear
cut is that Super Bowl experience is worth anything.

Ah, but Belichick is a master; it’s the coach that’s the key. OK, here are all Super Bowls since 1980 where a head
coach who had (head) coaching experience in the big game squared off against one who didn’t.

  • 2003 – Belichick vs. Fox: the coach with Super Bowl experience wins.
  • 1999 – Vermeil vs. Fisher: the coach with Super Bowl experience wins.
  • 1997 – Holmgren vs. Shanahan: the coach without Super Bowl experience wins.
  • 1996 – Parcells vs. Holmgren: the coach without Super Bowl experience wins.
  • 1994 – Seifert vs. Ross: the coach with Super Bowl experience wins.
  • 1992 – Levy vs. Jimmy Johnson: the coach without Super Bowl experience wins.
  • 1990 – Parcells vs. Levy: the coach with Super Bowl experience wins.
  • 1989 – Reeves vs. Seifert: the coach without Super Bowl experience wins.
  • 1988 – Walsh vs. Wyche: the coach with Super Bowl experience wins.
  • 1982 – Shula vs. Gibbs: the coach without Super Bowl experience wins.

Again, no evidence that experience is helpful.
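To put a rough number on "no evidence," here is a small sketch that tallies the two lists above (3 wins in 10 for experienced teams, 5 in 10 for experienced coaches) and asks how often a coin flip would do at least that well. The coin-flip null and the helper name `prob_at_least` are my own framing for illustration; ten games is a tiny sample either way.

```python
from math import comb

def prob_at_least(wins, games, p=0.5):
    """Binomial tail: chance of at least `wins` successes in `games` tries."""
    return sum(
        comb(games, k) * p**k * (1 - p) ** (games - k)
        for k in range(wins, games + 1)
    )

# Counts read off the two lists above: (wins by the experienced side, games).
team_record = (3, 10)
coach_record = (5, 10)

team_tail = prob_at_least(*team_record)
coach_tail = prob_at_least(*coach_record)
```

Under a coin flip the tails work out to 968/1024 (about 0.945) for teams and 638/1024 (about 0.623) for coaches, so neither record comes close to suggesting an edge for experience.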

Understand, I am not making the case that experience doesn’t matter. I’m simply searching for evidence that it does
and failing to find much. I’m a null hypothesis guy in general, and I don’t see any reason to switch in this case. Everyone except Bill Belichick seems to assume that the experience is an advantage for the Pats (although what else is he going to say?). As far as I’m concerned, the burden of proof is on them.

Not that this is worth anything, but generating these lists alerted me to some similarities between this game and the 1997 Packers-Broncos tilt. The Broncos, like the Eagles, had been the best team in the weaker conference for a few years but had failed to make the big game. The Packers, like the Patriots, were the defending champs, had a genius head coach, and were a team without stars except for their quarterback. I was not particularly a fan of either team, but if I were forced to name one, I might call that 1997 Super Bowl the most exciting one I ever watched. I hope this one is similar.

Where Are the Latin Lefties?

OK, this is just a weird data anomaly that I cannot explain, and I would like to enlist your help. The response to the replacement-level question was great (I’m still digging through it all), so I thought I would try again.

In my post on Hispanics in the Major Leagues, I noticed that the percentage of Hispanic left-handed pitchers was about half that of non-Hispanics. I could not think of any explanation, and still cannot. So I thought I would check out batters as well. Surely, Hispanics would bat left-handed at a rate comparable to non-Hispanics? Wrong. Take a look at the distribution of lefties for both types of players:


Position    Hispanic   Non-Hispanic
-----------------------------------
Pitchers         15%            26%
Batters          14%            31%

What’s going on? Interestingly, Hispanic players are much more likely to be switch-hitters than non-Hispanics. Adding the switch-hitters narrows the gap among batters who can bat lefty, although there are still fewer Hispanics batting left-handed.


Bats              Hispanic   Non-Hispanic
-----------------------------------------
Left                   14%            31%
Switch                 18%             7%
Left or Switch         32%            38%

Maybe non-Hispanic switch-hitters are more likely to give up batting from the right side than Hispanics? This was my first thought, but the fact that the shortage of lefties occurs among pitchers as well leads me to think it’s something on the Hispanic side. In fact, due to the shortage of left-handed Hispanic pitchers, there are greater returns to becoming a switch-hitter if you are playing in Latin America (assuming the left-right ratio is the same as in MLB).

To quote Leftorium owner Ned Flanders, “As the tree said to the lumberjack, ‘I’m stumped.’” This has to be the most-hyphenated blog post in history.

Addendum: More discussion from Baseball Musings and Baseball Primer. Thanks to David Pinto and Repoz for the links!

Sabernomics Super Bowl Extravaganza

In honor of the coming Super Bowl, I’ve convinced Doug Drinen of Pro-Football-Reference.com to write some football-related posts in the spirit of Sabernomics. Doug and I have done a lot of work together on baseball, and he is the person responsible for introducing me to sabermetrics. I only live two blocks from Doug, which gives me easy access to his collection of The Bill James Baseball Abstracts from 1982 until the end. It’s good to finally get Doug posting here directly. Doug’s ideas often appear on this blog through me after a lunchtime discussion.

Welcome, Doug! I look forward to your posts.

The Manning Index

Last week, much was made of the fact that the Colts are 3-5 in playoff games started by Peyton Manning. Is Peyton a choker? I don’t think we’ve got sufficient evidence to make that claim, especially in light of the fact that the Colts have been the higher seed in only 3 of those 8 playoff games. In other words, the Colts’ postseason record in the Manning era is exactly what one would expect using an admittedly crude but very reasonable predictor. I thought that was interesting, so I decided to refine it just a bit and run that query for all the great past and present QBs.

First the refining.

I looked at all postseason games since 1978 and ran a logit regression (there’s the economics content in this post) with a win dummy as the output variable and the team records and game location as the inputs. For those curious, the formula is

Probability of winning = (1 + exp(-.43(windiff)-.24(homefield)))^(-1)

where windiff = the given team’s regular season wins minus its opponent’s regular season wins and homefield = 1 if home, -1 if road, 0 if neutral site. So, for example, the 14-2 Patriots taking on the 12-4 Colts in Foxboro would have a windiff of 2 and a homefield of 1, which translates to an expected win probability of .748. Now all that’s left to do is tally up every quarterback’s expected wins (which is the sum of the win probabilities for each game) and his actual wins, and sort the list:

                     Expected  Actual    Diff
-----------------------------------------------
Tom Brady                 4.5       8      +3.5
Joe Montana              13.7      16      +2.3
Trent Dilfer              2.8       5      +2.2
John Elway               12.2      14      +1.8
Troy Aikman               9.2      11      +1.8
Mark Rypien               3.4       5      +1.6
Jeff Hostetler            2.5       4      +1.5
Wade Wilson               1.7       3      +1.3
Brett Favre              10.2      11      +0.8
Drew Bledsoe              3.3       4      +0.7
Phil Simms                5.4       6      +0.6
Doug Williams             3.4       4      +0.6
Jay Schroeder             2.4       3      +0.6
Brad Johnson              3.5       4      +0.5
Jim Everett               1.6       2      +0.4
Donovan McNabb            6.6       7      +0.4
Steve McNair              4.6       5      +0.4
Jim Harbaugh              1.7       2      +0.3
Kurt Warner               4.9       5      +0.1
Rich Gannon               3.9       4      +0.1
Stan Humphries            3.0       3      +0.0
Mark Brunell              3.0       3      +0.0
Jim Kelly                 9.3       9      -0.3
Kerry Collins             3.3       3      -0.3
Vinny Testaverde          2.4       2      -0.4
Dave Krieg                3.4       3      -0.4
Bernie Kosar              3.5       3      -0.5
Mike Tomczak              3.6       3      -0.6
Peyton Manning            3.8       3      -0.8
Neil O'Donnell            3.9       3      -0.9
Kordell Stewart           3.0       2      -1.0
Steve Young               9.1       8      -1.1
Jim McMahon               4.2       3      -1.2
Randall Cunningham        4.2       3      -1.2
Dan Marino                9.4       8      -1.4
Warren Moon               4.9       3      -1.9

Fine print: the list includes all quarterbacks whose careers began in 1978 or later (hence no Terry Bradshaw or Snake Stabler) and played in at least five postseason games. A QB was credited with a game played if he attempted 10 or more passes in the game.
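For the curious, the formula and the expected-wins tally can be sketched in a few lines. The function names and the sample game list are hypothetical; only the two coefficients come from the formula above. With those rounded coefficients the Patriots-Colts example comes out at about .750 rather than .748, presumably because the regression used unrounded values.

```python
from math import exp

def win_probability(windiff, homefield):
    """Logit win probability using the rounded coefficients printed above.

    windiff:   team's regular season wins minus its opponent's
    homefield: 1 if home, -1 if road, 0 if neutral site
    """
    return 1.0 / (1.0 + exp(-0.43 * windiff - 0.24 * homefield))

def expected_wins(games):
    """Sum of per-game win probabilities over (windiff, homefield) pairs."""
    return sum(win_probability(w, h) for w, h in games)

# The example from the text: 14-2 team hosting a 12-4 team.
example = win_probability(2, 1)

# A made-up three-game postseason, for illustration only.
hypothetical_games = [(2, 1), (0, -1), (-1, 0)]
tally = expected_wins(hypothetical_games)
```

Subtracting `tally` from a quarterback's actual wins in those games gives the Diff column.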

Just to be clear, I believe that teams — not quarterbacks — win football games, so I’m not claiming this is the One True Measure Of Clutchness. Whether I like it or not though, wins are credited to quarterbacks in virtually every discussion about quarterback greatness. This is merely a way of putting a quarterback’s win-loss record into perspective.

I hate to admit it, but the deification of Tom Brady is getting tougher and tougher to argue with. This metric overvalues him just a tad by giving him credit for the 2001 victory at Pittsburgh (Bledsoe was probably more responsible for that win), but still. The probability of going 8-for-8 in the specific collection of postseason games Brady has played in is .004.
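The 8-for-8 figure is just the product of the per-game win probabilities, treating the games as independent. A minimal sketch follows; the probabilities in it are placeholders I made up, not Brady's actual per-game numbers, so the result will not reproduce the .004 above.

```python
from math import prod

def prob_win_all(game_probs):
    """Chance of winning every game, assuming the games are independent."""
    return prod(game_probs)

# Placeholder per-game win probabilities (hypothetical, for illustration).
placeholder_probs = [0.60, 0.40, 0.50, 0.70, 0.50, 0.60, 0.45, 0.55]

sweep_chance = prob_win_all(placeholder_probs)
expected = sum(placeholder_probs)  # the same list also yields expected wins
```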

What’s the Price of a Replacement Player?

In a thread on the Raul Mondesi signing at Baseball Primer the other day, several posters commented that Mondesi projected to be a “replacement-level” player. Using Keith Woolner’s definition, this means that he projects to produce output equal to “the expected level of performance the average team can obtain if it needs to replace a starting player at minimal cost.” In the minds of most, the viable replacements are in the minors. Maybe they’re not playing because they’re blocked on the big club and the GM wants them to get some at-bats, I don’t know. These guys are cheap and provide the base level of talent for MLB. From this benchmark we can value individual players by their contribution above a replacement player. The rule-of-thumb offensive measure of replacement level is 70 points of OPS below the league average at that defensive position. The statistic known as VORP measures the Value Over Replacement Player that any player provides.

But back to Raul Mondesi. For now, I’ll grant that this is an accurate prediction; let’s assume he possesses “replacement-level” talent. However, the next logical step in the argument in this thread goes a little too far, although it makes sense. Mondesi basically signed a $1 million 1-year deal. The minimum salary for a baseball player is $300K, which is the salary a team would be paying to a player on the margin between the majors and minors. Therefore, the Braves are paying Mondesi about $700K too much.

But wait, who makes $300K? Largely, it’s players who are reserved by teams. For the first three years of Major League service, players make what teams tell them they make. This doesn’t mean teams actually value these players equal to their salaries. In fact, in his classic study estimating the marginal revenue product (MRP) of baseball players (the amount a player’s performance contributes to team revenues), Gerald Scully estimates that monopsony extraction by teams was about 7/8ths of the value a player contributes to the team. 7/8ths! This means that the value to the team of a reserved player is, on average, 8 times larger than the salary of the player. So, the price of a replacement player is actually $2.4 million? Well, not exactly. Scully’s paper was published in 1974, when the economic structure of the game was much different. All players were reserved by teams, and I’m not sure what the minimum salary was. I suspect that if not for the league-minimum salary requirement, the reserved players would be making less than $300K.
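The back-of-the-envelope step from 7/8ths to $2.4 million is simple enough to write down. This sketch only restates that arithmetic; the function name is mine, and the 7/8 share is Scully's 1974 estimate, which (as noted above) may not carry over to today's game.

```python
def implied_value(salary, extraction_share):
    """Value to the team if the player keeps (1 - extraction_share) of it."""
    if not 0 <= extraction_share < 1:
        raise ValueError("extraction_share must be in [0, 1)")
    return salary / (1 - extraction_share)

league_minimum = 300_000
value_at_scully_rate = implied_value(league_minimum, 7 / 8)  # 2,400,000
```

A smaller extraction share shrinks the gap quickly: at one half, the implied value is only twice the salary.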

Regardless of the exact magnitude of the exploitation, we can certainly say that teams receive more in value from reserved players than the wage they pay out to these players. To acquire a replacement-level player from another team will require compensating the team with reserve rights for the value lost. Therefore, it is incorrect to say that the purchase price of a replacement-level player is equal to the league minimum. Raul Mondesi is not reserved, and therefore does not suffer from the monopsonistic exploitation of a particular team. He is going to receive more compensation for his services than a reserved player. The question is, with the exploitation removed, how much should he be paid for the services (MRP) he will provide? I don’t have an answer. I have some ideas of where to start looking but have not thought them through. I would like to ask readers to leave suggestions in the comments section on a way to estimate the actual price of a replacement-level player.

Addendum: In addition to the comments below, there is now a thread on Baseball Primer with some very good stuff.

Pinto’s Probabilistic Model of Range

David Pinto of Baseball Musings has started releasing numbers on his Probabilistic Model of Range, which is a stat to evaluate defense.

The Moneyball Braves

Thanks to Brad at No Pepper, everyone in the baseball blogosphere seems to have tacked onto Perfect Game’s interview with Braves Scouting Director Roy Clark. Based on one of Clark’s comments, Tom at Balls, Sticks, & Stuff (via Offwing) makes an interesting observation regarding the Braves’ market response to a new market inefficiency.

Eager to copy what works, many teams have begun to focus particularly on college players in the draft. So what was once an undervalued commodity, is now possibly overvalued, and so by focusing on high school players, the Braves are taking advantage of what has become a market inefficiency.

Like Billy Beane in the story told in Moneyball, John Schuerholz (as well as all other good GMs) is always on the lookout for the next great bargain. To most Braves fans this is old news. Schuerholz has been the master at picking up formerly overrated veterans after they swing to being underrated (see Bonilla, Franco, and now Mondesi and Jordan). And what he does with pitchers under the watchful eye of Leo and Bobby no one else has yet figured out. One Braves commentator often touts the Atlanta approach as proof that the sabermetric Moneyball stuff is garbage. I didn’t get it when I first heard it, and I still don’t. Does JS do things differently than Beane? Of course; all GMs do. The other officially licensed Moneyball GMs (DePodesta, Ricciardi, and Epstein) do things differently from each other, too. The point is, for the past few seasons the Braves have not gone after any standard big free agents, yet they keep winning with teams that look like losers to the masses on opening day. JS isn’t just stumbling onto these guys. His approach is unique and it works. And though it involves scouting, it is also clear that JS is quite stat savvy.

The James-Murphy Connection

Rich’s Weekend Baseball BEAT comments on an e-mail from Tom Meagher of The Fourth Outfielder. Meagher notices a strange connection between the career of Dale Murphy and the rise and fall of the Bill James Abstracts. If Dale Murphy was your favorite ballplayer growing up (he certainly was mine), you’ll like this post.

Opening Day without a Mexican

My friend and former professor Bryan Caplan (he’s blogging at EconLog now) writes about an interesting movie that I’m going to have to see, A Day Without a Mexican.

Inspired by the “magic realism” common in Latin American literature, A Day Without a Mexican is a modern fable in which all of the Hispanics in California vanish overnight. (Why not call it A Day Without an Hispanic? One of the film’s recurring jokes is that Californians think that Mexico is the only country south of the border).

Much of the story traces the effects on California’s economy. Agriculture, construction, personal services, restaurants, and more fall to pieces. Families even find their beloved nannies are missing.

The great 19th-century economist Frederic Bastiat taught economics largely through this sort of thought experiment. What would happen to the economy if we blotted out the sun? Candle-makers would hail the higher demand for artificial lights, but Bastiat objects that this makes society poorer by frittering away valuable resources to make what nature gives us for free.

A Day Without a Mexican makes the same point. Without Latin American residents – legal or not – a few special interests benefit, but society loses. Californian agriculture might implode. But even if it attracted replacement workers with higher wages, society would have to give up whatever those replacement workers used to produce. It is far better for everyone to focus on their comparative advantage: for the Ph.D. in computer science to hire a less educated but perfectly competent nanny from Guatemala to watch her kids so she can return to work.

Though Bryan may be blinded by his borderline psychotic obsession with Bastiat, I have to admit that Bryan has pretty good taste in movies, so it’s definitely a must see. But I’m not writing about this to point to a good movie. I want to apply the same principle to Major League Baseball. What would the world be like if there were no Hispanics in the game? (Aside: Latino Baseball is the central clearinghouse for information about Latinos in the game. If you’re looking for stats and history of Latin players, go there.)

To analyze the role of Hispanics in MLB I used the Lahman Baseball Archive, which lists the birth country of every player, to identify ethnicity. I identified players as Hispanic if they were born in Mexico, Central America, or South America. First off, let’s look at the total number of Latino players in 2004:


Player      Total   %MLB
------------------------
Position      326    24%
Pitchers      143    21%
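The birth-country rule behind these counts can be sketched as follows. This is only an illustration of the idea: the country set is a partial, hypothetical list, the player records are invented, and I am assuming a `birthCountry` field like the one in the Lahman player table. It is not the query used for the numbers above.

```python
# Partial, illustrative set of birth countries; the real rule covers all of
# Mexico, Central America, and South America.
HISPANIC_BIRTH_COUNTRIES = {"Mexico", "Venezuela", "Panama", "Colombia", "Nicaragua"}

def is_hispanic(player):
    """Classify a player record by birth country (a proxy for ethnicity)."""
    return player.get("birthCountry") in HISPANIC_BIRTH_COUNTRIES

def hispanic_share(players):
    """Fraction of the given players classified as Hispanic."""
    if not players:
        return 0.0
    return sum(is_hispanic(p) for p in players) / len(players)

# Invented records, for illustration only.
sample_players = [
    {"playerID": "example01", "birthCountry": "Venezuela"},
    {"playerID": "example02", "birthCountry": "USA"},
    {"playerID": "example03", "birthCountry": "Mexico"},
    {"playerID": "example04", "birthCountry": "Canada"},
]
share = hispanic_share(sample_players)
```

Birth country is a blunt proxy: it misses Hispanic players born in the United States and counts anyone born in a listed country regardless of background.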

The percentage of Hispanic players is about the same for pitchers and position players. So about a fifth of MLB players in 2004 were Latino. However, if I limit the sample to players who received some significant playing time (200 AB for position players and 30 IP for pitchers), the percentage rises to 28% for non-pitchers. So, Hispanic players make up about a quarter of regulars in the leagues. But how good are these players compared to non-Hispanics? My intuition was that the average Hispanic player in the majors would be a little better than the average non-Hispanic. It turns out I was wrong.

Stat          All   Hispanic   Non-Hispanic
-------------------------------------------
AVG         0.273      0.275          0.272
          (0.029)    (0.031)        (0.028)
OBP         0.342      0.334          0.345
          (0.039)    (0.040)        (0.038)
SLG         0.443      0.449          0.440
          (0.072)    (0.079)        (0.070)
ISO         0.170      0.175          0.168
          (0.061)    (0.064)        (0.060)
OPS         0.785      0.784          0.785
          (0.103)    (0.111)        (0.100)

(Min 200 ABs; SD in parentheses)
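One rough way to gauge the size of a gap in the table above is to express it in standard-deviation units. The sketch below does that for OBP, the largest gap, using the means and SDs from the table. The equal-weight pooling of the two SDs is a simplification I chose because the table does not give group sizes, and the function name is hypothetical.

```python
from math import sqrt

def standardized_difference(mean_a, sd_a, mean_b, sd_b):
    """Difference in means divided by an equal-weight pooled SD."""
    pooled_sd = sqrt((sd_a**2 + sd_b**2) / 2)
    return (mean_a - mean_b) / pooled_sd

# OBP row of the table: Hispanic vs. non-Hispanic.
obp_gap = standardized_difference(0.334, 0.040, 0.345, 0.038)
```

The OBP gap comes out to a bit under three-tenths of a standard deviation, which fits reading it as a real but modest difference.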

There is not much difference in the distribution of Hispanics and non-Hispanics across many offensive skills. The biggest difference is in OBP, but it is still quite a small difference. This is not surprising though, because win-maximizing teams ought to evaluate playing talent without regard to ethnicity. It also calls into question the stereotype of a higher incidence of speedy defenders who can’t hit among Hispanics. It turns out there is the same percentage of American “defensive specialists” out there. (I’m not analyzing defense here, just making the assumption that weaker hitters likely stay on MLB rosters by compensating with good defense.) But what about pitchers?

Stat          All   Hispanic   Non-Hispanic
-------------------------------------------
ERA          4.43       4.20           4.50
           (1.34)     (1.51)         (1.28)
BAopp        0.26       0.25           0.27
           (0.03)     (0.04)         (0.03)
FIP          4.61       4.50           4.64
           (0.98)     (1.07)         (0.95)
BB9          3.45       3.49           3.43
           (1.17)     (1.12)         (1.19)
K9           6.74       7.28           6.60
           (1.92)     (2.08)         (1.85)
HR9          1.13       1.12           1.13
           (0.49)     (0.55)         (0.47)
Left%         26%        15%            29%
Relief%       54%        60%            52%

(Min 30 IP; SD in parentheses)

The pitching is a little different. Hispanics do seem to perform slightly better than non-Hispanics, but the difference is small. However, Hispanic pitchers seem to strike out more batters. The biggest difference has to do with handedness and relief duty. The rate of Hispanic lefties is half the rate of non-Hispanics, and Hispanics make a higher percentage of their appearances as relievers. Why the shortage of Hispanic lefties? I certainly did not anticipate this. I wonder if it has something to do with platooning practices in Latin American leagues. Maybe lefties face a lot more righties and therefore don’t have the stats to get noticed by big-league scouts. Maybe the pay for being a LOOGY outside of MLB is so low that young lefties concentrate more on hitting. I’m open to suggestions.

So, back to the original question. What would MLB look like if we arbitrarily removed a quarter of its talent? Obviously, the quality of the game would suffer. MLB could keep the same quality of baseball by contracting 7-8 teams, dilute the quality of play with AAA talent, or do a combination of both. The minors could get ugly at the low levels. Keith Lockhart might still have a job. I suspect fan interest would fall off quite a bit. But that is not exactly what would happen. Athletes from other sports, who previously couldn’t make it in baseball given the talent previously required, would also stock some of these teams. So, the talent levels of the NFL, NBA, NHL, and MLS would also fall.

I think there is an important lesson here. Just as a Hispanic-free MLB would harm baseball fans, a Hispanic-free USA would harm US consumers. Immigration doesn’t just help the immigrants, it helps society as well. Bastiat’s lesson of the seen and the unseen is that it’s easy to see the immigrants working in jobs that might be occupied by true-blue Americans; the positive benefits to society are harder to see, but they not only exist but swamp the costs. We tend to put too much weight on the costs we can see, immigrants replacing American workers (like Keith Lockhart), and not enough on the benefits that are not so obvious: cheaper and better products (baseball) for Americans (fans).