Extra Base Hits on Balls in Play and Pitcher Skill

A few weeks ago, John Dewan focused on a stat that I hadn’t thought about in a while as his Stat of the Week: Pitcher OPS Allowed (OPSA).

For hitters, for years and years, it was batting average that was thought to be the best single statistic to look at to evaluate a hitter. In the last couple of decades, the weaknesses of batting average have been exposed and the value of getting on base and hitting for power have become better recognized. The stat that is becoming the new standard for hitters is OPS—On-base percentage Plus Slugging percentage.

For pitchers, the standard is ERA. Compared to batting average, it provides a much better representation of effectiveness. It measures the most important quality of a pitcher’s job, preventing runs. However, it too has its flaws. The biggest flaw is that a pitcher’s ERA can be greatly affected by the pitchers that immediately follow him in a game, both positively and negatively.

Enter Opponent OPS. This is a stat that you hardly ever see. It makes just as much sense to look at Opponent OPS for pitchers as it does to look at a hitter’s own OPS. We just recently added this as a leaderboard titled “Opponent OPS” to Bill James Online and I wanted to share it with you.

ERA is going to continue to be the standard, and I will personally look at ERA for every pitcher, but I think Opponent OPS may be a better indicator of a pitcher’s overall effectiveness.

I used OPS Allowed to proxy pitcher quality in a study of hit batters with Doug Drinen. I’m not sure why we settled on the metric, but I haven’t used it in some time. There are two reasons for this. OPSA is heavily influenced by balls in play, and it is difficult to compute with available data. You basically have to reconstruct it from play-by-play data. But, OPSA holds some potentially useful information not contained in traditional DIPS metrics. So, when I read Dewan’s post I decided to look into the stat a little deeper.

This summer I’ve made a commitment to become more familiar with Perl so that I can better dig through the play-by-play data that is becoming increasingly available. So, I viewed this as an opportunity. Armed with Learning Perl and Doug’s old Perl scripts, I was able to gather some very specific pitcher-allowed stats from Retrosheet event files from 2000–2007.

First, I want to look at the correlation of pitcher performance from year to year. Here are the correlations for pitchers who pitched more than 100 innings in back-to-back seasons.

Metric    R²
AVGA      0.208
OBPA      0.206
SLGA      0.194
OPSA      0.179
ERA       0.120

Pitchers’ performance in OPSA is more consistent from year to year than in ERA. This indicates that OPSA captures more skill than ERA, as skill ought to be repeatable from season to season. Performances in the other metrics are also more consistent, but not by much. Even compared to ERA, OPSA’s correlation isn’t that much stronger. About 18% of a pitcher’s OPSA is explainable by his previous season’s OPSA, while 12% of a pitcher’s ERA is explainable by his previous season’s ERA.
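For readers who want to reproduce this kind of check, here is a minimal sketch of a year-to-year R² calculation. It assumes the R² reported above is the squared Pearson correlation between a pitcher's value in one season and his value in the next, restricted to pitchers over the innings threshold in both seasons. The function names and the layout of the `seasons` dictionary are hypothetical, and the sample numbers are made up.

```python
from math import sqrt

def pearson_r(xs, ys):
    """Pearson correlation coefficient between two equal-length lists."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / sqrt(var_x * var_y)

def year_to_year_r2(seasons, min_ip=100.0):
    """seasons maps (pitcher_id, year) -> (innings_pitched, metric_value).

    Pairs each season with the same pitcher's following season, keeps the
    pair only if both seasons exceed min_ip innings, and returns r squared.
    """
    prev, curr = [], []
    for (pid, year), (ip, value) in seasons.items():
        following = seasons.get((pid, year + 1))
        if following is None:
            continue
        next_ip, next_value = following
        if ip > min_ip and next_ip > min_ip:
            prev.append(value)
            curr.append(next_value)
    return pearson_r(prev, curr) ** 2

# Made-up OPSA values, purely to show the call.
sample = {
    ("a", 2000): (150.0, 0.700), ("a", 2001): (160.0, 0.710),
    ("b", 2000): (120.0, 0.800), ("b", 2001): (130.0, 0.810),
    ("c", 2000): (140.0, 0.750), ("c", 2001): (135.0, 0.760),
    ("d", 2000): (110.0, 0.650), ("d", 2001): (90.0, 0.900),  # dropped: under 100 IP
}
sample_r2 = year_to_year_r2(sample)
```

With real data, each metric in the table would be run through the same function separately.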

That OPSA is more highly correlated from season to season than ERA is not surprising. ERA suffers from two deficiencies. First, scoring rules assign individual pitchers culpability for runs that were jointly allowed. Second, ERA includes performance on balls in play, which is heavily polluted by luck and defense. OPSA also includes balls in play, but it counts walks and weights home runs more than other hits, and both of those are defense-independent outcomes.

Next, I break down pitcher performance into components: strikeout rate (per batter-faced), walk rate, home run rate, batting average on balls in play, and doubles-and-triples-allowed average on balls in play (XBABIP).

Metric    R²
K         0.603
BB        0.492
HR        0.153
XBABIP    0.042
BABIP     0.038

Pitchers do appear to have more control over the type of hits allowed on balls in play than they do over hits in general; however, the difference is small. Furthermore, pitchers have far more control over defense-independent metrics than balls in play.
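The component rates themselves can be sketched as follows. Only the strikeout denominator (batters faced) is stated above. Using batters faced for walks and home runs, and the conventional balls-in-play denominator (at-bats minus strikeouts minus home runs plus sacrifice flies) for BABIP and XBABIP, are my assumptions, and the function and argument names are hypothetical.

```python
def component_rates(bf, so, bb, hr, ab, sf, hits, doubles, triples):
    """Return per-pitcher component rates from season totals allowed.

    Assumes K, BB and HR are per batter faced, and that balls in play are
    counted as AB - SO - HR + SF (a common convention, not confirmed here).
    """
    balls_in_play = ab - so - hr + sf
    return {
        "K": so / bf,
        "BB": bb / bf,
        "HR": hr / bf,
        # Hits on balls in play exclude home runs.
        "BABIP": (hits - hr) / balls_in_play,
        # Doubles and triples only; home runs are not balls in play.
        "XBABIP": (doubles + triples) / balls_in_play,
    }

# Made-up season line, purely to show the call.
rates = component_rates(bf=800, so=160, bb=60, hr=20, ab=720, sf=5,
                        hits=180, doubles=35, triples=4)
```

Because XBABIP shares BABIP's denominator, it is always the smaller of the two: it is the portion of BABIP that comes from doubles and triples.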

But, even if pitcher control in this area is small, is there any additional information to be gained by knowing a pitcher’s XBABIP? The next table reports the marginal impact of the previous variables on predicting future ERA.

Variable (one-year lag)   OPSA          DIPS Only     DIPS & OPSA   DIPS & XBABIP
K                                       -7.02521      -6.68551      -7.03101
                                        [11.27]**     [8.81]**      [11.31]**
BB                                      3.43275       3.05492       3.27137
                                        [3.27]**      [2.65]**      [3.12]**
HR                                      12.05949      9.88836       11.4197
                                        [4.27]**      [2.50]*       [4.04]**
OPSA                      3.09165                     0.41422
                          [9.67]**                    [0.78]
XBABIP                                                              3.99536
                                                                    [2.48]*
Constant                  2.05883       4.93117       4.65531       4.66884
                          [8.50]**      [29.48]**     [11.95]**     [23.63]**
Observations              932           932           932           932
Adjusted R-squared        0.09          0.16          0.16          0.16
Absolute value of t statistics in brackets
* significant at 5%; ** significant at 1%

First, look at the adjusted R² of each model, which corrects raw R² for the bias induced by adding variables, and how it changes across the models’ estimates of ERA. Neither the addition of OPSA nor XBABIP adds much explanatory power over the DIPS-only model. BABIP is excluded because it is never statistically significant.
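For reference, the standard adjustment is 1 - (1 - R²)(n - 1)/(n - k - 1), where n is the number of observations and k the number of predictors. A minimal sketch, with a hypothetical function name and an illustrative raw R² that does not come from the table:

```python
def adjusted_r2(r2, n_obs, n_predictors):
    """Adjusted R-squared: penalizes raw R-squared for each added predictor."""
    return 1.0 - (1.0 - r2) * (n_obs - 1) / (n_obs - n_predictors - 1)

# Illustrative only: 0.16 is a made-up raw R-squared, not a figure from the table.
# 932 observations and 4 lagged predictors (K, BB, HR, XBABIP) match the last model.
example = adjusted_r2(0.16, 932, 4)
```

With over 900 observations and only a handful of predictors, the penalty is small (well under half a percentage point in this example), so the flat adjusted R² across the DIPS models reflects a lack of added explanatory power rather than a heavy correction.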

Next, I use the same independent variables from the previous season to estimate OPSA in the present season.

Variable (one-year lag)   OPSA          DIPS Only     DIPS & OPSA   DIPS & XBABIP
K                                       -0.74367      -0.59477      -0.73719
                                        [14.87]**     [9.79]**      [14.94]**
BB                                      0.22356       0.05451       0.18811
                                        [2.65]**      [0.59]        [2.25]*
HR                                      0.54351       -0.28027      0.46318
                                        [2.39]*       [0.94]        [2.06]*
OPSA                      0.3062                      0.16671
                          [12.12]**                   [4.24]**
XBABIP                                                              0.65966
                                                                    [5.24]**
Constant                  0.52537       0.84578       0.73255       0.80116
                          [27.40]**     [62.61]**     [24.51]**     [50.67]**
Observations              934           934           934           934
Adjusted R-squared        0.14          0.21          0.23          0.23
Absolute value of t statistics in brackets
* significant at 5%; ** significant at 1%

In terms of predicting the future OPSA of a pitcher, knowing his OPSA or his past propensity for allowing extra-base hits on balls in play (recall that this excludes home runs) improves the explanatory power of the model. This is evidence that pitchers do have some ability to prevent extra-base hits on balls in play.
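The post does not say which software produced these estimates, so the following is only a sketch of how a lagged regression of this shape could be fit: ordinary least squares via the normal equations, with each row holding a pitcher's previous-season rates and each target his current-season ERA or OPSA. The function name is hypothetical, the data are made up, and the t statistics from the tables are not reproduced.

```python
def ols(rows, targets):
    """Least-squares coefficients [intercept, b1, ..., bk] via normal equations."""
    X = [[1.0] + list(r) for r in rows]  # prepend the constant term
    k = len(X[0])
    # Augmented matrix [X'X | X'y].
    A = [
        [sum(x[i] * x[j] for x in X) for j in range(k)]
        + [sum(x[i] * y for x, y in zip(X, targets))]
        for i in range(k)
    ]
    # Gauss-Jordan elimination with partial pivoting.
    for col in range(k):
        pivot = max(range(col, k), key=lambda r: abs(A[r][col]))
        A[col], A[pivot] = A[pivot], A[col]
        pivot_value = A[col][col]
        A[col] = [v / pivot_value for v in A[col]]
        for r in range(k):
            if r != col:
                factor = A[r][col]
                A[r] = [a - factor * b for a, b in zip(A[r], A[col])]
    return [A[i][k] for i in range(k)]

# Made-up previous-season (K rate, BB rate) pairs, one row per pitcher.
lagged = [(0.20, 0.08), (0.15, 0.10), (0.25, 0.07), (0.18, 0.06), (0.22, 0.09)]
# Made-up current-season targets generated from a known linear rule,
# so the fit should recover 4.5, -7.0 and 3.0.
current = [4.5 - 7.0 * k_rate + 3.0 * bb_rate for k_rate, bb_rate in lagged]
coefs = ols(lagged, current)
```

Adding OPSA or XBABIP as a further lagged column gives the other model specifications, and comparing adjusted R² across them is what the tables summarize.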

Finally, let’s look at the magnitude of the impacts. The following table lists the changes in ERA and OPSA from a one-standard-deviation change in each variable, based on the coefficients in the DIPS & XBABIP models.

Variable    ERA         OPSA
K           -0.32087    -0.03364
BB          0.07658     0.00440
HR          0.09103     0.00369
XBABIP      0.05431     0.00897

In terms of predicting ERA, all of the DIPS metrics have a larger impact than XBABIP; however, for OPSA, a one standard deviation change in XBABIP has a larger impact than a pitcher’s walk and homer rates.
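As a rough check on that table, each entry should equal a coefficient from the DIPS & XBABIP models multiplied by the variable's standard deviation. The standard deviations are not reported here, so the sketch below backs them out by dividing each impact by its coefficient. The variable names are mine, and the implied values are only as precise as the rounded figures allow.

```python
# Coefficients from the DIPS & XBABIP columns of the two regression tables.
era_coef = {"K": -7.03101, "BB": 3.27137, "HR": 11.4197, "XBABIP": 3.99536}
opsa_coef = {"K": -0.73719, "BB": 0.18811, "HR": 0.46318, "XBABIP": 0.65966}

# Reported impacts of a one-standard-deviation change.
era_impact = {"K": -0.32087, "BB": 0.07658, "HR": 0.09103, "XBABIP": 0.05431}
opsa_impact = {"K": -0.03364, "BB": 0.00440, "HR": 0.00369, "XBABIP": 0.00897}

def implied_sd(impact, coef):
    """Back out each variable's standard deviation as impact / coefficient."""
    return {name: impact[name] / coef[name] for name in coef}

sd_from_era = implied_sd(era_impact, era_coef)
sd_from_opsa = implied_sd(opsa_impact, opsa_coef)

for name in era_coef:
    print(f"{name:7s} {sd_from_era[name]:.4f} {sd_from_opsa[name]:.4f}")
```

The two columns imply nearly identical standard deviations: roughly 0.046 for strikeout rate, 0.023 for walk rate, 0.008 for home run rate, and 0.014 for XBABIP. XBABIP's OPSA impact is larger than that of the walk and homer rates because its OPSA coefficient is comparatively large, not because it varies more across pitchers.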

What does this tell us? Again, if you want to know something about a pitcher’s skill at preventing runs, you can learn a lot from his defense-independent performance. These metrics will tell you more than his ERA or OPSA alone. However, unlike BABIP, knowing how a pitcher prevents different types of hits does add some useful information about his skill. If you happen to have XBABIP handy, feel free to use it to evaluate a pitcher’s talent. But if you don’t have it, you don’t lose much by ignoring it.

PS — Sorry about the spacing issues. I used HTML tables, which causes WordPress to insert extra spaces. I usually use pre tags, but I have not been able to get them to work since upgrading. I don’t think it’s necessarily a WordPress problem; instead, it’s probably something that I am doing that WordPress is interpreting differently than I intend. If you have any suggestions, please pass them along to me.

Problem fixed. I had a plug-in turned on that didn’t play well with others.

2 Responses to “Extra Base Hits on Balls in Play and Pitcher Skill”

  1. Josh says:

    I don’t really have anything to add or a specific question; I just wanted to say I found this interesting. I didn’t see a comment, so I thought I’d say I appreciated it, since I’m sure it took a little time. Keep up the good work JC, thanks.

  2. You can fix the tables by deleting the carriage returns after each row (in the HTML). You will notice that the number of extra spaces corresponds to the number of rows in the table. Blogspot does this too.