Reviewing PrOPS

About a year ago, I published an article on my PrOPS system in The Hardball Times Baseball Annual 2006. While the article did get some positive media attention (here, here), I occasionally run across skeptical comments from baseball fans. Some people feel the system hasn’t been tested, but that’s incorrect. The PrOPS formula is derived from observed correlations in past data, and in the article I reported how well the formula captures over/under performance.

There is a highly statistically significant relationship…between a player’s over/under performance and his decline/improvement. And the greater the deviation between PrOPS and OPS, the larger the reversion in the following season. For every 0.01 increase/decrease in a player’s over/under performance, his OPS is likely to fall/rise by 0.008 the following season. For example, a player with an OPS 10 “points” above his PrOPS can expect his OPS to fall by eight points in the following season. That is quite a reversion.
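
To make that arithmetic concrete, here is a minimal Python sketch of the rule of thumb quoted above. The function name and the sample numbers are purely illustrative; the 0.8 reversion factor is the figure from the article, not the full PrOPS formula.

```python
def expected_ops_change(ops, props, reversion=0.8):
    """Rule of thumb from the article: a hitter whose OPS ran above
    (below) his PrOPS tends to give back (recover) roughly 80% of
    that gap the following season."""
    gap = ops - props           # over/under performance this season
    return -reversion * gap     # expected change in next season's OPS

# The example from the quote: an OPS 10 "points" (.010) above PrOPS
print(expected_ops_change(0.850, 0.840))   # about -0.008
```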

I also generated lists of the top-25 over and under performing seasons from 2002-2004. And what happened to them?

Of the top 25 over performers, 20 players had lower OPS in the following season.

Of the top 25 under performers, 21 improved their OPS in the following season.

The article also lists the top-25 over and under performers for 2005. What happened to those players in 2006?

Of the over performers, 12 players declined, 7 improved, and 6 did not deviate more than 20 OPS-points from the previous season. Of the under performers, 11 players improved, 7 declined, 3 had no change, and 5 didn’t garner serious playing time. It’s not an air-tight projection system, but there seems to be some information there.

PrOPS is not a stand-alone projection tool. You should not look only at a player’s PrOPS and assume it’s exactly what the player should be doing. When I look at it, I also consider the player’s recent hitting history, injuries, aging, and all that other stuff we sometimes use to evaluate hitters. But when I see a player have a career year, and his PrOPS don’t show it, I start to get suspicious.

If you’re curious about the over/under performers of 2006, see The Hardball Times.

NL over performers
NL under performers
AL over performers
AL under performers

16 Responses to “Reviewing PrOPS”

  1. tangotiger says:

    We talked about this at the time. Are you still doing GB to FB *ratios*, or GB rates? Rates are symmetrical: whether you use GB per (GB+FB) or FB per (GB+FB), you end up with the same result. Ratios don’t do that.

  2. Tom G says:

    I’ve always been surprised PrOPS doesn’t get more attention. Besides the uses you mentioned, I also use it to help me judge performances from small sample sizes.

  3. Kostya says:

    Is there any value that can be added to PrOPS by using speed scores of some sort? Or is this already taken into account in the underlying stats which PrOPS uses?

    In any case, if you want to dispel comments about the “untestedness” of PrOPS, a simple correlation study would do the trick reasonably well. Show the r^2 from year to year for OPS, PrOPS, as well as a 3-2-1 weighted average of OPS and PrOPS. If PrOPS beats OPS in both measures, then I think by and large people will be much more confident in it.
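
For what it’s worth, a correlation study along those lines can be sketched in a few lines. The sketch below is only a rough outline, assuming a table of player-seasons with hypothetical columns player, year, OPS, and PrOPS; it correlates each stat with the following season’s OPS. A weighted average could be tested the same way by adding it as another column.

```python
import pandas as pd

def predictive_r2(seasons, predictor, target="OPS"):
    """r^2 between this season's predictor (OPS, PrOPS, ...) and next
    season's OPS, over all player-seasons with a following season."""
    nxt = seasons[["player", "year", target]].copy()
    nxt["year"] -= 1                 # line season t+1 up with season t
    pairs = seasons.merge(nxt, on=["player", "year"], suffixes=("", "_next"))
    return pairs[predictor].corr(pairs[target + "_next"]) ** 2

# e.g. compare the two predictors on the same sample:
# print(predictive_r2(seasons, "OPS"), predictive_r2(seasons, "PrOPS"))
```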

  4. JC says:

    I experimented with it several ways, but the ratios fitted the best.

  5. JC says:

    From the THT Annual 2006:

    OPS explained approximately 43% of the variance in OPS in the following year, while PrOPS explained about 46%.

  6. tangotiger says:

    That’s an r of .68, and in line with the rest of the forecasting systems:
    http://lanaheimangelfan.blogspot.com/2006/12/more-projection-stuff.html

    The limit of a forecasting system is described here:
    http://www.insidethebook.com/ee/index.php/site/comments/forecasters_how_accurate_can_they_possibly_be/

    ***

    As for using the ratio, then how to justify using GB/FB instead of the reverse? What you are saying, by using GB/FB, is that a higher GB matters more than a lower FB. That is, let’s say GB/FB has a mean of 1.00. A GB/FB of 2.0 is the same as a FB/GB of 0.50. But using GB/FB as the ratio has double the impact of FB/GB, even though they are describing the exact same thing.

    Just because something best-fits better on the sample doesn’t mean that it’s the right thing. A best-fit analysis would give the run value of a double as .66 and the single as .52 (instead of the more true .77 and .47).
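
To make the asymmetry concrete: two hitters who are mirror images of one another land different distances from a GB/FB of 1.00 but equal distances from a 50% ground-ball rate. A tiny sketch with made-up batted-ball counts:

```python
# Two hypothetical hitters who are mirror images of each other.
for gb, fb in [(40, 20), (20, 40)]:
    print(
        f"GB/FB = {gb / fb:.2f}",           # 2.00 and 0.50: +1.00 vs -0.50 around 1.00
        f"FB/GB = {fb / gb:.2f}",
        f"GB rate = {gb / (gb + fb):.2f}",  # 0.67 and 0.33: symmetric around 0.50
    )
```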

  7. JC says:

    As for using the ratio, then how to justify using GB/FB instead of the reverse?

    There is nothing wrong with using the reverse. The coefficient will change accordingly.

    Just because something best-fits better on the sample doesn’t mean that it’s the right thing. A best-fit analysis would give the run value of a double as .66 and the single as .52 (instead of the more true .77 and .47).

    That should read a mis-specified best-fit analysis. I don’t understand how pointing out that it’s possible to mis-specify a regression sheds any light on the situation. I’m certainly aware of this, and I always bend over backwards in an effort to avoid such problems.

    In any event, any mis-specification in this model, and the corresponding omitted variable bias that would result, would apply whether I include GB and FB as a ratio or include only one of them.

    All else being equal, I’ll pick the model with the better fit.

  8. tangotiger says:

    Ah, but the coefficient will not change accordingly. What will happen is this: now the guys with the highest FB/GB ratio will move *more* than the high GB/FB ratio players.

    Think of it in an extreme situation: you have a guy with 100 GB and 1 FB. In your current PrOPS, this guy has a 100.00 value, which you multiply by some coefficient, say “.002”. So he moves up +.20 points. If, on the other hand, you used the FB/GB ratio, your coefficient may be “-.002”, which, multiplied by 1/100 (or .01), is essentially zero.

    From where I sit, using GB/FB taints your process: a higher GB has more impact than a higher FB.

    If you create a FB/GB version of PrOPS, show your results both ways (old PrOPS, new PrOPS) for Frank Thomas and Derek Jeter, and you will see the impact of this bias.

  9. tangotiger says:

    I just ran three different regressions, using the GB/FB ratio, the FB/GB ratio, and GB/(GB+FB), or GB rate. This was run against GPA on the THT site. (The use of GPA, or OPS, etc., doesn’t really matter.) I used 2004-2006 data for all players with at least 502 PA.

    The 2006 Frank Thomas is the most extreme, with a FB/GB of 2.44. His resulting regression yielded results of: .287, .313, .298.

    At the other end is the 2004 Ichiro, with a GB/FB of 3.55. His results are: .247, .261, .255.

    The sample standard deviations are: .0057, .0072, .0071

    In all cases, the mean was .276.

    GPA is analogous to batting average. Those are some HUGE differences, don’t you think?

  10. tangotiger says:

    The correlation coefficients (r) were .21, .26, .26. And, it should go without saying, using FB/(GB+FB) produced the exact same estimated GPA for each player as the GB rate, as well as the exact same r.
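
For readers who want to reproduce this kind of side-by-side comparison, a rough sketch follows. It uses synthetic data rather than the actual THT batted-ball data, and the fake GPA is built from the ground-ball rate, so the rate specification will fit best here by construction; the point is only to show the mechanics of fitting the three specifications.

```python
import numpy as np

rng = np.random.default_rng(0)

# Synthetic player-seasons: ground balls, fly balls, and a fake GPA
# that (by construction) depends on the ground-ball rate.
gb = rng.integers(100, 300, size=200).astype(float)
fb = rng.integers(100, 300, size=200).astype(float)
gpa = 0.276 + 0.05 * (gb / (gb + fb) - 0.5) + rng.normal(0, 0.02, size=200)

for name, x in [("GB/FB", gb / fb), ("FB/GB", fb / gb), ("GB rate", gb / (gb + fb))]:
    slope, intercept = np.polyfit(x, gpa, 1)   # simple least-squares line
    predicted = intercept + slope * x
    r = np.corrcoef(x, gpa)[0, 1]
    print(f"{name:8s}  r = {r:+.3f}  sd of predicted GPA = {predicted.std():.4f}")
```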

  11. JC says:

    Ah, but the coefficient will not change accordingly. What will happen is this: now the guys with the highest FB/GB ratio will move *more* than the high GB/FB ratio players.

    This is incorrect. The coefficients (including the constant) will adjust in magnitude, not just in sign. That’s how multiple regression analysis works.

    EDIT: Text Removed
    [Part of a comment was marked as spam, so I was commenting on something that was previously confusing.]

    In any event, I don’t mean to be rude, but I’m tired of discussing this with you. There is an opportunity cost to refining things, and I’m done with PrOPS until the data improves significantly. If you don’t see any value to it, don’t use it. I’m comfortable with it. I welcome you, or anyone else, to generate your own version/improvement. I have no doubt that the system could be improved, and I would be happy to see it done rather than quibble over minor details of the model that are irrelevant.

  12. J. Cross says:

    JCB,

    I may have posted about this before, but looking at the top PrOPS-OPS guys I wonder if the shift is playing a role and whether left-handed sluggers tend to under perform their PrOPS. Is this something you’ve considered?

    Jared

  13. J. Cross says:

    Oh, the other thing I’d be interested in is whether there’s any year-to-year correlation in PrOPS – OPS.

  14. JC says:

    Didn’t control for handedness or shifts, so it could be an issue.

    PrOPS correlate more highly from year to year than OPS, about .1 more in terms of R2.

  15. tangotiger says:

    My post #9 clearly shows that it makes a huge difference for the extreme GB and FB hitters whether you use GB/FB or FB/GB. However, there is no change whatsoever if you use GB/(GB+FB) or FB/(GB+FB).

    There is no justification for using one ratio (GB/FB) over the other (FB/GB), even though they absolutely give you different results. In fact, you are not even justifying it, just deciding to use it.

    There is zero opportunity cost to changing from ratios to rates, since I was able to generate 3 different regression equations in 5 minutes.

    Your thread started with a comment about fans being skeptical, and here I am, giving you a thoughtful and legitimate beef, and you are dismissing it with “if you don’t like it, don’t use it”.

  16. JC says:

    No, post number 9 shows that if you take a completely bastardized version of my model (GPA = f(GB, FB)) and look at the extremes, the predicted outcome fluctuates for two players. And your mini-model is likely suffering from massive omitted variable bias (something you were worried about on my behalf earlier). Your fluctuations could show something important, OR they could reflect the fact that when you don’t include line-drive rates, walk rates, etc., the model is extremely sensitive to the change in specification.

    You are right that there is zero opportunity cost to using another method if it is superior. I spent numerous hours judging specifications based on many factors. Usually, I do not like to use ratios as variables, but I found that when I did, the model predicted better and helped me avoid some collinearity problems. I didn’t make the choice to piss you off, nor do I continue to do so for this reason. I haven’t merely dismissed your argument. I responded, and when you kept pressing an incorrect point, I decided it wasn’t worth continuing the discussion.

    I make mistakes all the time, and I am always willing to admit them and change my mind. After all, it was you who pointed out to me a few weeks ago that a 10% growth in salary was in fact sustainable, and I changed my mind to agree with you. You were right, and I would be stupid to disagree with you on that. The reason I’m not bending here is that you’re not telling me anything useful.

    We can argue all day about what is the proper specification to use. It’s very easy to be the guy in the audience who says “did you control for X or Y?” I could do an infinite number of things to this model. I waded through all of the possible models, weighed the costs and benefits of different specifications, and ultimately chose the one that I felt to be the best. I took the time to develop this and publish this with my name and reputation on the line. I don’t have the time to go back and show what happens every time someone thinks one little thing could make a big difference. No one does.

    This is why it’s necessary for someone to take the next step and make PrOPS obsolete. There’s not much to be gained by obsessing over the minutiae of potential imperfections in the model. Quit bitching about LPs and invent the CD.

    All I’m saying is “here’s PrOPS, here’s how it predicts, let’s see someone do better.” You’ve got the data, you’ve got your own idea as to what ought to be done. Why not give it a shot? Generate TtOPS. If it predicts better than PrOPS, I’ll use it with glee. That’s how progress happens.