The Creation of ePPA: Estimating P/PA Given Count Information

With the wealth of information available at the MLB level, P/PA (pitches per plate appearance) data can be easily queried through your favorite database. (I currently use a combination of MySQL, MATLAB, and R for all of our analytics here at Diamond Charts.) At the collegiate level, however, P/PA must be derived from the final count of each plate appearance. That sounds simple, but a standard play-by-play doesn’t record two-strike foul balls, which add pitches without changing the count. To provide a slightly more accurate P/PA estimate, we’ve created ePPA (estimated P/PA), which accounts for those two-strike fouls. Methodology is below.

Assumption: NCAA hitters foul off 2-strike pitches at the same rate as MLB hitters. (Later we can turn this assumption into a hypothesis and test its validity.)

Using MLB PBP data from 2002-2012, we find 1.003M PAs with two strikes prior to the action pitch. Across these PAs the actual P/PA is 5.11, while the count-based P/PA (simply, balls + strikes + 1) is 4.62. The plot below displays the distribution of total strikes across all 1.003M 2-strike MLB PAs over the past decade. As we can see, approximately 70% of PAs with 2 strikes have exactly 2 total strikes (that is, 0 fouls on 2-strike counts). However, an adjustment clearly remains necessary for the other 30%. In case you’re curious, the highest number of total strikes seen in one PA in the past decade occurred in 2004 in Los Angeles, when Alex Cora of the Dodgers homered off Matt Clement of the Chicago Cubs in an 18-pitch at-bat.

[Figure: 2strikes_hist, histogram of total strikes per PA for 2-strike MLB PAs, 2002-2012]

One step further:
To bring ePPA closer to the actual P/PA, we have broken these numbers down by the four possible 2-strike counts (0-2, 1-2, 2-2, and 3-2). The table below displays the actual P/PA alongside the count-based P/PA (cPPA = balls + strikes + 1) and the difference between them.

balls_ct   P/PA    cPPA   diff
0          3.243   3      0.243
1          4.370   4      0.370
2          5.533   5      0.533
3          6.747   6      0.747

The table makes sense: the more balls (and therefore pitches) a hitter sees, the more opportunities he has to foul off a 2-strike pitch. Thus, for every 2-strike PA, we add the differential factor that matches the number of balls in the count, giving ePPA = cPPA + diff. PAs that end before two strikes need no adjustment, so their ePPA is simply cPPA. Because P/PA is a cumulative statistic (one that should be analyzed over a large data set), these simple approximations will suffice.
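As a rough illustration of the adjustment, here is a minimal Python sketch. The function and variable names (cppa, eppa, hitter_eppa, TWO_STRIKE_FOUL_ADJ) and the four sample PAs are hypothetical and exist only for this example; the adjustment values are the diff column from the table above. It is one possible way to apply the formula, not our production code.

```python
# Average number of 2-strike fouls per PA, keyed by balls in the final count.
# Values are the "diff" column from the table above (MLB 2002-2012).
TWO_STRIKE_FOUL_ADJ = {0: 0.243, 1: 0.370, 2: 0.533, 3: 0.747}


def cppa(balls, strikes):
    """Count-based pitches for one PA: balls + strikes + 1 (the action pitch)."""
    return balls + strikes + 1


def eppa(balls, strikes):
    """Estimated pitches for one PA; only 2-strike counts get the adjustment."""
    estimate = cppa(balls, strikes)
    if strikes == 2:
        estimate += TWO_STRIKE_FOUL_ADJ[balls]
    return estimate


def hitter_eppa(final_counts):
    """Mean ePPA over a hitter's PAs.

    final_counts holds one (balls, strikes) pair per PA, taken from the
    count just before the action pitch.
    """
    return sum(eppa(b, s) for b, s in final_counts) / len(final_counts)


# Hypothetical sample of four PAs ending on 0-0, 1-2, 3-2, and 2-1 counts.
sample = [(0, 0), (1, 2), (3, 2), (2, 1)]
print(round(hitter_eppa(sample), 3))
```

Keeping the adjustments in a lookup keyed by balls means the table can be swapped for NCAA-specific values later if the assumption above is ever tested and revised.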

Effect:
2-strike counts make up approximately half of all plate appearances, and those PAs average about 0.49 uncounted fouls each (5.11 actual vs. 4.62 count-based). The overall adjustment is therefore approximately 0.24 P/PA for each hitter (a bit more for patient hitters, a bit less for aggressive ones). Put differently, on the order of 10 foul balls per game (per team) occur with two strikes.
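The arithmetic behind these figures can be checked with a short Python sketch. The P/PA values and the one-half share come from this post; the plate appearances per team per game is an assumed round number used only for illustration, and all names are hypothetical.

```python
ACTUAL_PPA_2K = 5.11    # actual P/PA in 2-strike PAs (from the post)
COUNT_PPA_2K = 4.62     # count-based P/PA in the same PAs (from the post)
SHARE_2K = 0.5          # "approximately half" of all PAs reach two strikes
PA_PER_TEAM_GAME = 38   # ASSUMPTION: illustrative value, not from the post

# Average uncounted 2-strike fouls per 2-strike PA.
fouls_per_2k_pa = ACTUAL_PPA_2K - COUNT_PPA_2K

# Spread over all PAs, including those that never reach two strikes.
added_ppa = SHARE_2K * fouls_per_2k_pa

# Rough number of 2-strike fouls per team per game under the assumption above.
fouls_per_team_game = added_ppa * PA_PER_TEAM_GAME

print(f"added P/PA: {added_ppa:.3f}")
print(f"2-strike fouls per team per game: {fouls_per_team_game:.1f}")
```

With a different assumed number of plate appearances per game the last figure shifts slightly, but it stays in the neighborhood of 10.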
