What causes GIDPs?
After the thread about our pitchers' GIDPs, I got to wondering what "caused" GIDPs. With all the talk of Bradford and his crazy ground balling ways, I decided to look at how certain factors affect a team's GIDPs. So, using some of Baseball Prospectus's fantastic tools, I set up a little analysis.
I wanted to look at every team from the past 5 years and run regression analyses to see how much three factors (groundball rate, pitcher strikeout rate, and defensive efficiency) affected a team's likelihood of turning a double play. I collected DP% (the rate at which teams convert double plays across all opportunities), GB%, K%, and def_eff for all the teams. Then I ran the regression. Surprisingly, to me at least, none of the factors held a strong enough relationship to account for differences in teams' DP% (which ranged from the 2006 Nats at 9.7% to the 2005 Cards at 17.5%). So, I took a look at the data.
I noticed something that seemed way off. The GB% year by year averages went 53.73, 45.66, 45.28, 45.12, and 45.33 (for 2004-2008 respectively). I don't know what the hell was going on in 2004, but this number was throwing everything off. After taking out 2004, the analysis was much more in line with what I expected.
GB% and K% had a statistically significant (at the 95% level) relationship with DP%. Def_eff was close to significant, but not quite. The coefficients for GB%, K%, and def_eff were .27, -.25, and .18, respectively. This finding, though, was inconclusive to me, because the model included a variable that did not seem to be statistically significant in this sample of 124 teams. Here is how the relationship between the variables plays out, though, with def_eff in the mix:
DP%= -.072 + .269*GB% + (-.248*K%) + .175*def_eff
This tells us that, in a vacuum, as a team's GB% goes up, K% goes down, or def_eff goes up, we expect DP% to go up.
As I said, I was unhappy with these results, so I ran another regression with only GB% and K% as the variables. This time the relationship between DP% and the two variables was even stronger: the t-score (for stat geeks) increased for both. Our new model for predicting DP% looked like this:
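For anyone who wants to try a regression like this at home, here is a minimal sketch of an ordinary least-squares fit in plain Python. The function name `fit_ols` and the choice of the normal equations are my own assumptions, not the tool actually used for the post, and the eight sample rows are just a handful of 2008 teams copied from the table further down (GB%, SO rate, DP% as fractions). Fitting on so few rows will not reproduce the coefficients quoted here, which came from 124 team seasons.

```python
def fit_ols(rows, targets):
    """Least-squares fit of targets ~ intercept + predictors (normal equations)."""
    # Prepend a constant 1.0 so the first coefficient is the intercept.
    X = [[1.0] + list(r) for r in rows]
    k = len(X[0])
    # Build the augmented system [X'X | X'y].
    A = [[sum(x[i] * x[j] for x in X) for j in range(k)]
         + [sum(x[i] * y for x, y in zip(X, targets))]
         for i in range(k)]
    # Gauss-Jordan elimination with partial pivoting.
    for col in range(k):
        pivot = max(range(col, k), key=lambda r: abs(A[r][col]))
        A[col], A[pivot] = A[pivot], A[col]
        for r in range(k):
            if r != col:
                factor = A[r][col] / A[col][col]
                A[r] = [a - factor * b for a, b in zip(A[r], A[col])]
    return [A[i][k] / A[i][i] for i in range(k)]


# (GB%, SO rate) and DP% for ATL, TOR, SLN, LAA, OAK, LAN, TBA, NYN in 2008.
predictors = [(0.496, 0.1687), (0.472, 0.1960), (0.469, 0.1470), (0.453, 0.1729),
              (0.425, 0.1761), (0.510, 0.1964), (0.415, 0.1867), (0.451, 0.1844)]
dp_rates = [0.122, 0.136, 0.136, 0.141, 0.136, 0.121, 0.135, 0.098]

intercept, gb_coef, k_coef = fit_ols(predictors, dp_rates)
print(round(intercept, 3), round(gb_coef, 3), round(k_coef, 3))
```

Adding a third predictor such as def_eff only means adding a third number to each tuple in `predictors`.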
DP%= .038 + .276*GB% + (-.203*K%)
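As a quick sanity check, the two-variable model can be plugged in directly. This is only a sketch: the function name `predict_dp_rate` is made up for illustration, and the inputs are the 2008 Rays' GB% (41.5%) and SO rate (18.67%) from the team table in this post, written as fractions.

```python
def predict_dp_rate(gb_rate, k_rate):
    """Two-variable model from the post; rates are fractions (0.415 = 41.5%)."""
    return 0.038 + 0.276 * gb_rate - 0.203 * k_rate


rays_2008 = predict_dp_rate(0.415, 0.1867)
print(f"{rays_2008:.2%}")  # prints 11.46%
```

That lands close to, but not exactly on, the 11.55% listed for the Rays in the table. The coefficients quoted here are rounded to three places, which may account for the gap.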
So, what does all this mean? Well, the expectation that groundball pitchers are much more likely to get double plays than any other pitchers (especially strikeout artists) turns out to be true. As it pertains to the Rays, this year is our best year for DP% despite the fact that our projected DP% is lower this year than in any of the last 4. The difference between our actual DP% (13.5%) and expected DP% (11.55%) is sixth out of the 124 teams. This does not surprise me, having seen how effective we've been at turning the double play.
One other thing. Looking at our relievers and their expected GIDP% (w/o accounting for our defense), it's fairly obvious just how effective Bradford is in situations like this.
Per skyking, all the pitchers and how they fared in each aspect.
Data for 2008 teams using RZR
| TEAM | GB% | SO Rate | Fielding | DP% | p1 | Skill 1 | p2 | Skill 2 | p3 | Skill 3 |
| ATL | 49.60% | 16.87% | 40.1 | 12.20% | 14.19% | -1.99% | 14.15% | -1.95% | 14.20% | -2.00% |
| TOR | 47.20% | 19.60% | 37 | 13.60% | 13.04% | 0.56% | 12.93% | 0.67% | 13.01% | 0.59% |
| SLN | 46.90% | 14.70% | 31.4 | 13.60% | 14.00% | -0.40% | 13.84% | -0.24% | 13.96% | -0.36% |
| LAA | 45.30% | 17.29% | 29.9 | 14.10% | 13.00% | 1.10% | 12.88% | 1.22% | 13.01% | 1.09% |
| OAK | 42.50% | 17.61% | 21.1 | 13.60% | 12.34% | 1.26% | 12.04% | 1.56% | 12.22% | 1.38% |
| LAN | 51.00% | 19.64% | 21 | 12.10% | 13.84% | -1.74% | 13.97% | -1.87% | 13.87% | -1.77% |
| PHI | 46.10% | 16.88% | 20.7 | 12.40% | 13.34% | -0.94% | 13.18% | -0.78% | 13.26% | -0.86% |
| HOU | 43.90% | 17.38% | 10.3 | 12.60% | 12.50% | 0.10% | 12.47% | 0.13% | 12.57% | 0.03% |
| COL | 49.00% | 16.23% | 8.7 | 14.20% | 13.93% | 0.27% | 14.11% | 0.09% | 14.06% | 0.14% |
| TBA | 41.50% | 18.67% | 8.6 | 13.50% | 11.85% | 1.65% | 11.55% | 1.95% | 11.70% | 1.80% |
| BOS | 45.30% | 19.03% | 7.2 | 12.20% | 12.59% | -0.39% | 12.52% | -0.32% | 12.55% | -0.35% |
| CHN | 42.60% | 20.47% | 3.6 | 11.00% | 11.70% | -0.70% | 11.49% | -0.49% | 11.56% | -0.56% |
| CHA | 47.20% | 18.84% | 3.5 | 15.00% | 13.07% | 1.93% | 13.09% | 1.91% | 13.04% | 1.96% |
| SEA | 45.20% | 16.07% | 3.4 | 12.50% | 12.96% | -0.46% | 13.09% | -0.59% | 13.14% | -0.64% |
| DET | 44.20% | 14.86% | 1 | 14.20% | 13.24% | 0.96% | 13.06% | 1.14% | 13.15% | 1.05% |
| MIL | 47.70% | 17.80% | -0.1 | 13.00% | 13.54% | -0.54% | 13.43% | -0.43% | 13.37% | -0.37% |
| CIN | 44.70% | 19.28% | -4.6 | 11.10% | 11.87% | -0.77% | 12.31% | -1.21% | 12.30% | -1.20% |
| PIT | 45.60% | 14.65% | -4.9 | 14.40% | 13.33% | 1.07% | 13.49% | 0.91% | 13.51% | 0.89% |
| CLE | 45.80% | 16.07% | -8.8 | 15.30% | 13.25% | 2.05% | 13.26% | 2.04% | 13.24% | 2.06% |
| SDN | 44.70% | 17.84% | -8.9 | 12.90% | 12.56% | 0.34% | 12.60% | 0.30% | 12.59% | 0.31% |
| SFN | 40.70% | 19.67% | -9 | 11.20% | 10.96% | 0.24% | 11.12% | 0.08% | 11.21% | -0.01% |
| WAS | 44.30% | 16.68% | -10.9 | 11.50% | 12.74% | -1.24% | 12.72% | -1.22% | 12.73% | -1.23% |
| BAL | 46.10% | 14.57% | -18.3 | 12.70% | 13.79% | -1.09% | 13.65% | -0.95% | 13.60% | -0.90% |
| ARI | 47.00% | 20.33% | -21.7 | 11.40% | 12.61% | -1.21% | 12.73% | -1.33% | 12.57% | -1.17% |
| FLO | 43.00% | 17.78% | -23.7 | 10.40% | 12.19% | -1.79% | 12.14% | -1.74% | 12.13% | -1.73% |
| NYA | 46.80% | 18.52% | -25.7 | 12.10% | 12.78% | -0.68% | 13.04% | -0.94% | 12.90% | -0.80% |
| KCA | 42.90% | 17.23% | -27.6 | 13.00% | 12.16% | 0.84% | 12.22% | 0.78% | 12.21% | 0.79% |
| TEX | 44.30% | 14.37% | -28.7 | 15.10% | 13.04% | 2.06% | 13.19% | 1.91% | 13.16% | 1.94% |
| NYN | 45.10% | 18.44% | -31 | 9.80% | 12.78% | -2.98% | 12.59% | -2.79% | 12.47% | -2.67% |
| MIN | 43.60% | 15.55% | -37 | 13.30% | 12.78% | 0.52% | 12.76% | 0.54% | 12.70% | 0.60% |
This shows the GB%, SO%, Fielding rating for infielders, the predicted (by the three measures) and actual DP%, and the skill implied (s1, s2, and s3). The new coefficients are .24, -.2, and .0000392. The effect of RZR looks small, but that's because this is the first time it isn't a percentage.
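Each skill column in the table is just the actual DP% minus the matching predicted DP% (for example, ATL: 12.20 - 14.19 = -1.99 for Skill 1). Here is a small sketch of that calculation and the resulting ranking, using the DP% and p2 columns for four teams copied from the table. The variable names are my own.

```python
# (team, actual DP%, predicted DP% "p2") copied from the 2008 table above.
teams = [
    ("CLE", 15.30, 13.26),
    ("TBA", 13.50, 11.55),
    ("ATL", 12.20, 14.15),
    ("NYN", 9.80, 12.59),
]

# Skill = actual minus predicted, in percentage points.
skill = {team: round(actual - predicted, 2) for team, actual, predicted in teams}

# Rank teams from most over-performing to most under-performing.
ranked = sorted(skill, key=skill.get, reverse=True)

for team in ranked:
    print(team, f"{skill[team]:+.2f}")
```

Running the same calculation over every team season is how you would get a ranking like "sixth out of 124".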
Comments
thread rec'd
great stuff.
the way stats providers categorize GB/LD/FB has changed a bit over the years. a good way to account for that is to normalize GB% on a yearly basis — create a new field which is GB% relative to average (try absolute % diff or maybe a ratio).
anyone who does multi-year zone rating analyses will find this issue, too. positional-average zone ratings change a LOT for some positions year after year.
how about posting the starters’ data and comparing the output of your regression equation to actual GIDP% this year?
by Sky Kalkman on
Sep 4, 2008 5:27 PM EDT
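The yearly normalization suggested in the comment above could look something like this. It is only a sketch: `normalize_gb` is a made-up name, and the league averages are the year-by-year GB% averages quoted in the post. The example input is the 2008 Rays' GB% from the team table.

```python
# Average GB% by year, as quoted in the post (2004 is the odd one out).
league_gb = {2004: 53.73, 2005: 45.66, 2006: 45.28, 2007: 45.12, 2008: 45.33}


def normalize_gb(gb_pct, year):
    """Return GB% relative to that year's average: (absolute diff, ratio)."""
    avg = league_gb[year]
    return gb_pct - avg, gb_pct / avg


diff, ratio = normalize_gb(41.5, 2008)
print(f"{diff:+.2f} points, {ratio:.3f} of league average")
```

Either number could replace raw GB% in the regression, which would let 2004 stay in the sample instead of being dropped.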
GB/LD/FB has changed a bit over the years
That’s what I figured. I didn’t think to do the absolute % because the difference between the lowest and the highest from 05-08 was a half a percent.
by rglass44 on
Sep 4, 2008 5:34 PM EDT
I'll take a look at the starters' data
But I’m leaving work soon, so I doubt I’ll have it done today.
by rglass44 on
Sep 4, 2008 5:34 PM EDT
One thing I want to look at is the top and bottom teams
See if there are repeat offenders that seem to completely botch the process.
by rglass44 on
Sep 4, 2008 5:36 PM EDT
edited, but it was being stupid so this got left out
I have a lot of data for this, and plenty more to say. I am tired of rambling though, so I will give you guys the link to my data and turn it over to you for questions, comments, concerns.
by rglass44 on
Sep 4, 2008 6:10 PM EDT
a thought on including DER
DER includes a lot more information than infield defense, most obviously outfield defense. but also home ballpark (foul territory, shorter fences, etc).
I’d suggest using zone rating converted into runs of the four infield positions (or maybe just SS/3B/2B?) instead of DER. i’d use Justin’s averaged STATS and BIS fielding data available at his blog. plug into Excel and make a quick pivot report summing the fielding column by team and excluding the unwanted positions.
by Sky Kalkman on
Sep 4, 2008 8:00 PM EDT
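The pivot report described in the comment above can also be done outside of Excel. Here is a rough sketch. The rows are placeholder numbers with made-up team labels, not real fielding data, and the (team, position, runs) layout is an assumption about how the spreadsheet is organized.

```python
from collections import defaultdict

# Hypothetical rows: (team, position, fielding runs). Placeholder values only.
rows = [
    ("TEAM_A", "SS", 5.0),
    ("TEAM_A", "2B", -2.0),
    ("TEAM_A", "CF", 8.0),
    ("TEAM_B", "3B", 1.5),
    ("TEAM_B", "1B", -4.0),
    ("TEAM_B", "LF", 3.0),
]

INFIELD = {"1B", "2B", "3B", "SS"}


def infield_runs_by_team(rows, positions=INFIELD):
    """Sum the fielding column by team, skipping positions not in `positions`."""
    totals = defaultdict(float)
    for team, position, runs in rows:
        if position in positions:
            totals[team] += runs
    return dict(totals)


print(infield_runs_by_team(rows))
print(infield_runs_by_team(rows, positions={"SS", "3B", "2B"}))
```

Passing a different `positions` set covers the "maybe just SS/3B/2B" variant.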
GIDP% compared to expected GIDP%
possible causes for the difference:
1. (in)ability to induce groundballs in DP situations (clutch)
2. luck
3. fielding abilities outside of what’s measured by zone rating (ability to turn the pivot, especially)
4. inaccuracy of fielding metrics included
#1 is probably small, #2 large, #3 medium, and #4 small
what do you think?
by Sky Kalkman on
Sep 4, 2008 8:05 PM EDT
#2 is probably key
I wish I could get this data for Tinkers-Evers-Chance and see what that looks like.
by rglass44 on
Sep 5, 2008 9:20 AM EDT
What are birds?
We just don’t know
by Top Gun Numba 1 on
Sep 4, 2008 10:53 PM EDT
wow, love that you went through all that.
no time now but i’ll look closer later. thanks!
by Sky Kalkman on
Sep 5, 2008 3:04 PM EDT
random comment...
i didn’t realize James Shields’ GB% was so high — 46.7%. i thought he was an extreme flyball guy. oops.
by Sky Kalkman on
Sep 5, 2008 5:34 PM EDT
also, can you explain in more detail what p1 and Skill1 are? (and 2, 3)
by Sky Kalkman on
Sep 5, 2008 5:34 PM EDT
been out of town watchin my Deacs take down the Rebels of Ole Miss
p1 and s1 are the predicted outcome and skill (actual minus predicted) taking into account the first three variables: GB, K, and DER. p2 and s2 are the same values, but they don’t take DER into account because it wasn’t statistically significant at the 95% level.
by rglass44 on
Sep 8, 2008 12:16 AM EDT