Pitcher Performance with Normalized Offense
For anyone who missed the fanpost set up by elijiahdukes (LINK), Sky, Ryan, and I were having fun with numbers. This got me thinking about the probability of the Rays scoring a certain number of runs. If we know these probabilities, we can estimate how many runs we can allow and still get the win.
For those still with me, one of the reasons I like playing with stats is that I am intrigued by quantifying different types of pitchers. Take the Shields v. Kazmir debate. I love this one a) because I am a huuuge Scott Kazmir fan, and b) because they are assumed to be polar opposites. Most people look at Scotty Dangerous as the prototypical "sky is the limit" type of talent, thanks to his good fastball, filthy slider, and developing change, and it doesn't hurt that he is a lefty. James Shields is, in a lot of ways, the opposite: never hyped, but always puts up consistently good numbers. He's not flashy, with his league-average fastball, and the guy uses changeups to strike people out. Can you believe that?
Back on subject: earlier today I put together Chart A, which you can find below. Basically it is a frequency chart showing how many games we scored each total from 0 to 15 runs. The events column is 162 minus the times column, which shows the inverse: how many times we scored more than that number of runs. Pe expresses this as a percentage, the probability that we would score more than that many runs, and R/IP breaks the run total down to a per-inning basis. Note that I did not include extra innings in my equations. Ultimately, this tells us, using 2008 data, the probability of scoring a certain number of runs per game.
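For anyone who would rather script the chart than build it by hand, here is a minimal Python sketch of the columns described above. It assumes the times column is cumulative (games with that many runs or fewer), since that is what makes "162 minus times" equal the number of games with more runs. The function name `scoring_chart` and the six-game sample are made up for illustration and are not the 2008 Rays data.

```python
from collections import Counter


def scoring_chart(runs_per_game, innings=9):
    """Build chart rows from a list of a team's run totals, one per game.

    Assumption: "times" is cumulative (games with `runs` or fewer), so
    "events" (total games minus times) counts games with MORE than `runs`.
    """
    total_games = len(runs_per_game)
    counts = Counter(runs_per_game)
    rows = []
    cumulative = 0
    for runs in range(max(runs_per_game) + 1):
        cumulative += counts[runs]          # games with `runs` or fewer
        events = total_games - cumulative   # games with more than `runs`
        rows.append({
            "runs": runs,
            "times": cumulative,
            "events": events,
            "pe": events / total_games,     # share of games scoring more than `runs`
            "r_per_ip": runs / innings,     # regulation innings only, no extras
        })
    return rows


# Hypothetical six-game sample, purely for illustration
sample = [0, 2, 2, 5, 3, 9]
chart = scoring_chart(sample)
for row in chart:
    print(row)
```

With a full season of run totals in place of `sample`, the row for 0 runs should give the share of games with at least one run.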
That gives us the offensive side, but the one or two of you still here are scratching your heads saying, "Well, I thought this was about pitchers, you liar." This is where it gets fun. I have also collected Earned Runs and Innings Pitched for Kazmir, Garza, Shields, and Sonnanstine on a start-by-start basis. From there, an R/IP column is easily created. In fact, if you go HERE you can follow along without cluttering up this limited workspace. Now, using Chart A below, we can look at certain thresholds of Runs Allowed vs. Runs Scored per inning. For example, Kaz allowed 0 runs seven times last year. Meanwhile, we scored more than 0 runs 95.7% of the time, or in 155 games. This means that for those 7 goose eggs Kaz should have received roughly 6.7 Est. Wins (7 x 0.957). We can do this for each threshold until we reach a start where there is no statistical possibility we could have won the game. In this case it is the start where Kaz gave up 9 runs in 3 innings, or 3 R/IP; incidentally, I was at this game and it was the worst experience of my life. The most we scored in a game all year was 1.67 R/IP. Totaling all of the Est. Wins gives the figure you saw on the front page: 13.93 wins. That is how many games Scott Kazmir should have won based on his performance, normalizing our rate of scoring.
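The Est. Wins arithmetic can be sketched in a few lines of Python. This is only an approximation of the spreadsheet: it compares each start's R/IP directly against every game's runs-per-inning instead of looking up chart thresholds, and it treats ties as non-wins, both of which are assumptions. The function name `estimated_wins` and the run distribution are hypothetical; only the 7 shutouts, the 155-of-162 split, the 15-run maximum, and the 9-runs-in-3-innings start come from the post.

```python
def estimated_wins(starts, team_runs_per_game, innings=9):
    """Sum, over a pitcher's starts, the share of team games in which the
    offense scored at a higher per-inning rate than the pitcher allowed.

    starts: list of (earned_runs, innings_pitched), with innings as real
            fractions (6 1/3 innings is 6.333..., not the box-score "6.1").
    """
    total_games = len(team_runs_per_game)
    est_wins = 0.0
    for earned_runs, innings_pitched in starts:
        if innings_pitched == 0:
            continue  # no rate can be computed; contributes no wins in this sketch
        allowed_rate = earned_runs / innings_pitched
        games_outscored = sum(
            1 for runs in team_runs_per_game if runs / innings > allowed_rate
        )
        est_wins += games_outscored / total_games
    return est_wins


# Made-up 162-game distribution: 7 shutouts, one 15-run game, the rest 4 runs
team_runs = [0] * 7 + [4] * 154 + [15]

shutout_starts = [(0, 7.0)] * 7   # seven scoreless starts (innings are hypothetical)
blowup_start = [(9, 3.0)]         # 9 earned runs in 3 innings = 3 R/IP

print(round(estimated_wins(shutout_starts, team_runs), 2))  # 6.7
print(estimated_wins(blowup_start, team_runs))              # 0.0
```

Feeding in a pitcher's full list of starts and the team's actual run totals would give a season figure comparable to the one quoted above, though the exact number depends on how ties and thresholds are handled.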
I have done this for all the pitchers, as you know if you were smart enough to click on that link that said HERE; try it now if you feel up to it. Basically, this gives you yet another way to value pitchers. It is also handy because it shows the value of someone who is in and out of the lineup compared with a guy who is a steady performer. I plan to work on this for the rest of the AL East tonight, so if you liked this, I will attempt to post that sometime tomorrow.
BIG UPS TO BASEBALL-REFERENCE.COM
Comments
E-Jax?
Check out my blog on (mostly ColdFusion, but some PHP) web development at kericr.wordpress.com
I have Texas done and just finished Toronto so I guess I can throw him in.
Do what you love to do and give it your very best. Whether it's business or baseball, or the theater, or any field. If you don't love what you're doing and you can't give it your best, get out of it. Life is too short. You'll be an old man before you know it.
-Al Lopez
by Sandy Kazmir on Mar 20, 2009 8:14 PM EDT up reply actions
He's in and it's about what you'd expect
Some very good, and some piss poor. With the Rays' offense and Edwin on the hill, they had a 38% chance of winning the game. That would be the lowest figure in the chart so far.
by Sandy Kazmir on Mar 20, 2009 8:41 PM EDT up reply actions
Overall this is really impressive.
The Jackson comparison is a really fun little s&g thing that gives some good insight as to exactly how much worse he was than the other four starters. Hammel has a lot to live up to; I’m certainly hoping he’s not up for this particular challenge.
Thanks
I’m just glad someone else can make sense of it. Wait til you see the O’s if you think Edwin is funny.
by Sandy Kazmir on Mar 20, 2009 9:54 PM EDT up reply actions
I guess my follow-up question would be:
How different does this look with a game-by-game FIP instead of ERA? I would say tRA as well, but that’s a ton of work; FIP is basically three data points.
by R.J. Anderson on Mar 20, 2009 10:05 PM EDT up reply actions
This is Earned Runs and Innings pitched by game
I’m sure FIP could easily be put in. I’m just cherry picking my data and entering it in, but I could do the same with the FIP components after I finish up the AL East to see how that looks.
by Sandy Kazmir on Mar 20, 2009 10:09 PM EDT up reply actions
Oh yeah, you're just cherry picking to make Sonnanstine look good, maybe if you used real earned runs ... oh.
If you don’t mind, I might run the game-by-game tRA/FIP totals and send them your way. Not sure if you’d be interested in that, but this is pretty cool. After we get that data together, perhaps we could produce some probability charts?
by R.J. Anderson on Mar 20, 2009 10:12 PM EDT up reply actions
Actually, I may not do tRA, that's a lot stickier than FIP and B-Ref gamelogs don't include IFFB.
by R.J. Anderson on Mar 20, 2009 10:19 PM EDT up reply actions
I may not do this at all actually.
I don’t think it’s going to lead to what I was originally thinking.
by R.J. Anderson on Mar 20, 2009 10:28 PM EDT up reply actions
If you had time it would save me from having to collect all the FIP data, but if you don't
don’t sweat it.
by Sandy Kazmir on Mar 20, 2009 10:29 PM EDT up reply actions
2 weeks til the season so I'll probably pick at this here and there until then.
by Sandy Kazmir on Mar 20, 2009 10:30 PM EDT up reply actions
Collecting the game-by-game data is easy.
Either go to the gamelog on all the player pages or go to their B-Ref page, save it as a webpage, then import the data.
http://www.fangraphs.com/statsd.aspx?playerid=4897&position=P&season=
Just copy —> paste as special —> text
by R.J. Anderson on Mar 20, 2009 10:32 PM EDT up reply actions
Confirms my (and probably most people's) intuition
that going fewer than 6 innings per game whacks a lot off the value of Kazmir and going closer to 7 adds to Shields’s value tremendously. The similarity in est. wins of Sonnanstine to Garza and Kazmir is surprising though.
I probably would have cut off the outliers though. How much would Scotty’s est. wins change if we erased that one debacle you mentioned? (Taking away one shutout on the other side as well.)
I thought about cutting outliers, but I didn't want to make it too subjective
I think what “whacks” a lot of Kazmir’s value is that he only made 27 starts last year. Erasing that debacle would leave his estimated wins unchanged: a 0.00 win probability was applied to that start, and since Est. Wins is a sum, dropping a zero changes nothing. Taking away a 0 R/IP start, on the other hand, would have an effect. Without changing the data, I would venture that if Kaz had made 6 more starts, with 2 being good, 2 mediocre, and 2 bad, he would have easily added 2 wins to that total, making him virtually a 16-win guy and putting him somewhere between Shields and Burnett. Thank you for taking the time to comment.
As for Sonny, I think if you look at his chart you will see that in basically 1/3 of his starts (11) he allowed fewer than 3 ER/9, or .33 ER/IP. Add in another 8 where he allowed, essentially, 4-5 ER/9 and you can see where the bulk of his value comes from. Keep in mind that, according to this data, even when he gave up 5 ER/9 the Rays had a 33% chance of winning the game.
by Sandy Kazmir on Mar 21, 2009 12:27 AM EDT up reply actions
Normalizing
The charts in the other thread showed that runs scored wasn’t normally distributed, so how can you accurately normalize scoring? Or is the distribution normal enough? Or do I not know what I am or you are talking about?
Perhaps I should have just asked how you normalized scoring.
Normalizing might not be the correct word, but in its simplest form
If a pitcher gives up runs at a rate that would lead to 3 Runs Against, what is the probability that the offense will put up 3 or more runs? It is just a way to use theory to see how many wins a pitcher should have had. If you have any suggestions on how to improve the model, I am all ears, as I would like to improve this as much as I can. I think you answered your initial questions, and hopefully this helped as well.
by Sandy Kazmir on Mar 21, 2009 2:33 PM EDT up reply actions
