Mid-Season Refresher: Statistics and the Question "Why?"
We love statistics here at DRaysBay, but we know how confusing they can be. Everyone is familiar with the old adage: "There are three kinds of lies: lies, damned lies, and statistics." History isn't clear on who first coined the phrase - although Mark Twain popularized it in his autobiography - but that's beside the point anyway; the most important thing about the phrase is that...well, it's true. Statistics are darn tricky things that can mislead, misinform, confound, and confuse even the most rational, knowledgeable people. If you have an opinion, you can almost certainly find statistics somewhere to back it up. And because of this, many people choose to ignore statistics or are skeptical of arguments that use them.
And you know what? That's good! People misuse statistics all the time, so it's great if you don't automatically believe everything you hear. But once you start to question statistics, you're left with a big problem: what do I believe? Should I believe that Carl Crawford is our team MVP this season, or should I believe it's Rafael Soriano? Should I believe that Jason Bartlett is still a valuable player on this team and has gotten unlucky this year, or should I believe he's washed up? The world isn't as black-and-white as many people want you to believe, and the truth can usually be found somewhere between the two sides of an argument. If you want to answer these questions for yourself, though, it's helpful to know how to use statistics correctly.
When used improperly, there's nothing more confusing than statistics; when used properly, though, there's nothing more enlightening. Imagine that somewhere out there, behind all the uncertainty and confusion of life, there's a golden "Truth". Every player has a True Talent Level, and every question has a True Answer. Statistics attempt to find that Truth: measuring things, stripping away the biases in our perceptions, and leaving us with small pieces of the Truth. No one statistic will ever show you the Truth; heck, the best statistics can do is give us brief glimpses of it. It's like putting together a giant jigsaw puzzle with each statistic providing one little piece. Life is uncertain and ever-changing, so you never have all the pieces and you never finish the puzzle. But, if you put enough pieces together and look at the puzzle from the right angle, sometimes you can almost see the answer. Almost.
That's so important, it's worth stating again: no one statistic will ever show you the truth. Not ERA, not BA, not FIP, not wOBA, not WAR. Every statistic shows you one important piece of information; it's a matter of knowing exactly what that statistic is telling you and what it's not telling you. And so, every statistic - even the traditional ones like ERA, Wins, and BA - has its purpose. Great analysis starts with a statistic and the question "Why?" James Shields has a 4.93 ERA - why is that? Is he giving up more hits? Striking out fewer batters? Giving up too many home runs? Is he pitching poorly or getting unlucky? Why?
And so, to help reduce confusion and increase understanding, over the next couple of days we'll be doing a brief refresher course on some of the statistics we use frequently on the site. Please ask questions along the way. We don't want people to be turned off by our analysis or to be confused by the numbers we're using, which is why we created The Sabermetric Library and link to it on a daily basis. Newfangled baseball stats may seem confusing, but the theory behind them is as easy to understand as the traditional statistics. Every statistic has its use, and the more you understand, the better you'll be able to tell if an argument is putting the puzzle pieces together correctly or not.
Comments
That's a bit harsh.
sf1 has the argument that “We can always find stats to prove our point” but he also uses them in his arguments. As someone who is on the fringe of these stats, I always find it worthwhile when these primers are put out. Knowing what stat to look at and when is challenging, but rewarding once you start to understand.
As you can always expect come from behind victory is when you least expect it.
Believe it or not, this site
has been a pleasure for me, even with the old traditional approach to baseball that I had upon finding it. However, I think you nailed it, or if I’m not grasping what you said, forgive me. People here (RJ, did I say that?) can find a stat to build up or tear down every single player in the history of MLB. That being said, I still find them useful and thoroughly enjoy them.
At least you know what arbitration & stuff is
I turned on 620 WDAE to see if there was anything interesting & lost 10pts off my IQ for the week
PIZZA?!?
by Transplanted on Jul 12, 2010 4:10 PM EDT
I rarely listen anymore.
They really don’t know what they’re doing over there in regards to baseball. Zero saber knowledge, minimal minor league knowledge, shoddy major league knowledge; it’s just mind numbing to listen to on any kind of regular basis.
It's almost shocking that anyone could be THAT bad for a sports station
PIZZA?!?
by Transplanted on Jul 12, 2010 4:20 PM EDT
I called in once to discuss possible trade scenarios and named some minor leaguers
like Moore and Barnese as maybe being pieces for a trade for Lee. There was just silence followed by “…uh well I’m not familiar with those names.”
Is that giant windbag "Big Dog" still there?
Sorry, I moved out of state, but I still try to get as much sports coverage as possible on the radio. And Michael Kay here in NY makes my ears bleed.
I never used to listen to him at all
but now that 1040 put their clown on at the same time, I end up listening to 620 or music. Too bad you can only pick up 1010 when you’re sitting in their studio because their lunch program is the best sports talk in Tampa Bay.
Now, now, Duemig has some saber knowledge
I turned it on a few weeks ago, and he was talking about it… more specifically, I think he was reading a Wikipedia entry about what it was. He’d read a line, then give a dramatic pause for a few seconds, then read some more. Occasionally he’d end a sentence with his voice emphasizing certain words, his voice rising in pitch as if he was accusing someone of a horrific crime.
Accusing what, I don’t know. He offered no analysis, no criticism. He just kept reading from whatever web site he had pulled up and when he finished, he sounded exhausted, like he finished a closing argument in a murder trial.
But again, all he did was READ DEFINITIONS of things like sabermetrics and PECOTA. But he sounded pretty satisfied by the end that he had just completely destroyed whatever merit or worth such tools have.
Then I think he changed the subject to why Joe Maddon is a horrible manager.
Every time he engages in a conversation about Sabermetrics
He calls it ridiculous and takes individual stats out of context to prove his point. He then talks about the good ol’ days and smokes his corncob pipe.
Which is the point of paying attention to this.
Stats’ strengths and flaws will likely be addressed. I’m sure they will also address how to more accurately get an idea of a player’s talent level using a combo of stats. This will really help with your whole “you guys have a stat for everything” hangup.
Listen up? I say it daily: no one stat tells you anything unless
it’s something like K, HR, BB, etc.
I agree...
Although I might call those things facts rather than statistics.
The raw data is really indisputable by definition.
Mangling the raw data into numbers (statistics) that attempt to correlate highly with a player’s “true value” is really the science and sometimes the art form.
The real value of statistics comes down to predictability. How can I tell what the player is going to do next? That is why many raw figures, like wins, are not that great. To the extent that a pitcher’s win total is a predictor of a team’s likelihood of winning the next game, it is valuable. It turns out wins are not a great predictor.
What I am really interested in, which I have never seen, is understanding full team stat trade-offs for things with diminishing returns. Basically, to what extent should players not be evaluated individually.
For example, let’s say I have a team of all .250/.350 batters. Is there a point where substituting a .275/.325 batter would produce a higher expected run output than adding another .250/.350 batter?
"What I am really interested in, which I have never seen, is understanding full team stat trade-offs for things with diminishing returns. Basically, to what extent should players not be evaluated individually."
These things are out there. Look at any trade evaluation on Fangraphs, BTB, or here. They mention how many “Wins” the upgrade is worth given projected performance.
Also, predictability isn’t the only reason that stats should be used, or their only real value. Seeing what a player has done is just as substantial as what a player might do. Obviously this depends on how you wish to use the stats. That’s why stats like batting average and ERA aren’t all that bad. They still describe the performance of a player and the value given to his team. Description of past performance is a “perfect science”: the numbers are there in their raw form and can be easily understood. Prediction, on the other hand, is the imperfect science that still needs lots of work.
Those trade evaluations are always done using individual stats
Unless you are looking at different ones than I am. They basically take the difference of the WARs between the players traded. That is not what I am talking about. A player does not operate in isolation no matter what the stat junkies would like to believe. I personally believe that the Rays have too many high K / high BB guys who have good WARs and value, but that value diminishes with each additional high K / high BB guy you add. There would need to be a set of curves showing how a high BB rate shows diminishing returns on a per-player basis, or how high K rates increase the negative effect over time, something like that. This isn’t just a matter of running correlations.
And, I am sorry, but you are patently wrong. Prediction is the only thing that actually matters. Future performance is what you pay for and care about, so prediction accuracy is the only real currency.
Looking at past performance may be fun, but it is completely unimportant except as a mechanism for prediction. If I could get a better indication of future performance by just measuring height and weight for instance, past performance could essentially be thrown out (I know this isn’t actually true).
Look at the stock market as an example. Some companies trade at 50x their earnings, others trade at 10x their earnings. I guarantee you that the companies that are trading at 10x have better “past performance” (compare MSFT & NFLX) and yet are less valuable because their future performance looks worse. Baseball players are no different than illiquid financial instruments.
Can the O's possibly continue to screw up their season even more?
They’re only 1G in front of Pittsburgh for worst record in 2010 & may lose out on getting Rendon in 2011.
L O L the O’s
PIZZA?!?
You know what I’d like to know? Why Baseball Reference and FanGraphs have two different numbers for WAR. What’s the difference in the calculation, and which one do people think is better?
B-R uses a different defensive metric I think
PIZZA?!?
by Transplanted on Jul 12, 2010 6:10 PM EDT
They use Total Zone, whose own creator said that it's inferior to UZR
But more useful because it goes back before even 2002
by benderbrodriguez on Jul 12, 2010 9:59 PM EDT
I know FG uses UZR and BR uses something else, I can't remember.
"It doesnt really matter what I think anymore." - Kevin Kennedy
"That's so important, it's worth stating again: no one statistic will ever show you the truth."
I really think this needs to be heeded, especially in regards to FIP. I constantly see FIP used as a sole reason why a pitcher is good/bad. It’s gotten to the point that it’s almost impossible to see a post criticizing/complimenting a pitcher without an empty FIP reply in an attempt to refute the post.
My only issue with sabermetrics is some people using luck as a crutch
It actually falls into the misusing statistics blurb that you posted. I know what they are trying to say, and generally I just ignore it, but it is really pretty inaccurate for the most part.
On another topic, I’d like to see people break certain stats down into their components a bit more. Many of these excellent advanced stats have quite a few variables.
For example, let’s look at FIP. Let’s assume a pitcher’s FIP is less than his ERA. Typically people just scream unlucky. Fair enough.
But why is his FIP lower? More strikeouts or what?
If so, should he have more strikeouts? What about the components that make up strikeouts? Is he getting lucky/unlucky on that front?
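To make that kind of breakdown concrete, here is a rough sketch in Python that splits FIP into its per-inning pieces. It assumes the commonly published form of the formula, (13*HR + 3*(BB+HBP) - 2*K) / IP plus a league constant. The constant of 3.10 is an assumption (it changes from season to season), and the stat line in the usage note is made up purely for illustration, not any real pitcher's numbers.

```python
from dataclasses import dataclass


@dataclass
class PitchingLine:
    """Hypothetical container for the raw counts FIP is built from."""
    innings: float       # innings pitched as a true decimal (e.g. 100.0)
    home_runs: int
    walks: int
    hit_by_pitch: int
    strikeouts: int


# Assumption: the league constant varies by season; 3.10 is only a placeholder.
FIP_CONSTANT = 3.10


def fip_components(line: PitchingLine, constant: float = FIP_CONSTANT) -> dict:
    """Return each term of FIP separately so you can see what drives it."""
    return {
        "home_runs": 13 * line.home_runs / line.innings,
        "walks_hbp": 3 * (line.walks + line.hit_by_pitch) / line.innings,
        "strikeouts": -2 * line.strikeouts / line.innings,
        "constant": constant,
    }


def fip(line: PitchingLine, constant: float = FIP_CONSTANT) -> float:
    """FIP is just the sum of the components above."""
    return sum(fip_components(line, constant).values())


def era(earned_runs: int, innings: float) -> float:
    """Earned runs allowed per nine innings."""
    return 9 * earned_runs / innings
```

For a made-up line of 100 innings, 10 HR, 30 BB, 0 HBP, and 90 K, `fip_components` shows the home run term adding 1.30, the walk term adding 0.90, and the strikeout term subtracting 1.80, which is the sort of view that lets you ask "why?" about each piece instead of only comparing the final FIP to ERA.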
Go Gators!!