Introducing: The Sabermetrics Library
[Note by Tommy Rancel, 02/22/10 7:11 PM EST] Bumped for those who missed it this morning.
If you've been keeping your ear to the ground in the sabermetric community, you're probably aware that last Monday, John Sickels had a few choice words to say about the direction of sabermetrics these days. If you didn't read the article, you should; it's thought-provoking, which I always find fun. For those of you who haven't, though, I'll summarize: Sickels basically feels that sabermetrics has gotten "granular", reaching a level where advances in research require incredibly complicated math, yet achieve only marginal improvements over already established truths. He sums up his argument in his last paragraph:
"So am I just entering my dotage prematurely? Or is advanced sabermetric analysis becoming so specialized that no one but physics and math majors can understand it, leaving us humanities majors behind, let alone the average fan? If that is true, what can be done about it? I don't mean stopping research; obviously it needs to go forward. But I mean, how do we find ways to disseminate the new knowledge and make it comprehensible for the non-math folks among us? How do we integrate and explain the new knowledge?"
As a fellow humanities major, I can understand and sympathize with where Sickels is coming from. I certainly don't consider myself a sabermetrics researcher by any means and whenever I venture over to The Book Blog, I find myself coming away with a sore brain. I can understand all of the current statistics conceptually, but I shudder whenever I think of the actual math involved behind them. But once I admit that I'm not a researcher and never will be, well, then what am I? In a community so research-focused, if I'm not a researcher, what role do I have? And hey, I consider myself very well versed on current baseball research, so if even I can't understand these statistics fully, then how can we expect other people to jump on board and take us seriously?
All of this got me thinking of something Sky Kalkman said in an interview here at DRaysBay a bit less than a month ago. When asked about how he sees the sabermetrics community evolving in the future, Kalkman answered:
"I also think there will (should?) become more of a dichotomy between the crunchers and the writers. With traditional baseball writers, you've got the reporters and the analysts. Many try to wear both hats, but we all know when someone is outside their element. As sports media changes -- and the changes are only accelerating -- I think we're going to see specialized roles become more rewarding (TMZ does just fine without any ability to write, for example.) Saber crunchers will provide the substance (think Fangraphs or Colin Wyers) and the writers will take that stuff and entertain us (think Joe Posnanski or Dave Cameron). Not that saber writing has to be numbers based. It's really about the concepts."
While this post wasn't meant as a response to Sickels' words (goodness knows there have been enough of those already), I guess it's evolving into one. Anyway, I like to think that as a humanities major, my role in this community is that of a writer. I educate, I entertain, I analyze, I bridge gaps, I write. And there's nothing wrong with that. It's tough at first, but the big step is admitting to yourself that it's okay that you're not a researcher. It's okay that you're not going to further baseball research in a new way that no one has before. Heck, without good writers like Joe Posnanski, I'd never be as fascinated by sabermetrics as I am today. Instead of being a researcher, I get to do something just as exciting: educate others.
And so finally, I arrive at the point of this post. I don't think it's entirely a coincidence that while Sickels was saying, "How do we integrate and explain the new knowledge?" last Monday, our readers here at DRaysBay were informing us that we need to make advanced statistics more accessible to them. In pondering both of these incidents, I think they point to a larger problem within the sabermetric community: that it's incredibly research- and analysis-focused, with little emphasis on writing and education. I realized that one of the biggest gaps here on DRaysBay (and in the sabermetric community at large) is a handy reference tool to help new users learn about sabermetrics. Sure, we have a stat guide here at DRB, but the stat guide will only take you so far towards fully understanding the statistics. Sure, there's lots of good information out there if you Google search or browse through sites, but why should it be that hard? Why should all the good research and educational information be scattered piecemeal throughout the internet? It doesn't make sense, and if we want to take that next step as a community, we're going to need to become more accessible to the everyday fan. Those who are dedicated and interested in advanced statistics are going to take the time to search for the information they need, but the everyday fan isn't going to be convinced to take sabermetrics seriously unless we bring the information to them.
To help solve this issue, with the help of Andy Hellicksonstine and rglass44, I spent all of last week compiling The Sabermetric Library. I'm sure it's not perfect, but the idea behind the site is to give other sites an easy-to-use, easy-to-link resource that they can refer new users towards. You'll notice that there is a separate page for almost every statistic you can find on FanGraphs, with each page containing a brief, no-numbers-involved description of the statistic, some numbers and/or charts to help provide context, a couple of bullet points of key things to remember when using said statistic, and links to other relevant websites and articles. Here at DRaysBay, every time we use a statistic for the first time in an article, we will be linking to the specific page for that statistic over at the Library.
Since this website has been put together quite quickly, please feel free to leave any suggestions in the comments below. I'm sure some pages will need editing to make them easier to understand, while there may be other statistics that I should include and other links that I should add. Heck, my explanations for some statistics may need tweaking to make them conceptually correct. Please explore the site and don't be afraid to let me know if a page sucks. I want this to be a useful reference for everyone and the only way to know that is if I get feedback. I don't see this site as an end result, but as a starting ground. Thanks, all.
Thanks again to Andy Hellicksonstine and rglass44 for their help and contributions. And thanks to Lookout Landing for this compilation, from which I shamelessly stole a ton of links. Also, my apologies to Tango - I didn't realize that they'd talked about a "Sabermetric Library" recently on The Book Blog until after I registered the domain name.
P.S. A couple of pages are still being edited, but should be finished within a day or so.
Comments
Hat tip to Steve for spearheading this important project
And Andy and Glass for their help.
I agree with Sickels in his point that new stats are constantly being created, but are they better than the advanced metrics we already have? It’s this exact reason that I haven’t jumped on the SIERA train yet, because I haven’t been convinced it’s that much better than FIP or one of the other defense-independent stats.
Back to the “library”: hopefully this bridges the gap between the saber-friendly and the non-saber groups as we strive to become a more well-rounded community. As Steve said, feel free to leave any questions or additions you would like to see addressed and we’ll do what we can. We’ll try to refer back to the library as often as we can when speaking about the more advanced stuff.
Great job.
www.draysbay.com, www.beyondtheboxscore.com, Twitter @trancel
by Tommy Rancel on Feb 22, 2010 7:51 AM EST via mobile
Yeah, I felt Sickels backlash was mostly with SIERA because it doesn't seem to be a better predictor than xFIP/tRA
while being wholly more complicated.
Be peaceful, be courteous, obey the law, respect everyone; but if someone puts his hand on you, send him to the cemetery.
by Andy Hellicksonstine on Feb 22, 2010 1:24 PM EST
I don't know the formula for either, but the point of it is that B Pro feels that they have to have something of comparable
value out there with their name on it so that their subscribers feel like they are paying for something that will make them smarter than other fans. I don’t think that’s the case with SIERA, and makes me tend to align with Sickels, in that, if it’s a very small marginal increase, what’s wrong with what we already have?
Be peaceful, be courteous, obey the law, respect everyone; but if someone puts his hand on you, send him to the cemetery.
by Andy Hellicksonstine on Feb 22, 2010 1:50 PM EST
Couldn't the same be said of tRA in regards to xFIP?
xFIP is monumentally easier to compute, and the leap from tRA to SIERA is pretty minimal. The difference is how they are constructed. While tRA attacks the issue from a linear runs standpoint, SIERA is based on a regression analysis. Neither is easy to compute, so it’s more a matter of which methodology you prefer (or it seems which school you prefer, BP vs. LL/The Book/etc.). Personally, I don’t really care for either because they’re too hard to compute, but I do use tERA because it’s on FG. I doubt I’ll use SIERA much b/c I no longer have a BP membership.
Got any input to the tougher but better yield issue?
Do you see SIERA’s flaws being largely an issue with the way it’s derived (strict regression model) or merely that it doesn’t offer much more? From my understanding it tests better than tRA, which had previously set the bar in year-over-year correlation.
Any thoughts?
Well
It’s going to correlate to next year’s ERA better than tRA does just by virtue of ignoring home runs entirely and not being park-adjusted. So, yeah, it’s a better predictor (don’t know how it stacks up against tRA* though).
I just feel that SIERA takes the ‘this makes sense’ out of our pitching statistics. What can we learn by dissecting it? It’s not built on a logical model, and so I don’t think it educates us. It’s just a number – but a very predictive one, to be sure. So it’s a philosophical thing for me, I suppose.
Well
As you guys may have noticed, I’m currently trying to teach the entire foundation of sabermetrics piece by piece on LL (we’re on part 10 right now, steaming towards the idea of replacement level by tomorrow). I think that a lot of our problem with people not getting the new stats is that they don’t have a good basic understanding of the ideas that they’re built upon.
So it’s kinda fun to go through game state, run expectancy, run estimators, and then by the time they get to wOBA they’ll know why it makes sense to attack the problem of offensive value that way. A lot of the things I’ve been going over are trivial little pieces to the puzzle, but when everything comes together it’s going to be really interesting.
SIERA will never ever make sense to think about logically, and that means it’ll be very frustrating to teach. I think that’s the problem people have with it. If they’d approached it in a different way – say, converting xFIP to BaseRuns, we’d be able to use it as an educational tool as well as just for pure number crunching.
I generally agree with you, but I think the leap from xFIP to tRA is pretty significant, while the leap from tRA to SIERA is less so
I have not spent enough time looking at the newer metric, but it seems like a slight upgrade, at best, and the same at worst.
Be peaceful, be courteous, obey the law, respect everyone; but if someone puts his hand on you, send him to the cemetery.
by Andy Hellicksonstine on Feb 22, 2010 3:38 PM EST
Isn't the fact that the worst case is that it's just as good a reason to favor it?
Especially when the upside is that it does a better job at predicting ERA (suggesting it does a better job estimating skill, at least on the major league skill scale) than xFIP? It can’t simply be ease of calculation, otherwise why not use simple RA (and I often do)?
by Tommy Bennett on Feb 22, 2010 7:09 PM EST
SIERA is interesting
It takes some getting used to for the sake of analysis. It’s easier to explain the difference between an xFIP and ERA. However, if it likes Matt Garza, I’m down with that.
Follow Me on Twitter @FreeZorilla
Much tougher calculation
FIPs you can do roughly in your head. Tango seemed to conclude that it was a worthwhile statistic alongside the existing DIPsty-doodles rather than as a replacement.
Follow Me on Twitter @FreeZorilla
If that's what you think
You probably don’t read enough Sickels
Bad Left Hook - The SB Nation boxing blog
"Baseball is played on the field, not on a calculator."
this is...
….awesome.
and I fully agree about the need for not only number crunchers but people who can write convincingly, clearly and passionately about what the numbers mean. I’m a scientist (not a mathematician or statistician) and have found that clear and concise explanation of my work has always sharpened my analyses – not only because it helps me get back to the language behind the science (the real-world scenario behind whatever question I’m trying to answer) but also that explaining work to an outsider is the best way to find where there are still gaps in my results and assumptions. It’s not an easy thing to do, but is immensely rewarding.
Thanks, guys!
by proveyrdifferent on Feb 22, 2010 7:56 AM EST
Terrific.
I haven’t looked through it all yet, but was very happy to see that you identify a scale to indicate (for wOBA, for example) what is average and what is excellent or poor. For traditional stats, such as BA, everyone assumes that .300 is good, .250 or less is poor and .350+ is excellent, but it is not yet understood what the ranges are in many advanced stats. I hope you do that for all relevant stats.
I also like that you compare the value of different stats that seem to measure similar performance issues. Well-done, although while you do indicate why wRAA is better than wRC by providing context for total offensive contributions, I don’t see a similar comparison of wRAA to wRC+ which also provides context.
That was quick!
Great job to the 3 of you. I briefly looked at a few and they seemed to touch on not only how to calculate, but in simple terms what the stats mean and even at what point they have meaning. Well done. Thanks for grabbing the ball and running with it, Steve.
Follow Me on Twitter @FreeZorilla
Fantastic
I can’t wait to peruse through the different sections and be able to wade in knee deep as opposed to the delicate testing of the waters that I did last year. Good job to all involved.
Damn, the post-RJ era at DRB is already six times better.
WE DIDN’T NEED HIM ANYWAYS.
/stuckonyou
Mira Sorvino...Paul Walker...T-Pain...Fall 2010...HEADSTONE MAFIA, A LOVE STORY OF REVENGE. "5/5 stars!!!" - DRB User "Andy Hellicksonstine"
Bookmarked
Thanks for the time and effort everyone involved for helping the rest of us out.
This is a fantastic help.
I’m glad to see a groundswell of solidifying all of the rapid research that’s been done. While the researchers keep pushing ahead, it’s important to make sure the foundation is both solid and understandable. Thanks!
Steve, I fudged up the tERA column of the continuum, those should be lower than the tRA numbers
Here are the corrected numbers:
2.16
3.02
3.37
3.62
3.91
4.39
4.69
4.97
5.35
6.07
Be peaceful, be courteous, obey the law, respect everyone; but if someone puts his hand on you, send him to the cemetery.
by Andy Hellicksonstine on Feb 22, 2010 3:47 PM EST
Okay, sounds good
I’ll update that.
I love Casey Fossum. Now try and take me seriously.
by Steve Slowinski on Feb 22, 2010 4:27 PM EST
This is absolutely great, guys
as a reference point for veterans and as a guide for newer members. My one request would be adding PMR to the defensive statistics. A bit was made of it as a defensive metric last year, and a series was done on BtBS using PMR, but I’m unfamiliar with how it’s derived, who keeps it, and what its purpose should be.
Baseball Musings keeps it.
I don’t know much else though.
by R.J. Anderson on Feb 22, 2010 4:25 PM EST
I debated about throwing that in there as well when creating the site
I think UZR and Dewan +/- are the most widely used, but I’ll probably add it sometime this week. Check back on the site come Friday. I don’t know much about it off the top of my head except that it stands for Probabilistic Model of Range, so I’ll have to do some research.
I love Casey Fossum. Now try and take me seriously.
by Steve Slowinski on Feb 22, 2010 4:27 PM EST
PMR is great
But I believe Pinto discontinued it because of the cost of licensing the data.
by Tommy Bennett on Feb 22, 2010 7:10 PM EST
Mm...that would be a problem.
Thanks for the info.
I love Casey Fossum. Now try and take me seriously.
by Steve Slowinski on Feb 22, 2010 7:31 PM EST
WHERE'S WIN SHARES???!!?!?!
[/Bill James]
Seriously though, great work. You getting this linked up to the other sabermetrically inclined sites?
Bad Left Hook - The SB Nation boxing blog
"Baseball is played on the field, not on a calculator."
If they want
I haven’t approached anyone about it, but I figure if other sites want to use it that’s great. I’m not going to stop anyone else from using it as a reference.
I love Casey Fossum. Now try and take me seriously.
by Steve Slowinski on Feb 22, 2010 4:32 PM EST
LL and USSM both linked to it.
I’ll do the deed on FG.
by R.J. Anderson on Feb 22, 2010 4:50 PM EST
On the third day
RJ appeared to his apostles, Rglass, RZ, Sandy, Slow, and the two ex-MVN’s
http://citrusjuicing.com/ An SRQ focused-Tampa Bay area sports blog
by CubFanRaysaddict on Feb 22, 2010 10:18 PM EST
Library
Steve, thank you very much for what you have done here. This is what I have been hoping for on DRaysBay and the site will flourish because of it. You are a good man.
I added a link to the library in the Reference Materials on the left side of the main page
Follow Me on Twitter @FreeZorilla
Cool...good idea.
I love Casey Fossum. Now try and take me seriously.
by Steve Slowinski on Feb 22, 2010 10:06 PM EST
Why do I find this so amusing?
"" *Yes, you can have a negative WAR. In fact, according to FanGraphs, the worst WAR any player has had since 2002 is Neifi Perez from the 2002 Royals. His -3.1 WAR eclipses the second place finisher, Yuniesky Betancourt from the 2009 Royals (-2.2 WAR). Oh, those Royals…""
"A life is not important except in the impact it has on other lives."
Jackie Robinson
"People ask me what I do in the winter when there's no baseball. I'll tell you what I do. I stare out the window and wait for spring."
—Rogers Hornsby
I had a lot of fun with that one
And by the end of all the stat profiles, I finally figured out how to spell Yuniesky Betancourt’s name. I think he was dead last in just about every stat last year.
I love Casey Fossum. Now try and take me seriously.
by Steve Slowinski on Feb 22, 2010 10:06 PM EST
Those wacky Cubans and their crazy Visigoth names
Bad Left Hook - The SB Nation boxing blog
"Baseball is played on the field, not on a calculator."
I didn't see SLGCON in the library
Now, I don’t know if this is a new or old stat or one used a lot. I just know I hadn’t seen it until the annual with (s)Andy’s article.
True, I didn't include that.
I don’t think it’s a terribly common thing, but it can’t hurt to add it. I’ll see what I can do this week.
I love Casey Fossum. Now try and take me seriously.
by Steve Slowinski on Feb 23, 2010 9:34 AM EST
I can write that one up if you want Steve, though RZ is the one that kind of turned me on to it
He might be better, but I’ll see what I can come up with. Also, BW, you’ll notice that I gave a quick primer in there on how to figure it and what it means.
Be peaceful, be courteous, obey the law, respect everyone; but if someone puts his hand on you, send him to the cemetery.
by Andy Hellicksonstine on Feb 23, 2010 9:48 AM EST
Yea, I did notice, and fwiw I enjoyed the article
just thought that housing it in the library may not be a bad idea if you guys are trying to make it an all-encompassing place for sabr.
Yeah no doubt, I wasn't trying to sound like a dick in pointing that out, so I apologize if you took it that way
There’s so much out there, and a lot of it is different people attempting to show essentially the same thing (VORP vs. wRAA), that if you have any other suggestions to add, then please voice them. We don’t know what people want to see unless they say so.
Be peaceful, be courteous, obey the law, respect everyone; but if someone puts his hand on you, send him to the cemetery.
by Andy Hellicksonstine on Feb 23, 2010 10:27 AM EST
That'd be excellent - you definitely know more about it than I do.
No rush obviously, but I’ll put it up whenever you send it along.
I love Casey Fossum. Now try and take me seriously.
by Steve Slowinski on Feb 23, 2010 10:33 AM EST
