Rays Roundtable: Our favorite statistics

We’ve got WAR, wins, xwOBA, and many more

MLB: Seattle Mariners at Houston Astros Shanna Lockwood-USA TODAY Sports

Welcome back to Rays Roundtable — pull up a chair. Throughout the season, we will be gathering the DRB minds-that-be together for brief chats about the Rays. These might be light-hearted or more analytical, but they will (hopefully) always be entertaining. Part of the beauty of baseball is the variety of opinions it draws out, and this is a good chance to see some of those differing opinions right next to each other in short, digestible segments. Today’s topic: Our favorite baseball statistics, inspired by this article here.

Ian Malinowski: The best thing about xFIP is that it works. The second best thing about xFIP is that people absolutely hate that it works.

Voros McCracken created Defense Independent Pitching (DIPS) theory way back in 1999 when he observed that pitchers seemed to have little control over the eventual outcome of the balls put in play against them. Tom Tango, working inside the DIPS framework, created FIP by throwing out all those balls-in-play and estimating what a pitcher’s ERA should have been based only on strikeout rate, walk rate, and home run rate. Over a sample size of a full starting pitcher season or two, that limited view of pitching outcomes predicted future ERA better than did real-world current ERA. So Dave Studeman took it a step further. He threw out the individualized home run rate and calculated FIP with just strikeouts, walks, and a league average home run rate, and found . . . that it works even better than FIP, especially in smaller sample sizes.
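For anyone who wants to see the arithmetic, here is a minimal Python sketch of FIP and xFIP. The 13/3/2 weights are the commonly published ones. The constant of 3.10 is only an assumption (it is recalculated each season so that league FIP lines up with league ERA), counting hit batters alongside walks follows the FanGraphs version of the formula, and the pitcher line in the example is made up.

```python
def fip(hr, bb, hbp, k, ip, constant=3.10):
    """Fielding Independent Pitching.

    ip is innings pitched as a true decimal (e.g. 180.333 for 180 1/3).
    constant is an assumed league constant; the real one varies by season.
    """
    return (13 * hr + 3 * (bb + hbp) - 2 * k) / ip + constant


def xfip(fly_balls, lg_hr_per_fb, bb, hbp, k, ip, constant=3.10):
    """xFIP: FIP with the pitcher's own home runs swapped out for the
    number he would have allowed at a league-average HR-per-fly-ball rate."""
    expected_hr = fly_balls * lg_hr_per_fb
    return (13 * expected_hr + 3 * (bb + hbp) - 2 * k) / ip + constant


# Hypothetical pitcher: 180 IP, 180 K, 50 BB, 5 HBP, 30 HR on 200 fly balls,
# in a league where 10% of fly balls leave the yard.
pitcher_fip = fip(hr=30, bb=50, hbp=5, k=180, ip=180)
pitcher_xfip = xfip(fly_balls=200, lg_hr_per_fb=0.10, bb=50, hbp=5, k=180, ip=180)
print(f"FIP  {pitcher_fip:.2f}")
print(f"xFIP {pitcher_xfip:.2f}")
```

The only difference between the two functions is where the home run count comes from, which is the whole point: this made-up pitcher gave up 30 homers but "should" have given up about 20, so his xFIP comes out well below his FIP.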

Nowadays it’s possible to out-predict xFIP. Both SIERA and DRA do it, as do the pitching projections on the better regression-based projection systems. But all of these systems are complicated, while xFIP is brutally simple. Those other systems work very hard, and they improve on xFIP a very small amount.

People can’t stand this, including otherwise reasonable baseball people with strong statistical foundations. They want xFIP to fail. They want to see a relief pitcher giving up a string of home runs and to be able to confidently say that he’s bad now and is going to continue to be bad, and xFIP doesn’t allow them to do that. The root mean squared error doesn’t lie.

There’s a lot more precise data available to us now than there was 10 years ago, but xFIP still works so well, while being so simple, because it reveals a fundamental truth about baseball: all outcomes have some degree of inherent randomness, and some outcomes have more of it than others. It’s my favorite baseball statistic because it’s a constant reminder for us to be humble, to doubt our eyes and our convictions, to test our predictions, and to consider baseball with an open mind, because if we’re clever and lucky we might learn something.

Darby Robinson: My first introduction to stats beyond what you’d find on the back of a baseball card or on Baseball Tonight was BABIP.

BABIP was my gateway stat.

BABIP stands for Batting Average on Balls In Play. A simple and fun way to understand BABIP (and the way I learned it) was from DRB and Fangraphs great Bradley Woodrum, and his awesome animated videos on the “luck dragons” that control BABIP.

Here’s my wildly too simplistic breakdown of BABIP. The stat tells us how many times a ball in play goes for a hit, excluding home runs. League average BABIP can fluctuate year-to-year, but for the most part .300 is an average BABIP. A player with a BABIP much higher than .300 is probably getting lucky, and a player with a BABIP much lower than .300 is probably getting unlucky. Over time, regression to the mean should occur and players running lucky will cool off, and players that are unlucky will start to improve.
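If you want to check a player’s BABIP yourself, a minimal Python sketch looks like this. It uses the commonly published formula (hits minus home runs, over at-bats minus strikeouts minus home runs plus sacrifice flies); the function name and the stat line are hypothetical, made up just for illustration.

```python
def babip(hits, home_runs, at_bats, strikeouts, sac_flies):
    """Batting Average on Balls In Play.

    Home runs are removed from both hits and balls in play because
    no fielder has a chance to make a play on them.
    """
    balls_in_play = at_bats - strikeouts - home_runs + sac_flies
    if balls_in_play <= 0:
        raise ValueError("no balls in play to average over")
    return (hits - home_runs) / balls_in_play


# Hypothetical hitter: 150 H, 20 HR, 550 AB, 120 K, 5 SF.
example = babip(hits=150, home_runs=20, at_bats=550, strikeouts=120, sac_flies=5)
print(f"BABIP {example:.3f}")
```

That made-up hitter lands a little above .300, which by the rule of thumb above is nothing to get excited or worried about.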

What makes BABIP such a fun stat to me is how simple it is at the surface, and how much more complicated it gets when you dive deeper. For example, certain factors can have huge effects on a player’s individual BABIP: hitters who hit more line drives and hard ground balls will have a higher BABIP than players who hit a lot of fly balls (and conversely, pitchers will have a lower BABIP against if they give up more fly balls than grounders). Hard contact, especially, is a way to improve your BABIP without just luck. If all you are hitting is weak pop-ups, and your BABIP is super low, you’re not unlucky: you’re just not that good.

BABIP, like all stats, should never be used in a vacuum. It’s a tool that can be used to tell part of the story. But it’s a really fun tool. It’s fun because you can complain that it’s the luck dragons that are cursing you when another sharp grounder finds a glove rather than a hole. And soon those luck dragons will turn the tides, lay waste to your opponents, and balance out the universe.

Ian Malinowski: The trajectory of a single pitch, consisting of horizontal release point, vertical release point, initial velocity, spin rate, spin angle, horizontal movement due to spin, and vertical movement due to spin, is not really a statistic, but even so it might be my favorite.

None of those measures mean anything on their own. They don’t tell you who won. They don’t apportion credit or blame. They predict nothing. They’re just description, in granular, malleable form. With sharp vision and photographic memories, we wouldn’t need the PITCHf/x or Statcast systems, but humans are frail, so we make tools to measure and remember the things that we can’t.

You can access all that data for yourself for free from Baseball Savant, but I think that’s a bad name. Savant implies the extraordinary. Were the early humans who fashioned spears and set fires savants? Were Greek sailors who used an astrolabe to measure their latitude savants? Of course not. They were people with tools. Nothing could be more ordinary than to extend one’s abilities by manipulating tens of thousands of rows of pitch trajectory data.

Wielding powerful tools with the aim to better understand a bizarrely frivolous game is delightfully exhilarating. It might even be better than building a campfire.

Elizabeth Strom: Wins Above Replacement (WAR)

It’s not that new; it’s not an official MLB stat; there are disagreements about how to calculate it. But my favorite stat is WAR.

There are, of course, more precise metrics that can better drill down on the individual aspects of a player’s game. Other statistics are more predictive. But so often, when thinking or writing about baseball, we want to be able to conceptualize a player’s overall contribution. WAR helps us do that. It allows us to have conversations about disparate contributions of players who may be good at different things. It gives us a quick way to assess the outcome of trades.

WAR allows us to compare across positions, teams, and eras. It’s the perfect stat for the non-stats person, because it is intuitive, and it is comprehensive.

Perhaps more importantly, WAR highlights the value of players who might otherwise be overlooked because their particular skills aren’t properly appreciated. This has been the case with several Rays players. We all know that Kevin Kiermaier is a good center fielder, but WAR helps us see that his exceptional defensive play made him almost as valuable to the Rays in 2015 as Bryce Harper was to the Nationals (Kiermaier was worth 7.7 WAR that year, Harper worth 8.4).

And remember when Ben Zobrist was an unheralded utility player for a small-market team? The baseball world laughed at the notion that he was the best player in baseball, but he led the majors in WAR from 2009-2012, and suddenly every MLB franchise was looking for the “next Ben Zobrist.”

I would never argue that WAR should be the only way you evaluate a player, but it’s a great tool for baseball fans.

John Ford: Maybe it’s because I’m old(er). But I love stats that I have an emotional attachment to.

Yes, you read that right. I have an emotional attachment to certain mathematical formulas.

I love ERA and batting average, because I remember sitting up in my bedroom during the long, hot summers before everybody had air-conditioning and computers, playing Strat-O-Matic or Statis Pro Baseball, keeping track of make-believe seasons of the 1977 Phillies or 1983 Twins.

I remember the smell of humidity, and the feel of pencil against paper as I scratched out long division. The simplicity of that most perfect stat (at the time) of whether a pitcher was any good: earned runs * 9 / innings pitched.

I remember learning to do decimals reflexively, easily able to convert fractions faster than the other kids in my class simply because I had long ago understood the truth that a guy who was 3-for-7 was hitting .429.

I remember learning history by hanging it on the sacred totems.

.406

1.12

511

Listen, WAR is great at comparing across generations, and the new defensive metrics are fine for telling us who is good at what and who isn’t. But give me a number I can crunch with my own hands, with a pencil and paper and long division, and I am a kid with baseball cards again.

Jared Ward: Many times in baseball, fans and sometimes owners ask “what have you done for me lately”. Players go through “slumps” and often, as fans, we forget the overall contribution of that player to the team and to what matters most, which is scoring runs.

That’s why weighted runs created + (wRC+) is my favorite stat. It tells me how a player contributed to an offense, it adjusts for park factors, it tells me how much better (or worse) a player is than the rest of the league and it’s a single concise number that keeps me humble when I want to say “what have you done for me lately”.

Longo, even with all of his injured years and the questions about his health and return to form, posted only one year below the league average of 100 wRC+, and that was 2017, his last season before the Rays traded him, when he posted a 96 wRC+.

Ian Malinowski: Every statistic that answers a question poses another one: “So what?” That’s what wRAA is for. How do you compare baserunning skill, batting ability, defense, catcher framing, and pitching? You convert them into runs. Runs, commonly expressed as wRAA, are the tissue that connects the rest of the sabermetric world. WAR is just wRAA with some debatable assumptions thrown on top. It’s wRAA that takes a series of statistical statements that would otherwise be normative and makes them positive. It’s not sexy, unless you think knowing how to win more baseball games is sexy. Maybe some people do.

Jim Turvey: Looking at baseball stats as a whole, there appear to be three general tiers of metrics: the OG statistics, the sabermetric statistics, and the Statcast statistics. This may be a bit of a prisoner-of-the-moment opinion, and one day the sabermetric and Statcast tiers may well combine into one big tier, but for now, those seem like the three main categories.

In the OG statistics tier, there is no better stat for me than the win. Yes, it is arguably the most derided in the modern era, and it is derided with good reason. A pitcher can get a win despite giving up 29 hits and 13 runs; another pitcher can not get the win despite throwing 12 perfect innings; a pitcher can get a win without even throwing a pitch! Brian Kenny has led the charge to “kill the win” for years now, and plenty of people would agree with him. Here’s the thing, though. Once you willingly admit that the win is basically a pointless stat, it becomes eminently lovable. Basically, I’m writing the script for How I Learned to Stop Worrying and Love the Win.

The same flaws outlined above become entertaining and lovable instead of infuriating and troubling. Alan Embree not throwing a pitch, yet “earning” the win — that’s hilarious!

Plus, it’s been a staple on baseball cards for literally all of baseball history, and the number 511 is one of the most recognizable landmarks in the sport’s history. [Editor’s note: Jim originally wrote “512” which was both embarrassing for him and kind of disproved his point. Feel free to give him guff in the comments.] We all know it sucks as a stat; that’s part of its charm.

From the middle tier (and my winner for favorite of the three) is OPS+. The beauty of OPS+ is that it is understandable enough to explain to anyone who has even a light knowledge of baseball, but it is a good enough metric that I still use it quite regularly, especially when comparing players across generations, a vice that takes far more of my time than it should.

For the uninitiated, OPS+ is merely OPS (on-base percentage plus slugging percentage, two stats we all recognize and whose formulas many of us have memorized) set on a scale where 100 equals league average. Each point above 100 is worth one percentage point above league average, and each point below 100 is one percentage point below league average. I told you it was easy.
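To make the scaling concrete, here is a minimal Python sketch using the formulation usually attributed to Baseball-Reference, which compares on-base and slugging to league average separately rather than dividing raw OPS by league OPS. Treat it as an assumption about how the published number is built: the function name is hypothetical, the league figures passed in are assumed to be park-adjusted already, and the example numbers are invented.

```python
def ops_plus(obp, slg, lg_obp, lg_slg):
    """OPS+ on a scale where 100 is league average.

    lg_obp and lg_slg are assumed to be league averages that have
    already been adjusted for the hitter's home park.
    """
    return 100 * (obp / lg_obp + slg / lg_slg - 1)


# Hypothetical hitter with a .350 OBP and .450 SLG in a .320/.400 league.
hitter = ops_plus(obp=0.350, slg=0.450, lg_obp=0.320, lg_slg=0.400)
# A perfectly league-average hitter should land on 100.
average = ops_plus(obp=0.320, slg=0.400, lg_obp=0.320, lg_slg=0.400)
print(round(hitter), round(average))
```

The park and league adjustments all live in those two league inputs, which is why the same raw OPS can grade out very differently depending on where and when it was posted.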

Where the nuance comes in is that OPS+ accounts for both the stadium and league effects that a “regular” stat like OPS misses out on. Dante Bichette may have posted an OPS of .895 in 1999, but that was actually less impressive than Cesar Cedeño’s OPS of .786 in 1978. In fact, Cedeño had the far more impressive OPS+ (126 to 102) because instead of hitting in Coors Field during the middle of the PED Era, he was hitting in the notorious Astrodome in the far less hitter-friendly ’70s. It doesn’t even have to be that extreme, though.

To use an example the readers of DRB will relate to: Just last season, Steven Souza Jr. (.810) may have had a lower OPS than Nick Castellanos (.811), but thanks to a tougher slate of stadiums, his OPS+ (121) showed him to be a significantly superior hitter to Castellanos (110).

Finally, in the fancy, new Statcast tier, my relatively newfound love is xwOBA. This metric uses the launch angle and exit velocity tracking that MLB now has in use on every pitch to attempt to create a stat that basically strips any and all luck out of the equation.

Now, the stat isn’t perfect (it tends to overrate slower hitters and underrate faster hitters, among other slight flaws), but it’s one of the most fun and useful metrics on the market right now.

Danny Russell: My favorite stat is the stat of the future: Pitcher Wins.

In a world where starts go out the window, and 50-pitch relievers trade off what order they pitch in, Games Started as a metric goes out the window. All that matters is how many times through the order, and which part of the order, can be trusted.

We’ve gotten a taste of that this season, with Andrew Kittredge acting as the “Opener” for Ryan Yarbrough, who has now been credited with two wins in two appearances. Those wins will matter for the youngster as his career is evaluated in arbitration, by fan bases, by the media, and by front offices writ large.

Suddenly, just when you thought King Felix had sentenced it to death, pitcher wins may matter more than ever, and I’m here for that narrative.