Why WAR is wrong about Blake Snell

And about starting pitchers in general

MLB: New York Yankees at Tampa Bay Rays Kim Klement-USA TODAY Sports

I don’t usually do this. Most of the time I like WAR.

Yes, there are different methods of calculating it. No, that’s not a weakness. Of course it’s not a perfect measure of a baseball player—no single number is. But it is quite obviously a good idea to take the numbers we have (ERA, FIP, K%, BB%, HR/FB, IP, etc.) and use them together in some intentional formulation, rather than just using them willy nilly to justify whatever we want them to justify. WAR can help us compare apples to apples and on the whole it is a useful and spiffy tool.

But right now, when I look at the FanGraphs WAR leaderboard, it’s not justifying what I want it to justify, and Blake Snell is my guy. And then the other day, Brian Anderson got himself in a little bit of difficulty criticizing WAR without quite knowing how to criticize it right, and Brian Anderson is my guy.

So, Blake and Brian? I got you.

WAR has a problem with starting pitchers. Walk with me down the rabbit hole.

As JT Morgan explained yesterday, WAR is a counting stat. For pitchers it combines rate stats like ERA (rWAR), FIP (fWAR), or DRA (WARP) with innings pitched, which turns those rate stats into a counting stat with a unit of “runs prevented.” Then, to keep from over-weighting the innings pitched portion, it subtracts some runs prevented from the total, to signify the hypothetical “replacement player.”
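For the code-minded, here is roughly what that recipe looks like as a minimal Python sketch. It is not the actual fWAR, rWAR, or WARP formula (the real ones layer on park, league, and runs-to-wins adjustments). The function name, the 5.00 replacement rate, and the sample pitcher are all made up for illustration.

```python
def runs_above_replacement(rate_per_nine, innings, replacement_rate_per_nine):
    """Runs prevented relative to a replacement pitcher over the same innings.

    rate_per_nine: the pitcher's runs-per-nine-innings rate (ERA, FIP, DRA, ...).
    replacement_rate_per_nine: ASSUMED rate for the hypothetical replacement.
    """
    return (replacement_rate_per_nine - rate_per_nine) * innings / 9.0


# Hypothetical pitcher: a 4.00 rate over 90 innings,
# measured against an assumed 5.00 replacement rate.
print(runs_above_replacement(4.00, 90, 5.00))  # 10.0
```

Double the innings at the same rate and the total doubles. That is the counting-stat part, and it is the part the rest of this post picks on.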

The “skill” attributed to the “replacement player” varies slightly between the different WAR calculations, and while those debates can be fascinating, this is neither the time nor the place to go into whether a team full of replacement players would win 50 or 55 games. Just understand that all formulations are trying to do about the same thing, which is to represent the type of player all teams have available to them basically all of the time, either from a quick call to their Triple-A club, or via a small trade.

For starting pitchers, this imaginary replacement player looks something like a journeyman starter, or a young fringe prospect, or an established swingman/long reliever.

Now, everyone knows that in reality, true replacement level is not the same for every team. Before they traded him to the Rays for Nathan Eovaldi, the Red Sox had Jalen Beeks stashed at Triple-A, ready to come fill a need. Beeks is pretty good. The Rays, who have had approximately all their pitchers get injured, didn’t have the luxury of keeping someone like Beeks down on the farm.

On the other hand, the Rays had a surplus of middle infielders this year, so the “replacement” for Adeiny Hechavarría was (and now is) top prospect Willy Adames, who is pretty good.

This is not a weakness of WAR, because the goal of the stat is to calculate value on average, outside of the team context. For there to be a true flaw, WAR would need to get something wrong about replacement players on average. With starting pitchers, it does.

Consider Blake Snell, currently 11th in fWAR with 139 innings pitched, and Mike Clevinger, currently 9th in fWAR with 157.2 innings pitched. Snell has a better FIP (3.28 to 3.49), so Clevinger’s advantage is all about innings. Clevinger has one more start on the season, so that’s part of the difference, but not all of it. Clevinger has pitched deeper into games.

To what end? Or more precisely, to what replacement should Clevinger’s extra innings be compared?

When you take a starter off of the roster, you replace him with another starter type. That’s the proverbial “replacement player.” But when you take a starter out of the game in the seventh inning, you replace him with a reliever, and often a high-leverage back-end one at that.

If the difference between Snell and Clevinger is some innings pitched by Jose Alvarado, Diego Castillo, or Chaz Roe, then what value is there in Clevinger pitching those innings?

This isn’t just about the Rays and their wacky reliever usage this year. Across the league, bullpens are shouldering more innings, and preventing runs while doing it. Every team has a group of good pitchers who pitch in the last several innings of close games and mostly shut down the opposition.

Obviously it’s still desirable to pitch deep into games. An ace who dependably works deep helps his team by giving the best relievers a chance to rest, ensuring that they’re available when other, lesser pitchers have to be pulled earlier. There is a value to that, expressed through bullpen chaining, but the point is that the comparison needed to estimate that value is not simple, as it is for nearly every other position.

Say a starter gets 30 starts on a season.

  • If he lasted only five innings in each, that would get him to 150 innings. Those 150 innings are probably best compared to the normal hypothetical replacement level.
  • The next 30 innings (bringing us to 180) should be compared to a middle reliever, who would otherwise pitch the sixth inning, plus the roster and arm fatigue implications of forcing that middle reliever to pitch.
  • The next 30 innings (bringing us to 210) should be compared to a seventh inning reliever (who’s pretty good!), plus roster and arm fatigue implications.
  • Anything above that, and you’re comparing to very good relievers, and the value starts to get very marginal, and perhaps negative.

I’m not sure about the actual values. This is a thought experiment. Don’t yell at me.
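If you want to play with the thought experiment yourself, here is a rough Python sketch of it. Every number in `TIERS` is a placeholder I invented to show the shape of the idea, not a measured replacement level, and the sketch ignores the roster and arm fatigue implications entirely. The function and variable names are hypothetical too.

```python
# (last inning of the start covered by this tier, ASSUMED comparison rate per nine)
TIERS = [
    (5.0, 5.00),           # innings 1-5: the usual replacement starter
    (6.0, 4.25),           # 6th inning: a middle reliever
    (7.0, 3.75),           # 7th inning: a pretty good reliever
    (float("inf"), 3.00),  # beyond that: the very good back-end arms
]


def tiered_runs(pitcher_rate, starts, innings_per_start, tiers=TIERS):
    """Runs prevented when each inning of a start is compared to the
    pitcher who would otherwise have thrown it."""
    total = 0.0
    lower = 0.0
    for upper, comparison_rate in tiers:
        # Innings per start that fall inside this tier, times number of starts.
        innings_here = max(0.0, min(innings_per_start, upper) - lower) * starts
        total += (comparison_rate - pitcher_rate) * innings_here / 9.0
        lower = upper
    return total


# Hypothetical starter: 3.50 rate, 30 starts, seven innings a start (210 innings).
tiered = tiered_runs(3.50, 30, 7.0)
flat = (5.00 - 3.50) * 210 / 9.0  # everything compared to the replacement starter
print(round(tiered, 1), round(flat, 1))  # 28.3 35.0
```

With these made-up rates, the sixth and seventh innings add only a few runs between them, and an eighth inning would subtract value because the assumed back-end reliever is better than the starter. Change the placeholder rates and the size of the effect changes, but the later innings always count for less than the flat comparison gives them.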

It takes a fantastic pitcher to sit an opposing lineup down four times, and nothing here is meant to trivialize the ability of the starters who can, but if bullpens across baseball continue to be leaned on more heavily, and if they show that they can shoulder the higher innings load, then the actual baseball value of working nearly a complete game will continue to fall.

WAR is a comparison, and what you compare matters. For starting pitchers, the simple “replacement player waiting at Triple-A” just isn’t the right foil.

#SnellForCyYoung #maybe #IfChrisSaleIsInjured