In 2012, the Rays should have won 95 games according to their Pythagorean win expectancy, which translates runs scored and allowed into a team's expected wins and losses. Meanwhile, the Rays' fWAR (FanGraphs' WAR) total suggests that they should have won 90 games last year. Even over a full season with an active roster of 25 players, five wins is a significant difference.
This is not something new to the Rays. In years past, they have scored more runs and allowed fewer than fWAR suggests they should have. Only the Oakland A’s have been more “cheated” by fWAR during the past three years.
The question has bugged me for some time: why have the Rays outperformed their fWAR? The question can be broadened to account for all teams. Why do teams perform better or worse than their fWAR?
First of all, it is important to understand what WAR is. WAR is a statistic that attempts to measure the value a player contributes to his team over a replacement-level player. WAR accounts for offense, defense, and pitching, making it the best all-inclusive stat publicly available. The contributions of every player on a team can be added together to give the team's total wins above replacement. Given that a replacement-level AL team is expected to win 44.1 games, every win (1 WAR) produced by the team adds to that total. An AL team that compiles 44 WAR should be expected to win about 88 games.
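The two win estimates being compared here can be sketched in a few lines. This is only an illustration: the function names are mine, the Pythagorean exponent of 2 is the classic version of the formula (other exponents are in common use), and the run totals in the example are made-up round numbers rather than any team's real figures. The 44.1-win replacement baseline is the AL figure quoted above.

```python
def pythagorean_wins(runs_scored, runs_allowed, games=162, exponent=2.0):
    """Expected wins from runs scored and allowed (classic exponent of 2 assumed)."""
    rs = runs_scored ** exponent
    ra = runs_allowed ** exponent
    return games * rs / (rs + ra)


def war_implied_wins(team_war, replacement_wins=44.1):
    """Expected wins implied by a team's total WAR (AL baseline from the text)."""
    return replacement_wins + team_war


# Hypothetical team: 700 runs scored, 600 allowed, 44 total WAR.
pythag = pythagorean_wins(700, 600)
implied = war_implied_wins(44.0)
gap = pythag - implied  # positive means the team outperformed its WAR

print(round(pythag, 1), round(implied, 1), round(gap, 1))
```

A positive gap is what the Rays have shown: their run differential says they were better than their fWAR total gives them credit for.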
Despite the complexity of fWAR, there is a startling amount of variance between a team's fWAR total and the number of games they should have won based on the runs they scored and allowed. Since WAR is supposed to represent the number of runs a team should have scored and allowed, the magnitude of the difference is shocking. As the chart above shows, teams have outpaced their WAR by as many as 13 wins over three seasons. What causes these teams to outperform their WAR by such a drastic margin? The answer lies in the way fWAR calculates a pitcher's contributions.
Despite ranking 1st, 2nd, and 4th in ERA- (park-adjusted ERA) over the past three years while throwing a boatload of innings, the Rays have ranked 3rd, 9th, and 8th in FanGraphs' pitching WAR. Unlike Baseball Reference's WAR (rWAR, which I will discuss later), fWAR builds its pitching component on FIP. When the difference between a team's FIP and ERA is considerable, a discrepancy emerges that can account for many of those lost or added wins.
To delve into this deeper, we need to understand the basic differences between the two stats. ERA is a very simple stat: the number of earned runs a pitcher surrenders on average over nine innings. FIP is quite different, ignoring runs altogether and looking at a pitcher's peripherals. It gives weights to strikeouts, walks, and home runs, then scales the result so it can be compared with ERA. The basic purpose behind FIP is to eliminate defense and luck (BABIP, LOB%, etc.) from the equation, more closely approximating what the pitcher was solely responsible for. Both have their strengths and weaknesses, and it is helpful to use both in a thorough evaluation of a pitcher.
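To make the difference concrete, here is a rough sketch of both calculations. The FIP weights (13 for home runs, 3 for walks plus hit batters, 2 for strikeouts) follow the commonly published form of the formula, which also counts hit batters alongside walks. The constant that puts FIP on the ERA scale changes every season, so the 3.10 below is only an assumed placeholder, and the pitcher's line in the example is invented.

```python
def era(earned_runs, innings):
    """Earned runs allowed per nine innings."""
    return 9.0 * earned_runs / innings


def fip(home_runs, walks, hit_batters, strikeouts, innings, constant=3.10):
    """Fielding Independent Pitching.

    `constant` is an assumed placeholder; the real value is recomputed
    each year so that league FIP matches league ERA.
    `innings` must be a true decimal (200 1/3 innings is 200.333, not 200.1).
    """
    raw = 13 * home_runs + 3 * (walks + hit_batters) - 2 * strikeouts
    return raw / innings + constant


# Hypothetical pitcher: 200 IP, 70 ER, 20 HR, 50 BB, 5 HBP, 180 K.
pitcher_era = era(70, 200.0)
pitcher_fip = fip(20, 50, 5, 180, 200.0)

# A negative number here means the pitcher allowed fewer runs than FIP expects.
print(round(pitcher_era - pitcher_fip, 3))
```

Notice that earned runs never enter the FIP function at all, which is exactly why the two numbers can drift apart.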
When a pitcher (or a team, for that matter) posts a lower ERA than FIP (after both have been adjusted for park), the general assumption is that defense or luck* accounts for the gap. When we are looking at a team's total WAR, we are also looking at its position players' value, which includes defense. As a result, the defensive factors should already be accounted for (except for shifts). All that should remain is pure luck (or skill, if you believe in clutch or a pitcher's control of their BABIP and home run rates).
*I am calling it luck for lack of a better term. There is evidence that pitchers have some, but limited, control over their BABIP, LOB%, HR/FB%, etc.
My next step was to test the correlation between the amount by which a team outperforms its WAR (the totals are shown in the above chart) and the gap between its FIP- and ERA- (both are adjusted for park). Do teams that outperform their WAR tend to be the ones that outperform their FIP? After running the data for the three-year time period, I found the correlation to be .54. In this context, that is very strong. Here are a few of my thoughts on this...
- The strength of the correlation suggests that the primary cause of teams outperforming or underperforming their WAR is the disparity between a team's ERA and FIP.
- The correlation is not perfect, most likely due to rudimentary defensive metrics, the offensive dimension of WAR, defense being accounted for in WAR, and other minor factors.
- The correlation is still very strong, which shows that the factors that separate ERA and FIP outside of defense are very important. LOB%, BABIP, and the like are all very influential in the WAR/wins gap.
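For anyone who wants to repeat the correlation check on their own numbers, a minimal sketch looks like this. It computes a plain Pearson correlation between two lists: each team's wins above its WAR-implied total, and each team's FIP- minus ERA-. The function name and the tiny sample lists are placeholders of mine, not the data behind the .54 figure.

```python
import math


def pearson_r(xs, ys):
    """Pearson correlation coefficient between two equal-length lists."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two lists of the same length (at least 2 items)")
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined when one list is constant")
    return cov / math.sqrt(var_x * var_y)


# Placeholder values only, one entry per team.
wins_over_war = [1.0, 2.0, 3.0]     # Pythagorean wins minus WAR-implied wins
fip_minus_era = [1.0, 3.0, 2.0]     # FIP- minus ERA- (positive = beat FIP)

r = pearson_r(wins_over_war, fip_minus_era)
print(round(r, 2))
```

A value near 1 would mean the teams that beat their FIP are almost exactly the teams that beat their WAR; a value near 0 would mean the two have little to do with each other.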
The other commonly used version of WAR, rWAR, uses runs allowed average (RA) as its pitching statistic instead of FIP. Out of curiosity, I decided to see whether rWAR follows Pythagorean wins more closely than fWAR does. For the past three years, I gathered the number of wins by which each American League team's Pythagorean record was over or under its WAR. After doing this, I calculated the standard deviation for each type of WAR.
fWAR had a standard deviation of 4.4; the standard deviation for rWAR was 2.6. In other words, fWAR strayed from Pythagorean wins far more than rWAR did.
This makes sense. Since rWAR looks at runs allowed average while subtracting defensive contributions, it should correlate very closely with the number of runs a team gives up. fWAR, on the other hand, ignores runs completely, making it much more likely to veer away from the runs allowed.
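The spread comparison itself is simple to reproduce. The sketch below takes each team's Pythagorean wins and WAR-implied wins, subtracts one from the other, and reports the standard deviation of those gaps. I am assuming the population form of the standard deviation here (the sample form would give slightly larger numbers), and the win totals are invented for illustration.

```python
import statistics


def gap_spread(pythag_wins, war_wins):
    """Population standard deviation of (Pythagorean wins - WAR-implied wins)."""
    gaps = [p - w for p, w in zip(pythag_wins, war_wins)]
    return statistics.pstdev(gaps)


# Invented numbers for four teams.
pythag = [90, 80, 85, 75]
fwar_wins = [88, 82, 83, 77]   # gaps of +2, -2, +2, -2
rwar_wins = [89, 81, 84, 76]   # gaps of +1, -1, +1, -1

print(gap_spread(pythag, fwar_wins), gap_spread(pythag, rwar_wins))
```

The smaller the number, the more tightly that version of WAR tracks what the run totals say a team should have won.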
In the end, the difference between the two pitching WARs comes down to their mission. rWAR takes the finished product (runs allowed) and divides the value among the pitchers, even though that product depends on factors a pitcher does not completely control. fWAR builds from the bottom up, estimating the value of each pitcher with theoretical estimates that try to remove all the "impurities" found in rWAR. While rWAR's path allows it to stay more rooted in what actually happened, fWAR valiantly attempts to quantify the value of each player based strictly on things in their control.
To answer the initial question, the Rays have won more games than their fWAR would predict because they have posted ERAs far lower than their FIPs, even after adjusting for park. The reasons behind this are not as clear. Part of the incongruity is due to Hellickson's mystical ability to turn balls in play into outs. The number of shifts the Rays use is probably another part. Either way, the construction of the Rays' roster and the manner in which they have utilized it have allowed them to gain an edge on FIP, which is why the Rays have been better than their fWAR indicates. It will be interesting to see if that continues to be the case going forward.