MEANinglessness

Hey, so I took a few minutes earlier today to do a little personal research about the effects of variance on WAR and value in general. I chose WAR because I am lazy and it is a neatly packaged counting stat. The basic idea is that value is based on more than just the "average" production. You pay a premium for reliability and receive a discount (sort of) for injury-prone guys, surgery guys, inconsistent players, etc. I took a look at some of the 2009 Rays pitchers and position players to see exactly how variance can affect the actual value of a player.

For this, I chose players from the 2009 squad who contributed, and used their last four years as tracking data. I only examined WAR totals for each year and calculated the variance and SD. The results may be interesting, but they're not that revealing without another piece of the puzzle. You can get the spreadsheet here, but here are some of the more interesting results:

  • Zobrist had a mean of 2.0 WAR and a standard deviation of 4.5 WAR, which is an expression of his newfound value - I will explain later.
  • James Shields had a mean of 3.7 WAR with a standard deviation of 1.2 WAR, demonstrating what a great deal Friedman has for this guy.
  • The most consistent players out of the group were JP Howell, Akiiiiiiiiiiinori Iwamura, and Chad Bradford. Each of these guys had a SD of only 0.7 WAR.

How do we interpret this data? Well, the mean is not a prediction, but a description of the data. In theory, the probability of landing exactly on the mean is infinitesimally small, or 0. However, we describe probability as "area under the curve" and so we can make predictions about ranges. I'll give you a simple example of the difference between the two. I ask "how many pounds will Prince Fielder gain this offseason?" One answer may simply be the average number of pounds gained per previous offseason, and this is a fine thing to do on the surface. But what if he gained 50 pounds one year, lost 50 pounds the next year, and gained 5 pounds last offseason? Now we have a lot of "Spreadedness" that isn't described by the mean alone, which would be only a little bit positive.

Instead, we might say: There is a 90% probability that he will gain between -10 and 15 pounds, etc. This means that we don't know "what" he's going to gain, but past results indicate that there is a 90% chance he will gain SOMEWHERE in that interval. We don't know where, but it's in there.
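To put rough numbers on the Fielder example, here is a minimal Python sketch. The three offseason values (+50, -50, +5) are the made-up ones from above, and using the sample standard deviation as the measure of spread is my assumption, not something the spreadsheet dictates.

```python
import statistics

# Hypothetical offseason weight changes from the example above (pounds).
changes = [50, -50, 5]

mean_change = statistics.mean(changes)  # the "average" answer
spread = statistics.stdev(changes)      # sample SD (n - 1 denominator)

print(f"mean = {mean_change:.1f} lb, SD = {spread:.1f} lb")
```

The mean comes out under 2 pounds while the SD is around 50 pounds, which is exactly the "Spreadedness" the average hides.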

I chose a 75% interval, which (treating the yearly totals as roughly normal) means about 1.1503 * SD on either side of the mean. This means that you can project the player to produce somewhere in the range from AVG - 1.1503 * SD to AVG + 1.1503 * SD. I called the product 1.1503 * SD the "Upside number," meaning the amount of potential up (or down) side within that 75% interval. I'm an optimist.
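For anyone wondering where a multiplier like 1.1503 comes from, here is a small sketch that assumes the yearly WAR totals behave like draws from a normal curve (a big assumption with four data points). The function name `sd_multiplier` is mine, made up for illustration.

```python
from statistics import NormalDist

def sd_multiplier(coverage: float) -> float:
    """SDs on each side of the mean that cover `coverage` of a normal curve."""
    # A central interval leaves (1 - coverage) / 2 in each tail,
    # so the upper edge sits at the 0.5 + coverage / 2 quantile.
    return NormalDist().inv_cdf(0.5 + coverage / 2)

for coverage in (0.68, 0.75, 0.80):
    print(f"{coverage:.0%} interval: {sd_multiplier(coverage):.4f} * SD")
```

Under that normal assumption, a 75% central interval corresponds to about 1.1503 SDs, a 68% interval to about one SD, and an 80% interval to about 1.28 SDs.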

Ben Zobrist had an upside number of 5.2 WAR, which is astounding. Let me give you the implications of that: Based on past performance, it would appear that he will produce with a 75% probability between -3.2 and 7.2 WAR. The uselessness of that statement should be obvious to you; it is like saying "he will be better than Julio Lugo and worse than Albert Pujols." (actually, he was more valuable last year!) We'll look into how to translate that into a useful number in a bit.
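The Zobrist range can be reproduced from the two numbers quoted earlier (mean 2.0 WAR, SD 4.5 WAR). This is only a sketch of the arithmetic; `upside_interval` is a hypothetical helper name, and 1.1503 is the multiplier from the post.

```python
def upside_interval(mean_war: float, sd_war: float, multiplier: float = 1.1503):
    """Return (upside number, low end, high end) for a mean and SD in WAR."""
    upside = multiplier * sd_war
    return upside, mean_war - upside, mean_war + upside

# Zobrist's mean and SD as quoted in the post.
upside, low, high = upside_interval(2.0, 4.5)
print(f"upside = {upside:.1f}, range = {low:.1f} to {high:.1f} WAR")
```

Rounded to one decimal, that gives the 5.2 upside number and the -3.2 to 7.2 range.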

Carl Crawford had a mean of 4.0 WAR and an SD of 1.3 WAR, which makes him a fairly reliable producer in LF. His upside number is 1.3 * 1.1503, which is about 1.5. This means that with 75% confidence we can say that he will produce between 2.5 and 5.4 WAR next year. That consistency is an extremely valuable feature of Crawford. Whether he rakes triples and steals bases or struggles at the plate, his defensive prowess has kept him a true asset.

The Boss is projected between 0 and 5.3 WAR, while Gabe Gross is projected between 0.6 and 2.6 WAR. Think about it a little bit.

Grant Balfour seems to be a bit risky, projected between -0.2 and 2.0 WAR, while JPH (sitting comfortably between 0.4 and 1.9 WAR) offers a tiny bit less upside but much more safety in return.

The problem with these ranges is that there is no directionality, meaning that they are constructed as if time is not a major factor in the progression (or regression, and definitely the diminishing) of a player's skills. In other words, how much of that SD should we really attribute to time? The idea is that:

VARIANCE = VARIANCE DUE TO CHANGE IN TIME + VARIANCE DUE TO OTHER SHIT

This is not exact, but it's a nice approximation of a more difficult concept. What we do here is find the r value between the four yearly WAR totals and (1, 2, 3, 4), which stand in for the actual years. That correlation value is then multiplied by the SD-based upside number, since:

r^2 = proportion of variance explained by the trend over time

So, since SD^2 = Var, r scales the SD the way r^2 scales the variance, and we only need to multiply r by the upside number to find a more meaningful, and quick, point prediction. Since correlation can be negative, it gives us "direction" in some sense. For example, Ben Zobrist has a great correlation value (he is trending strongly upward over the last four years) and so his trend adjustment is 4.5 WAR. Adding that to his average of 2.0 WAR gives us 6.5 WAR, which is probably just a bit optimistic.
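Here is a rough Python sketch of that trend adjustment as I read it: correlate the yearly WAR totals with (1, 2, 3, ...), multiply r by the upside number, and add the result to the mean. The WAR values below are made up for illustration (they are not from the spreadsheet), the helper names are hypothetical, and the sample SD is an assumption.

```python
import math
import statistics

def pearson_r(xs, ys):
    """Pearson correlation; undefined (division by zero) if either series is constant."""
    mx, my = statistics.mean(xs), statistics.mean(ys)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)

def trend_projection(war_by_year, multiplier=1.1503):
    """Mean WAR plus r * upside number, with years coded as 1, 2, 3, ..."""
    years = list(range(1, len(war_by_year) + 1))
    r = pearson_r(years, war_by_year)
    upside = multiplier * statistics.stdev(war_by_year)  # sample SD (assumed)
    return statistics.mean(war_by_year) + r * upside

made_up_war = [1.0, 2.5, 2.0, 4.0]  # hypothetical four-year history
print(round(trend_projection(made_up_war), 2))

# With only two seasons, any change at all is a "perfect" trend.
print(pearson_r([1, 2], [3.0, 5.0]))
```

The last line shows why short track records are a problem for this method: two points always give r of exactly 1 or -1, so the full upside number gets added or subtracted.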

However, we do find some predictions more believable than others. For example, I have Shields at 4.7 WAR (probably a bit high) and Garza at 3.6 WAR (about right), while Kazmir is highly discounted by the system (at only 1.4 WAR because of a very bad correlation coefficient). For batters, I have Aki at 1.5 WAR and Longo at a whopping 7.8 WAR. And therein lies the problem: for guys like Longo with very limited data points, the results will be unrealistic. He is trending "perfectly" upwards right now, since there are only two points. Bartlett looks about a win too high at 3.7 WAR, while Aybar seems about right at 0.7.

After some fiddling around, I decided to make the middle interval 68% instead of 75%, which could narrow things down a bit (and remove some of the optimism!). Check out the sheet for these best (but still lazy) predictions.

This post was written by a member of the DRaysBay community and does not necessarily express the views or opinions of DRaysBay staff.

Comments

It's a little misleading to look at just standard deviation to determine reliability

As you can see, relief pitchers’ standard deviations are much lower because their overall win values are lower. Additionally, hitters are more prone to having single “career years” but that doesn’t make them any less reliable. Given that the density curve for, say, Longo is probably right skewed, I don’t think you can just add and subtract standard deviations.

by benderbrodriguez on Mar 6, 2010 7:16 PM EST reply actions  
