
Understanding Small Sample Sizes, or Evaluating Talent in April

If you are like me, you probably have a million and one questions about the Rays running through your head these days. Will Navarro, Burrell, or Upton rebound? Will Bartlett regress? Will Longoria continue his year-over-year improvement? Will Garza take his game to the next level? Will Zorilla keep mashing home runs? The season has finally begun and it's so exciting to see this team in action. The offense has been encouraging, the starting pitching staff has been downright nasty, and there have been plenty of thrilling games already. Opening Day was like Christmas, and since then we've been getting a present every day. It's awesome.

And so, with all this pent-up enthusiasm, I find myself perusing FanGraphs, trying to find something - anything - to write about. I want to be able to answer some of those questions, in particular the question of whether Burrell has anything left to offer us. The problem is, it's way too early. Way, way too early. We talk about small sample sizes frequently here on DRaysBay, so here are the points at which certain statistics become reliable:

Offense Statistics:

  • 50 PA: Swing%
  • 100 PA: Contact Rate
  • 150 PA: Strikeout Rate, Line Drive Rate, Pitches/PA
  • 200 PA: Walk Rate, Ground Ball Rate, GB/FB
  • 250 PA: Fly Ball Rate
  • 300 PA: Home Run Rate, HR/FB
  • 500 PA: OBP, SLG, OPS, 1B Rate, Popup Rate
  • 550 PA: ISO

Pitching Statistics:

  • 150 BF: K/PA, Ground Ball Rate, Line Drive Rate
  • 200 BF: Fly Ball Rate, GB/FB
  • 500 BF: K/BB, Popup Rate
  • 550 BF: BB/PA

Don't thank me; I'm just the messenger. These numbers were derived by the saberist Pizza Cutter and although his blog is now defunct, you can find them on the Saber Library website anytime you need them. Or if you'd prefer, read his original article here.

What do these numbers mean, though? It's well and good to throw them out there, but how does this help our analysis at all? Well, I'm so glad you asked!


In short, these numbers represent the minimum number of plate appearances or batters faced that a batter or pitcher needs before a given statistic can be deemed indicative of their true talent level. Notice that statistics like BA and ERA are missing from the lists: they don't stabilize even over the course of a full season, which is a great example of why one shouldn't use those statistics to discuss a player's ability level. This is also a great reminder of why spring training statistics are meaningless. Not only are pitchers and batters both working on adjustments, but you're dealing with samples as small as 25 PA or 50 BF. Nothing stabilizes in a sample that small.
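To make the thresholds concrete, here is a minimal Python sketch that looks up which statistics have reached their stabilization point for a given sample. The tables simply transcribe the two lists above; the names `HITTER_THRESHOLDS`, `PITCHER_THRESHOLDS`, and `stabilized` are my own inventions for illustration, not anything from Pizza Cutter's work.

```python
# Stabilization points transcribed from the lists above.
# Keys are plate appearances (hitters) or batters faced (pitchers).
HITTER_THRESHOLDS = {
    50: ["Swing%"],
    100: ["Contact Rate"],
    150: ["Strikeout Rate", "Line Drive Rate", "Pitches/PA"],
    200: ["Walk Rate", "Ground Ball Rate", "GB/FB"],
    250: ["Fly Ball Rate"],
    300: ["Home Run Rate", "HR/FB"],
    500: ["OBP", "SLG", "OPS", "1B Rate", "Popup Rate"],
    550: ["ISO"],
}
PITCHER_THRESHOLDS = {
    150: ["K/PA", "Ground Ball Rate", "Line Drive Rate"],
    200: ["Fly Ball Rate", "GB/FB"],
    500: ["K/BB", "Popup Rate"],
    550: ["BB/PA"],
}


def stabilized(sample, thresholds):
    """Return the stats whose cutoff is at or below `sample`, earliest first."""
    stats = []
    for cutoff in sorted(thresholds):
        if sample >= cutoff:
            stats.extend(thresholds[cutoff])
    return stats


# A spring-training-sized sample clears nothing at all.
print(stabilized(25, HITTER_THRESHOLDS))   # []
# A hitter with 160 PA clears the first three tiers.
print(stabilized(160, HITTER_THRESHOLDS))
```

Anything the function does not return for a given sample is a number you should still treat as noise.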

Now that games actually matter, the statistics mean slightly more than spring training stats. Of course we shouldn't be attempting to draw large, sweeping conclusions from the statistics right now, but we can begin to glean something from the numbers. We can start by looking at all the stats that stabilize quickly, even if a player hasn't quite hit the threshold yet. For hitters, I'm paying attention to their Swing%, Contact Rate, Pitches/PA, LD%, and Strikeout Rate. These numbers will give us a good idea of whether a player has changed their approach at the plate and whether they're making solid contact. For pitchers, I'm concerned most with K/9, GB%, LD%, and Swinging Strike Rate. I am adding Swinging Strike Rate for two reasons: one, I don't believe Pizza Cutter tested it initially, and two, swinging strikes are intimately related to strikeouts. Increase one and you should increase the other.

Also, scouting data is by far the best and most reliable information we can have with samples this small, but sadly none of us here are professional scouts. Observations will do in a pinch, though, so I hope people continue to share their impressions of batters and pitchers over these next couple of weeks. I'm not talking about stuff like, "Navarro sux," but stuff like, "Although he's been showing a more discerning eye at the plate this season, Navarro's hits are still weak. He appears lost at the plate at times and has had many bloop hits. I'm not impressed." (Note: this is mostly fictional, although it is true that I haven't been impressed with Navarro yet.) Share your observations, but make them robust. Did their swing look good? Did that pitch have lots of movement, even if it was hit hard? How was that pitch sequence? Keep asking these sorts of questions.

Although these methods aren't necessarily mainstream or sexy, at this point in the season they are the best way to properly evaluate talent. We'll be able to use more and more statistics as the season progresses and sample sizes grow, but for now these are our tools. For reference, batters who have been playing every day have already accumulated 30-40 PA, meaning that they're on the verge of the first cutoff (50 PA for Swing%), and pitchers who have made two starts are around 50 BF, meaning they're still well short of theirs (150 BF). Or in other words, this dance has just barely begun.
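For a rough sense of how far along everyone is, the sketch below expresses the current samples as a share of the earliest cutoff on each list. The 35 PA figure is just my assumed midpoint of the 30-40 PA range, and the function name `progress` is hypothetical.

```python
def progress(sample, cutoff):
    """Share of a stabilization cutoff already reached, capped at 1.0."""
    return min(sample / cutoff, 1.0)


# Earliest cutoffs from the lists above: Swing% for hitters (50 PA);
# K/PA, ground ball rate, and line drive rate for pitchers (150 BF).
FIRST_HITTER_CUTOFF = 50
FIRST_PITCHER_CUTOFF = 150

everyday_hitter_pa = 35    # assumed midpoint of the 30-40 PA range
two_start_pitcher_bf = 50  # roughly two starts

hitter_share = progress(everyday_hitter_pa, FIRST_HITTER_CUTOFF)
pitcher_share = progress(two_start_pitcher_bf, FIRST_PITCHER_CUTOFF)

print(f"Hitter: {hitter_share:.0%} of the way to {FIRST_HITTER_CUTOFF} PA")
print(f"Pitcher: {pitcher_share:.0%} of the way to {FIRST_PITCHER_CUTOFF} BF")
```

Under those assumptions an everyday hitter is about 70% of the way to the Swing% cutoff, while a two-start pitcher is only about a third of the way to his first one.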

Comments


Great stuff Steve

One thing that must be looked at for pitchers, even in small sample sizes (SSS), is a drop in velocity: a drop of more than 3 mph should be cause for concern.


by radiohix on Apr 17, 2010 11:47 AM EDT

Good point

I’d consider velocity included in “scouting data”, although it’s by far one of the most important things to look at for pitchers. Always a concern during the beginning of a season, especially for pitchers with injury concerns.


by Steve Slowinski on Apr 17, 2010 11:54 AM EDT

This problem is one of the reasons why I think some sort of rolling chart would make sense

With regard to that, and to sample sizes in general: what are the statistical problems with linking the end of last year's numbers with the start of this year's?

As in 25 PA last year, 25 PA this year.


by matthan on Apr 17, 2010 1:33 PM EDT

I know some sabermetricians sneer at catcher ERA

But I can’t help but say that Shoppach has been calling better pitch sequences.

by benderbrodriguez on Apr 17, 2010 4:00 PM EDT via mobile

Founded in 2005, DRaysBay is home to, "Progressive statistical analysis and reasoned argument."

Please read our Community Guidelines.

FanPosts

Community blog posts and discussion.

Recommended FanPosts

Scaled_php_small
Rays Community Prospect #31 Runoff
127992041_extra_large_small
Fantasy Baseball 2012

Recent FanPosts

Scaled_php_small
Rays Community Prospect #33
Scaled_php_small
Rays Community Prospect #32
Scaled_php_small
Rays Community Prospect #31
Scaled_php_small
Rays Community Prospect #30 (Again)
Scaled_php_small
Rays Community Prospect #30 Runoff
Small
Take A Moment To Rosterbate
Scaled_php_small
Rays Community Prospect #30
Cloudtree_small
Statistics Manager at ESPN (Job Opening)

+ New FanPost All FanPosts >

FanShots

Quick hits of video, photos, quotes, chats, links and lists that you find around the web.

Recent FanShots

ESPN Chat with Matt Moore
Danny Clyburn: 1974-2012
Joe Maddon Town Hall Contest
Hickey said as of now all of the starters -- Wade Davis, Jeff Niemann,...
White Sox sign Dan Johnson
Indians acquire Canzler
Justin Ruggiano to Elect Free Agency
Dougdirt over at MinorLeagueBall compiled John Sickels' rankings with WAR values from Victor Wang's research.

Thread here.
The increasingly desperate search for offense has caused some teams to...
Zobrist wallpaper I made :]
Actual Link: http://i1207.photobucket.com/albums/bb472/lewiedesigns/Wallpapers/Zobristwallpaper.jpg

+ New FanShot All FanShots >

DRB Fantasy Baseball

Friends of the Site

DRB Suggestion Box

Drb4_medium


Managers

Slowsky__1__small Steve Slowinski

Dad_small Jason Collette

Brad_small BWoodrum

Price_small Erik Hahmann

Analysts

Lob-city_design_small rglass44

Untitled_small EminenceFront

Small Mulva

Rutg_uakjmedjwh9ndzd4lkll_small Imperialism32

100_1952_small MrNegative1

Steak-with-crown_small CBJones

Whelk_small Whelk

Small PGP

Scaled_php_small mr. maniac

Tampa_theatre_small jcmitchell

Me_small John Gregg