James Shields' BABIP and inning totals over each of the last four seasons:
2006: .332 (124.2)
2007: .292 (215)
2008: .292 (215, not a misprint)
2009: .317 (219.2)
League-average BABIP against is generally between .300 and .305. That means Shields' seasons have split between better than average at preventing hits (2007 and 2008) and well worse than average (2006 and 2009). So what should we expect moving forward? Russell Carlton found that at 1,500 balls in play, the R value reached .50. Using the equation R = BIP/(BIP+1500) we get this equation for Shields:
Essentially, that means that to get a read on Shields' true-talent BABIP level, we need to regress his career BABIP by nearly 39% toward the team's BABIP against. Over the last three years the Rays' BABIP against has been .299, .285, and .338, which weights to a .304 BABIP. Frankly, I'm not too sold on using the 2007 figure because that's simply not representative of what to expect from this team defensively. Sub in a league-average season instead and the figure drops from .304 to .296. I'll use the .296 in the regression, although I can provide the .304 numbers if anyone wants them.
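For anyone who wants to check the team figure, here is a rough sketch of the weighting in Python. The post doesn't spell out the weights, so the 5/4/3 scheme below is an assumption on my part; it is simply one set that reproduces .304. It also assumes the three figures are listed most recent first (so .338 is 2007) and that the league-average stand-in is .305, the top of the range quoted earlier.

```python
# Rays' BABIP against over the last three seasons (figures from the text).
# ASSUMPTION: listed most recent first, so .338 is the 2007 season.
rays_babip = [0.299, 0.285, 0.338]

# ASSUMPTION: 5/4/3 weights, heaviest on the most recent season. The post
# does not state its weights; this is a guess that matches the .304 figure.
WEIGHTS = [5, 4, 3]


def weighted_babip(values, weights=WEIGHTS):
    """Return the weighted average of per-season BABIP values."""
    return sum(v * w for v, w in zip(values, weights)) / sum(weights)


# All three seasons as reported.
team_all = weighted_babip(rays_babip)

# Swap the 2007 season for a league-average year.
# ASSUMPTION: .305 stands in for "league average" here.
team_adjusted = weighted_babip([0.299, 0.285, 0.305])

print(f"{team_all:.3f} {team_adjusted:.3f}")
```

Under those assumptions the two values round to .304 and .296, in line with the figures above. A different weighting or league-average figure would shift them slightly.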
Upon doing that math, you'll find that his career BABIP (.306) is identical to the regressed rate (.306, rounded, mind you). Using that same technique, here are the regressed rates for the remainder of the Rays rotation:
All of this to say: Wade Davis only had 100 balls in play against him. Don't worry about his hit rate. That's a ridiculously small sample size, one that had to be regressed by more than 93%. At the same time, don't expect Price to continue at such a ridiculous pace of not allowing hits. Sonnanstine may look hittable or whatever, but in reality his expected BABIP is close to that of Shields, the Rays' best starter. (It's also worth noting that Sonnanstine's BABIP is close to Shane Reynolds' through 400 Major League innings, and lower than that of Brad Lidge, who has superior stuff in a reliever's role.)
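To make the Davis number concrete, here is a minimal sketch of the regression step in Python, built on the R = BIP/(BIP+1500) formula from earlier. The function names are mine, and the .250 observed BABIP in the last example is a made-up input for illustration, not Davis' actual rate.

```python
def reliability(bip, halfway=1500):
    """R = BIP / (BIP + 1500): weight kept on the pitcher's own BABIP."""
    return bip / (bip + halfway)


def regressed_babip(observed, bip, baseline, halfway=1500):
    """Blend an observed BABIP with a baseline, weighting the observation by R."""
    r = reliability(bip, halfway)
    return r * observed + (1 - r) * baseline


# At 1,500 balls in play the weights are split evenly (R = .50).
print(reliability(1500))  # 0.5

# Wade Davis: 100 balls in play, so nearly all the weight goes to the baseline.
davis_regress_share = 1 - reliability(100)
print(davis_regress_share)  # 0.9375

# HYPOTHETICAL: a .250 observed BABIP over 100 balls in play, regressed
# toward the .296 team figure used in this post.
example = regressed_babip(0.250, 100, 0.296)
print(f"{example:.3f}")
```

With only 100 balls in play, even a .250 hit rate gets pulled almost all the way back to the team figure, which is the whole point about Davis.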
Keep sample sizes and the granularity of data in mind before jumping to conclusions. Yeah, this is close to common sense, but sometimes it needs to be written, and showing the math helps with the stigma surrounding regression.