03 October 2007

Nostradamian baseball, part 1

It's October, and that means baseball playoffs!  They start today and it's time for my predictions.  So which were the best teams this year?

It's tempting just to look at teams' regular-season win-loss records, but as I pointed out in Pythagorean baseball, a team's win-loss record reflects luck in how the runs they scored were distributed among their games.  If team A beats team B 11-3 then loses to them 2-3, team B could be seen as lucky that team A's runs weren't more evenly distributed.  The sabermetrician Bill James came up with a method to predict win-loss record using only a team's runs scored and runs allowed, thus ignoring how runs are bunched among games, which in the long run is mostly luck.
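For the curious, here is a minimal Python sketch of the Pythagorean idea applied to the two-game example above.  It assumes the classic form of Bill James's formula with an exponent of 2; the exponent and the function name are illustrative choices, and other exponents are in common use.

```python
def pythagorean_win_pct(runs_scored, runs_allowed, exponent=2.0):
    """Estimate win percentage from runs scored and allowed.

    Assumes the classic Pythagorean form RS^x / (RS^x + RA^x).
    The default exponent of 2 is an assumption made for this sketch.
    """
    rs = runs_scored ** exponent
    ra = runs_allowed ** exponent
    if rs + ra == 0:
        raise ValueError("need at least one run scored or allowed")
    return rs / (rs + ra)


# Team A from the example: wins 11-3, then loses 2-3.
# Over the two games it scored 13 runs and allowed 6.
team_a = pythagorean_win_pct(11 + 2, 3 + 3)
print(f"Team A Pythagorean win pct: {team_a:.3f}")  # actual record is 1-1, or .500
```

With exponent 2 this works out to 169/205, about .824, against an actual win percentage of .500 for the same two games.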

But that's not the only source of luck that contributes to wins and losses.  The distribution of hits, walks and other individual plays among innings is also important and largely luck.  For example, if a team bats once through its lineup and hits three doubles and six groundball outs, will it score one or two runs in those two innings?

What is needed to factor out this kind of luck is a way to predict runs scored and allowed from more atomic statistics like hits, walks, doubles, stolen bases, etc., just as the Pythagorean approach predicts wins and losses from runs scored and allowed.  Base Runs (which can be seen as an improved version of Bill James's Runs Created) does just that.  So we can use Base Runs to estimate runs scored and allowed, then apply the Pythagorean approach to the results to estimate wins and losses.  This way both run-bunching among games and play-bunching among innings are factored out.  Just as past Pythagorean win percentage predicts future win percentage better than past win percentage does, past Base Runs predicts future runs better than past runs do.  So past Base-Runs-Pythagorean ("Nostradamian"?) win percentage should predict future win percentage better than past Pythagorean win percentage does.  The Nostradamian, Pythagorean and actual win percentages for all teams in 2007:

Team          Nostradamian  Pythagorean  Actual
Red Sox       .630370       .624234      .592593
Blue Jays     .550323       .533992      .512346
Devil Rays    .438791       .414708      .407407
White Sox     .437440       .413416      .444444
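The two-step estimate behind the Nostradamian column can be sketched in Python roughly as follows.  This is only an illustration: it assumes one commonly cited simple version of the Base Runs formula and a Pythagorean exponent of 2, neither of which is necessarily what produced the numbers above, and the season totals in the usage example are made up.

```python
def base_runs(ab, h, tb, hr, bb):
    """Estimate runs from at-bats, hits, total bases, home runs and walks.

    Assumes a simple Base Runs variant of the form A*B/(B+C) + D.
    The coefficients below are one commonly cited set, not the only one.
    """
    a = h + bb - hr                                       # baserunners
    b = (1.4 * tb - 0.6 * h - 3 * hr + 0.1 * bb) * 1.02   # advancement
    c = ab - h                                            # outs
    d = hr                                                # automatic runs
    return a * b / (b + c) + d


def pythagorean_win_pct(runs_scored, runs_allowed, exponent=2.0):
    """Classic Pythagorean estimate; the exponent is an assumption."""
    rs = runs_scored ** exponent
    ra = runs_allowed ** exponent
    return rs / (rs + ra)


def nostradamian_win_pct(batting, pitching, exponent=2.0):
    """Base Runs for and against, then Pythagorean on the estimates."""
    est_runs_scored = base_runs(**batting)
    est_runs_allowed = base_runs(**pitching)
    return pythagorean_win_pct(est_runs_scored, est_runs_allowed, exponent)


# Hypothetical season totals, invented for illustration only.
batting = dict(ab=5600, h=1500, tb=2400, hr=170, bb=550)
pitching = dict(ab=5500, h=1400, tb=2200, hr=150, bb=500)
print(f"{nostradamian_win_pct(batting, pitching):.3f}")
```

The `pitching` line is simply what opposing batters did against the team, so the same function estimates runs allowed.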

So, based on these results, it's reasonable to predict:

Division Series           League Championship Series   World Series
Red Sox over Angels       Red Sox over Yankees         Red Sox over Rockies
Yankees over Indians
Cubs over Diamondbacks    Rockies over Cubs
Rockies over Phillies

In fact, Nostradamian winning percentage says that the Red Sox were by far the best team in baseball this year but were actually quite unlucky.  On the other hand, the Diamondbacks deserved to have a losing record but squeaked into the playoffs by being the luckiest team in baseball, winning almost 14 games more than they deserved.  The unluckiest team was the Athletics, who lost almost 9 games more than they deserved.  The two best teams not to make the postseason were the Padres, who lost a one-game wild-card playoff to the Rockies in extra innings Monday night, and my Blue Jays, who were unlucky not only in play distribution but also in being in the same division as the Red Sox and Yankees.
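One way to put "games more than they deserved" into numbers is to multiply the gap between actual and Nostradamian win percentage by the 162-game schedule.  The sketch below assumes that definition of luck (the function name is illustrative) and uses only the four teams whose percentages appear in the table above.

```python
GAMES = 162  # regular-season schedule length

# (Nostradamian, Pythagorean, actual) win percentages from the table above
teams = {
    "Red Sox":    (0.630370, 0.624234, 0.592593),
    "Blue Jays":  (0.550323, 0.533992, 0.512346),
    "Devil Rays": (0.438791, 0.414708, 0.407407),
    "White Sox":  (0.437440, 0.413416, 0.444444),
}


def luck_in_games(expected_pct, actual_pct, games=GAMES):
    """Wins above (positive) or below (negative) the expected total."""
    return (actual_pct - expected_pct) * games


for team, (nostradamian, _pythagorean, actual) in teams.items():
    print(f"{team:<12}{luck_in_games(nostradamian, actual):+.1f}")
```

By this measure the Red Sox and Blue Jays each finished about six wins short of their Nostradamian totals, while the White Sox finished about one win ahead.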

But if luck can vary so much among teams over a 162-game season, just imagine how much it varies in a 5- or 7-game postseason series.  In fact, Athletics general manager Billy Beane famously said that his sabermetric strategies don't work in the playoffs.  So after all that analysis I'm still going to go with my heart and predict:
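To get a feel for how little a short series settles, here is a rough sketch of the chance that the better team wins a best-of-5 or best-of-7.  It assumes every game is independent and that the better team has the same chance of winning each one; the .550 figure is a made-up example, not something derived from the table.

```python
from math import comb


def series_win_prob(p, best_of):
    """Chance of winning a best-of-N series given per-game win chance p.

    Assumes independent games with a constant p, which is a simplification
    (no home-field advantage, no pitching matchups).
    """
    needed = best_of // 2 + 1   # wins required to take the series
    q = 1.0 - p
    # Win the clinching game after losing k of the earlier games.
    return sum(comb(needed - 1 + k, k) * p ** needed * q ** k
               for k in range(needed))


for n in (5, 7):
    print(f"best of {n}: {series_win_prob(0.55, n):.3f}")
```

Under these assumptions a team that wins 55% of its games takes a best-of-5 about 59% of the time and a best-of-7 about 61% of the time, so the weaker team still wins roughly two series in five.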

Division Series           League Championship Series   World Series
Red Sox over Angels       Red Sox over Indians         Cubs over Red Sox
Indians over Yankees
Cubs over Diamondbacks    Cubs over Phillies
Phillies over Rockies

Go Cubbies!

