16 June 2007

Pythagorean baseball

Luck plays a bigger part in baseball than most fans assume.  While long-term performance can usually be predicted fairly accurately from past performance (if appropriate corrections are made to account for changing circumstances such as park, player age and level of competition), short-term results depend mostly on chance.

One way a team can get lucky is to score and allow runs at the right times, winning by a little and losing by a lot.  Bill James discovered that, in the long run, a team's winning percentage can be predicted very accurately by looking only at the number of runs it's scored and allowed.  For example, last year the Cleveland Indians scored 870 runs and allowed 782.  Intuitively, one might think that they would have been expected to win 870 / (870 + 782) = 52.6634% of their games.  But James found that the effect of the run differential is stronger than that, and the expected winning percentage is closer to 870² / (870² + 782²) = 55.3118%, giving a "Pythagorean" win-loss record of about 90-72.  (Actually, the best exponent seems to be around 1.83 rather than 2, but 2 is a good approximation.  Interestingly, different exponents work best for different sports like football (2.85) and basketball (16.5).)
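
For anyone who wants to check the arithmetic, here is a minimal Python sketch of the calculation.  The function name pythagorean_pct and the 162-game season length are my own assumptions for illustration; the run totals and exponents are the ones quoted above.

```python
def pythagorean_pct(runs_scored, runs_allowed, exponent=2.0):
    """Expected winning percentage from runs scored and allowed.

    Hypothetical helper name; exponent 2 is the classic approximation.
    """
    rs = runs_scored ** exponent
    ra = runs_allowed ** exponent
    return rs / (rs + ra)


# 2006 Cleveland Indians run totals
scored, allowed = 870, 782
games = 162  # assumed length of a full season

naive = scored / (scored + allowed)                      # simple run share
pyth = pythagorean_pct(scored, allowed)                  # exponent 2
pyth_183 = pythagorean_pct(scored, allowed, exponent=1.83)

print(f"run share:      {naive:.4%}")
print(f"exponent 2:     {pyth:.4%} (about {round(pyth * games)} wins)")
print(f"exponent 1.83:  {pyth_183:.4%}")
```

Passing a different exponent is all it takes to try the values suggested for other sports.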

But Cleveland's 2006 record wasn't 90-72 but only 78-84, good enough for fourth place in the AL Central.  More than any other team that year, they tended to win by a lot and lose by a little, wasting some of their run production and prevention.  By this measure, they were about 12 games unlucky.  The 2006 New York Mets also had a Pythagorean win-loss record of 90-72, but they ended up at 97-65 and won the NL East by 12 games, 7 of them by luck.
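
The "games of luck" figure is just actual wins minus Pythagorean wins.  A rough Python sketch follows, with hypothetical function names and the exponent-2 approximation assumed.  The post doesn't give the Mets' run totals, so their expected wins are taken directly from the 90-72 figure above rather than recomputed.

```python
def pythagorean_wins(runs_scored, runs_allowed, games, exponent=2.0):
    """Expected wins over a number of games (hypothetical helper)."""
    rs = runs_scored ** exponent
    ra = runs_allowed ** exponent
    return games * rs / (rs + ra)


def luck_in_games(actual_wins, expected_wins):
    """Positive means lucky, negative means unlucky."""
    return actual_wins - expected_wins


# 2006 Cleveland: 870 runs scored, 782 allowed, actual record 78-84
cleveland_expected = pythagorean_wins(870, 782, games=78 + 84)
cleveland_luck = luck_in_games(78, cleveland_expected)

# 2006 Mets: actual record 97-65, Pythagorean record 90-72 as stated above
mets_luck = luck_in_games(97, 90)

print(f"Cleveland: {cleveland_luck:+.1f} games")
print(f"Mets:      {mets_luck:+d} games")
```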

So how would this season be different if this kind of luck were factored out?  The Athletics, Padres, Tigers and Cubs would be leading their divisions.  Boston's lead would be down to 1½ games.  And the worst record in the majors would belong to the world champion St. Louis Cardinals, who lost to Oakland last night 14-3 but have won enough close games to be ahead of three teams in the NL Central.

Note that this measure only expresses luck due to run distribution among a team's games.  It doesn't capture the luck due to hit/walk distribution within a game, which can be substantial.  (Think of bunching five singles in one inning as opposed to spreading them out among five innings.)  But that's a topic for another time.

