Who is the best hitter in baseball history?
It’s a common question, and one with a number of available
responses. You can pick the player with the highest batting average (Ty Cobb), on-base percentage (Ted
Williams), or slugging percentage (Babe Ruth). If you prefer counting stats, you can take the leader in hits (Pete Rose), runs (Rickey Henderson), RBI or total bases (Hank Aaron), or
home runs (Barry Bonds). Ruth would probably be the most common statistical
answer, but the basic (and advanced) statistics are the beginning of the discussion, not the
end. There are numerous other arguments you can make – you can compensate for park effects or military service or league strength and come up with Williams or Bonds easily
enough.
Who is the best postseason hitter in baseball history?
Or, more to the point, how do you answer that question?
Much like in the regular season, you can try looking at the postseason leaderboards. Bobby Brown,
who played semi-regularly for the Yankees in four World Series from 1947-51, is
the all-time postseason leader in batting average and on-base percentage (among players with at least 40
plate appearances); Carlos Delgado, who played 10 postseason games for the ’06
Mets, leads in slugging, narrowly edging Troy Glaus (.757-.756). If you prefer
OPS, you can go with Willie Aikens of the 1980-81 Royals and his 1.215 mark.
But the rate stats have obvious issues, most notably the
fact that all of the all-time leaders amassed between 40 and 50 postseason plate
appearances. You can counteract that problem by examining raw totals – Derek Jeter leads in runs, hits,
doubles, and total bases; Manny Ramirez leads in homers; and Bernie Williams leads in RBI.
That presents another obvious issue, which is the fact that current-day
players have two additional rounds of postseason play in which to accumulate
statistics. And if you try to balance that effect by looking at totals for the
World Series only, you end up going too far in the other direction – Mickey Mantle and Yogi Berra lead in
almost everything, because they were the two best players on the best team in
an 8-team league in which the team with the best record went straight to the
Series every year without having to risk an LCS. No matter what you look at,
the sample sizes are so small and so disparate that the methods generally used
to evaluate hitters in the regular season don’t work nearly as well in the
playoffs. And as far as I’m aware, nobody has taken the trouble to develop a
metric that’s designed to handle the unusual challenges presented by the postseason.
Until now.
This method has its origins in an offhand comment made
during an NBA Finals pregame show. It was Game 4 of the 2007 Finals, in which
the San Antonio Spurs would go on to complete their sweep of the Cleveland
Cavaliers. Before the game, the sideline reporter discussed the preparations Cleveland's LeBron James was making for “the biggest game of his career.”
That struck me as a glaringly ridiculous classification. The Cavs
were behind 3-0 in the series; even if they won Game 4, they would still have to win three more
games to take the title, two of which would be on the road. For all realistic
purposes, the series had already ended. The first three games, all of which
were played when Cleveland had a much greater chance of victory in the series,
were all substantially more important than this one. Moreover, the Cavs had
just come from a six-game showdown with the Detroit Pistons in the Eastern
Conference Finals, as well as a six-game conference semi against New Jersey.
The odds of Game 4 of a 3-0 Finals being the weightiest game of the group
struck me as minuscule.
The next day, I did the requisite calculations of game
importance in a postseason series. The calculation goes as follows: find the
odds of the team emerging victorious in the series if they win the game in
question, and the odds if they lose. The difference between those figures gives you the importance of the game in terms of winning the series. For example, in
Game 7 of a best-of-seven matchup, the team that wins the game also wins the
series, so it has 100% weight. In a Game 6, if the team that leads 3-2 wins the
game, they also win the series; if not, they still have one more chance at it
in Game 7, which gives them a 50% shot. The difference in the two probabilities
is 50%, which is also the importance of Game 6. The calculations for earlier
games in the series can either be done by working backward from the ones for
the later games, or by using the binomial distribution, which is readily
available in Excel and similar programs.
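The "working backward" approach can be sketched in a few lines of Python. This is a minimal illustration rather than the spreadsheet I actually used; the function names `series_win_prob` and `game_importance` and the parameters `needed` and `p` are made up for the example, and it assumes each game is an independent coin flip won with probability `p`.

```python
from functools import lru_cache


@lru_cache(maxsize=None)
def series_win_prob(wins, losses, needed=4, p=0.5):
    """Chance that a team sitting at wins-losses goes on to take the series.

    `needed` is the number of wins that clinches (4 in a best-of-7) and
    `p` is the assumed chance of winning any single game.
    """
    if wins == needed:
        return 1.0
    if losses == needed:
        return 0.0
    # Work backward: the odds now are a weighted average of the odds
    # after winning the next game and the odds after losing it.
    return (p * series_win_prob(wins + 1, losses, needed, p)
            + (1 - p) * series_win_prob(wins, losses + 1, needed, p))


def game_importance(wins, losses, needed=4, p=0.5):
    """Series odds after a win minus series odds after a loss."""
    return (series_win_prob(wins + 1, losses, needed, p)
            - series_win_prob(wins, losses + 1, needed, p))


if __name__ == "__main__":
    for state in [(3, 3), (3, 2), (2, 1), (0, 0), (3, 0)]:
        print(state, game_importance(*state))
```

Calling `game_importance(3, 0)` covers the LeBron example: the team leading (or trailing) 3-0 moves its series odds by only 12.5% in Game 4.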
The weights of the games in every possible state of a best-of-7
series are as follows, in descending order (assuming the two teams each have
50% chances in each game):
Game | Series state | Importance
Game 7 | 3-3 | 100%
Game 6 | 3-2 | 50%
Game 5 | 2-2 | 50%
Game 4 | 2-1 | 37.5%
Game 3 | 1-1 | 37.5%
Game 2 | 1-0 | 31.25%
Game 1 | 0-0 | 31.25%
Game 5 | 3-1 | 25%
Game 3 | 2-0 | 25%
Game 4 | 3-0 | 12.5%
Note that since the chances of the two teams winning the series have to change by the same amount (in opposite directions) in each game, the weights are the same for both participants.
It is quite simple to run the same calculations for
best-of-5 series (which are effectively best-of-7 series in which the teams are
spotted a game each); it is only slightly more difficult to perform them for
best-of-9 series (the format used in the World Series in 1903 and 1919-21). It
is also straightforward to turn a series-based calculation into a
championship-based one: with evenly matched teams, each round further from the
championship cuts the weights in half. Game 7 of an LCS, for instance, carries a
championship weight of 50%.
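For series of other lengths, the binomial-distribution route is probably the easier one. The sketch below is one way to set it up, not necessarily how my own spreadsheet is built; `series_odds`, `championship_importance`, and the `rounds_before_final` parameter are hypothetical names, and the halving per round assumes the team would be a 50-50 proposition in every later series.

```python
from math import comb


def series_odds(need, opp_need, p=0.5):
    """Chance of getting `need` more wins before the opponent gets `opp_need`.

    Pretend the teams play out all need + opp_need - 1 remaining games;
    the team takes the series exactly when it wins at least `need` of them.
    """
    n = need + opp_need - 1
    return sum(comb(n, k) * p**k * (1 - p)**(n - k)
               for k in range(need, n + 1))


def championship_importance(wins, losses, best_of=7,
                            rounds_before_final=0, p=0.5):
    """Weight of the next game toward the title, from a wins-losses state.

    `rounds_before_final` is 0 for the World Series, 1 for an LCS, and so on;
    each extra round halves the weight (an even-odds assumption).
    """
    needed = best_of // 2 + 1
    after_win = series_odds(needed - wins - 1, needed - losses, p)
    after_loss = series_odds(needed - wins, needed - losses - 1, p)
    return (after_win - after_loss) * 0.5 ** rounds_before_final


if __name__ == "__main__":
    print(championship_importance(0, 0, best_of=5))   # best-of-5 opener
    print(championship_importance(0, 0, best_of=9))   # best-of-9 opener
    print(championship_importance(3, 3, rounds_before_final=1))  # LCS Game 7
```

The best-of-5 opener comes out to the same weight as a 1-1 game in a best-of-7, which is the "spotted a game each" equivalence in code form.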
The next step in using this to evaluate postseason performance
is finding an effective way to look at a player’s performance in a particular
playoff game. The one that strikes me as most naturally applicable is Win
Probability Added, which measures the player’s impact on his team’s chances in
the game at the particular moments in which he bats or pitches. Multiplying a player’s
WPA in each game by the importance of that game toward winning the championship, and
adding up the results, gives Championship Probability Added, which is the measurement
I’m presenting here.
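In code, the bookkeeping amounts to a weighted sum. The sketch below uses made-up numbers purely for illustration (they are not any real player's WPA figures), and the `GameLine` record and its field names are hypothetical.

```python
from dataclasses import dataclass


@dataclass
class GameLine:
    wpa: float                    # player's Win Probability Added in the game
    series_importance: float      # weight of the game within its series
    rounds_before_final: int = 0  # 0 = World Series, 1 = LCS, 2 = LDS


def championship_probability_added(games):
    """Sum of WPA times championship importance over a player's games.

    Each round before the final halves the series weight, which assumes
    even odds in every later series.
    """
    return sum(g.wpa * g.series_importance * 0.5 ** g.rounds_before_final
               for g in games)


# Illustrative values only: a strong World Series Game 7 and a poor LCS Game 6.
example_games = [
    GameLine(wpa=0.25, series_importance=1.0, rounds_before_final=0),
    GameLine(wpa=-0.10, series_importance=0.5, rounds_before_final=1),
]

if __name__ == "__main__":
    # 0.25 * 1.0 plus -0.10 * 0.5 * 0.5
    print(championship_probability_added(example_games))
```

The same +0.25 WPA would count for only a quarter as much in Game 4 of an LCS that stands at 3-0, which is the whole point of the weighting.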
There are issues with WPA that are inherently transferred to
CPA – most notably the fact that it gives events different weights based on
when they occur. If a team gets a 2-run homer in the first inning and wins the
game by one run, that homer is just as important to the final outcome as a
2-run homer in the ninth that also wins the game by one run. And yet, of the
two 2-run homers hit by the Dodgers in Game 1 of the 1988 World Series, Mickey
Hatcher’s first-inning shot gets considerably less publicity (and WPA) than Kirk Gibson’s
walkoff. There are a number of reasons for the second homer's legendary status, Gibson’s injury and the fact
that future Hall of Famer Dennis Eckersley was on the mound for his homer among
them. But a huge factor in the way that game is seen in retrospect is the fact that
Gibson’s homer turned looming defeat into stunning victory, while Hatcher’s simply gave
the Dodgers an early lead.
This sort of timing-based evaluation may not be appropriate
for the regular season – but I think it actually fits the postseason quite
well. The series are so short that evaluating the participating hitters based
on the 20 plate appearances they’ll scratch together gives us no useful
information about how good they are. The narrative of the series, often built
on Gibsonesque moments, tends to endure more than the statistics do – which is
the exact opposite of how the regular season is typically handled. (For example, more people
could quote you Babe Ruth’s regular season home run total than could describe a
particular regular-season homer that he hit; on the other hand, Ruth’s Called
Shot in the 1932 World Series is significantly better-known than his postseason
batting totals.) WPA, since it captures the impact of events as they happen, is
in essence a narrative measure, making it well-suited for a narrative-heavy
enterprise like postseason evaluation.
Having calculated the championship importance of every postseason
game in baseball since the first World Series in 1903 and borrowed the
readily-available WPA numbers for those games from the always-exemplary
Baseball Reference, I am thus able to present a list of the top 10 postseason
hitters ever, ranked by Championship Probability Added:
Hitter | CPA
Mickey Mantle | 0.826
Pete Rose | 0.785
Lance Berkman | 0.750
David Freese | 0.743
Lou Gehrig | 0.668
Hal Smith | 0.655
David Ortiz | 0.637
Reggie Jackson | 0.617
Dwight Evans | 0.584
Yogi Berra | 0.580
The list doesn't appear to have any obvious biases. It features a combination of players from all
eras (the top 3 is composed of one World-Series-only player, one
LCS-and-World-Series player, and one player from the current three-round
format), and players who excelled through numerous appearances (Mantle, Ortiz, Berra)
or in just a few (Freese, Smith, Evans).
Here are the 10 highest-ranked
pitchers:
Pitcher | CPA
Mariano Rivera | 1.870
Rollie Fingers | 1.207
Jack Morris | 1.027
Art Nehf | 0.979
Allie Reynolds | 0.859
Curt Schilling | 0.849
John Smoltz | 0.844
Herb Pennock | 0.841
Bob Gibson | 0.821
Sandy Koufax | 0.818
Note that the pitching scores are substantially higher than
the hitting ones; that’s a common postseason theme, as teams can significantly
increase the usage of their best pitchers in the playoffs, while their best
hitters were already being used about as much as possible. That depresses the
run environment in the playoffs, and I believe the WPA numbers are calibrated to
regular-season scoring averages, which would mean pitchers are being credited
against a higher-scoring baseline than the one they actually face.
These lists will be both extended and discussed in far more
exhaustive detail in the future. For now, though, they’re available as a
starting point for discussion – which is the main purpose of a system like this
anyway.