So, a funny thing happened on the way to developing Game Score Deviations (GSDev). And by funny, I mean sloppy. And that leaves me with some of my own mess to clean up.
Where exactly is the mess? It’s sneakily contained within this excerpt from the post in which I explained the GSDev system: “Why so much focus on the exact difference between the Game Scores for regulars and scrubs? I’ll posit that the difference between regulars and scrubs can also serve as a good choice of deviation to use in a stab at a confidence-based measure. However, we’ve seen that in its raw form, this difference is prone to pretty wild annual variations, so it probably needs to be toned down at least a bit. I’m using a rolling five-year average, with a historically normal 8.5 added as a regression to the mean to mute the extremes a bit more.”
This is still an accurate description of the approach GSDev will be using. But when I first adopted it, I casually chose to give the historical mean the same amount of weight as one year’s measured deviation when calculating the rolling average. Or, to put it differently, I regressed 1/6 of the way toward the mean. This was an arbitrary, poorly-considered choice. Having put some actual work into it now, the correct extent of regression to the mean appears to be 50% rather than 16.7%. As you might guess, implementing this change has some noteworthy effects.
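For the code-minded, the two weightings can be sketched in a few lines of Python. This is a minimal illustration, assuming the regressed deviation is a plain weighted blend of the five-year rolling average and the 8.5 historical mean; the function name and the sample window are hypothetical, not pulled from the actual GSDev data.

```python
HISTORICAL_MEAN = 8.5  # the "historically normal" deviation from the excerpt


def regressed_deviation(yearly_deviations, weight_on_mean):
    """Blend the rolling average of yearly deviations with the historical mean.

    weight_on_mean is how far we regress toward the mean:
    1/6 for the original approach, 1/2 for the corrected one.
    """
    rolling_avg = sum(yearly_deviations) / len(yearly_deviations)
    return (1 - weight_on_mean) * rolling_avg + weight_on_mean * HISTORICAL_MEAN


# Hypothetical five-season window (made-up values, rolling average 11.74).
window = [12.5, 11.0, 12.0, 11.5, 11.7]

old = regressed_deviation(window, 1 / 6)  # equivalent to (sum(window) + 8.5) / 6
new = regressed_deviation(window, 1 / 2)
print(f"old: {old:.2f}, new: {new:.2f}")  # old: 11.20, new: 10.12
```

With a 1/6 weight, the mean counts exactly as much as one of the five seasons; with a 1/2 weight, it counts as much as all five combined.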
How noteworthy are we talking about? For 1903 (the highest-deviation year for which we have all five surrounding seasons of data), the regressed average deviation drops by about 10%, from 11.2 to 10.1 points of Game Score. For 1980 (which has the lowest deviation), it increases by a similar fraction, from 6.7 to 7.4. As far as the GSDev scores for individual pitchers in those seasons are concerned, Christy Mathewson’s league-leading 12.03 GSDev in 1903 becomes 12.71, a notable improvement but still not a contender for top-100 status. Steve Carlton’s 1980, on the other hand, drops from 21.25 GSDev to 19.74. This is a much larger change in raw score (and slightly larger in percentage terms), and also much more significant because Carlton’s 1980 had previously been ranked as the fifth-best starting pitching season of the last twelve decades.
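As a sanity check on the league-level figures, the published numbers can be run backward. The sketch below assumes the old value was a 5/6-to-1/6 blend of the five-year average and the 8.5 mean, recovers the implied five-year average, and then re-regresses it at 50%. The function names are illustrative, and because the inputs are rounded to one decimal, the outputs are approximate.

```python
HISTORICAL_MEAN = 8.5


def implied_rolling_average(old_regressed, old_weight=1 / 6):
    """Invert old = (1 - w) * avg + w * mean to recover the rolling average."""
    return (old_regressed - old_weight * HISTORICAL_MEAN) / (1 - old_weight)


def reregress(old_regressed, new_weight=0.5):
    """Re-apply the regression at the new weight to the implied average."""
    avg = implied_rolling_average(old_regressed)
    return (1 - new_weight) * avg + new_weight * HISTORICAL_MEAN


for year, old_value in [(1903, 11.2), (1980, 6.7)]:
    print(year, round(reregress(old_value), 1))
# 1903 10.1
# 1980 7.4
```

Both results line up with the corrected deviations quoted above, which is consistent with a move from 1/6 to 1/2 regression.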
The correction to the underlying GSDev numbers will require minor updates to a couple of the other posts in this series. But the nature of the change has an outsized effect on the evaluation of the best individual years, and as such, the previously published list of the top 100 seasons merits a complete rewrite. So here we are!
Here is the updated list of the top 10 single-season GSDev scores from 1901-2022:
| Rank | Pitcher | Year | Starts | Adj GS | GSDev |
|---|---|---|---|---|---|
| 1 | Pedro Martinez | 2000 | 29 | 79.1 | 23.61 |
| 2 | Pedro Martinez | 1999 | 31 | 76.2 | 22.47 |
| 3 | Randy Johnson | 2001 | 39 | 71.3 | 21.90 |
| 4 | Sandy Koufax | 1965 | 44 | 69.5 | 21.25 |
| 5 | Bob Gibson | 1968 | 37 | 72.6 | 20.65 |
| 6 | Roger Clemens | 1997 | 34 | 72.9 | 20.55 |
| 7 | Steve Carlton | 1980 | 42 | 65.6 | 19.74 |
| 8 | Curt Schilling | 2001 | 41 | 67.7 | 19.73 |
| 9 | Mike Scott | 1986 | 39 | 68.9 | 19.68 |
| 10 | Greg Maddux | 1995 | 33 | 71.9 | 19.58 |
The changes aren’t overwhelming, but they are notable. The top four seasons are the same, although all of their raw scores went down by moderate amounts. Carlton’s 1980 predictably drops, but only from #5 to #7. Nine of the ten seasons on the list are still the same, with Maddux’s ’95 jumping in to replace Ron Guidry’s 1978.
If the biggest single drop in seasonal GSDev (which Carlton’s 1980 in fact represents) only costs the pitcher two spots in the overall rankings, the rest of the changes can’t be that big a deal, right? Sadly, this is not the case. The issue is that the scores at the top end of the curve have significant gaps between them. As we go lower, they’ll group much more closely together. To demonstrate, let’s look at seasons 11 through 30:
| Rank | Pitcher | Year | Starts | Adj GS | GSDev |
|---|---|---|---|---|---|
| 11 | Randy Johnson | 1999 | 36 | 69.2 | 19.35 |
| 12 | Dwight Gooden | 1985 | 35 | 69.7 | 19.31 |
| 13 | Grover Alexander | 1915 | 44 | 71.6 | 19.29 |
| 14 | Ron Guidry | 1978 | 37 | 68.3 | 19.25 |
| 15 | Pedro Martinez | 1997 | 31 | 72.2 | 19.20 |
| 16 | Walter Johnson | 1913 | 36 | 74.9 | 19.05 |
| 17 | Dazzy Vance | 1924 | 34 | 71.8 | 18.84 |
| 18 | Steve Carlton | 1972 | 41 | 69.2 | 18.82 |
| 19 | Randy Johnson | 1995 | 33 | 70.7 | 18.75 |
| 20 | Randy Johnson | 2004 | 35 | 68.5 | 18.60 |
| 21 | Lefty Grove | 1931 | 33 | 72.1 | 18.56 |
| 22 | John Smoltz | 1996 | 40 | 67.7 | 18.50 |
| 23 | Sandy Koufax | 1963 | 42 | 67.7 | 18.33 |
| 24 | Hal Newhouser | 1946 | 34 | 70.1 | 18.32 |
| 25 | Bob Feller | 1940 | 37 | 69.4 | 18.30 |
| 26 | Sandy Koufax | 1966 | 42 | 66.0 | 18.14 |
| 27 | Walter Johnson | 1912 | 37 | 74.0 | 18.04 |
| 28 | Ed Walsh | 1910 | 36 | 74.3 | 17.92 |
| 29 | Gerrit Cole | 2019 | 38 | 67.3 | 17.85 |
| 30 | Kevin Brown | 1998 | 40 | 66.4 | 17.82 |
See what I mean? If Randy Johnson’s 1999 had dropped by the same 1.51 points that were lost by 1980 Carlton, it would’ve plummeted from #11 on the list to #29.
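That hypothetical is easy to check by hand, but here is a small Python sketch of the same arithmetic. The scores are copied from the two tables above; the helper name is made up for illustration. Only the top 30 are needed, because the hypothetical score still lands above the #30 season.

```python
# GSDev scores for ranks 1-30, copied from the two tables above.
top_30 = [
    23.61, 22.47, 21.90, 21.25, 20.65, 20.55, 19.74, 19.73, 19.68, 19.58,
    19.35, 19.31, 19.29, 19.25, 19.20, 19.05, 18.84, 18.82, 18.75, 18.60,
    18.56, 18.50, 18.33, 18.32, 18.30, 18.14, 18.04, 17.92, 17.85, 17.82,
]


def rank_after_drop(scores, current_score, drop):
    """Rank (1 = best) a season would hold if its score fell by `drop`."""
    others = list(scores)
    others.remove(current_score)  # take the season itself out of the field
    new_score = current_score - drop
    return 1 + sum(1 for s in others if s > new_score)


# Randy Johnson's 1999 (19.35) losing the same 1.51 points as 1980 Carlton.
print(rank_after_drop(top_30, 19.35, 1.51))  # 29
```

The hypothetical 17.84 slots in just under Gerrit Cole’s 17.85 and just over Kevin Brown’s 17.82, hence #29.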
Regarding the actual changes in this group: Guidry drops from #9 to #14; on the other end, three seasons (numbers 27-29) join this group from below, with Walter Johnson and Ed Walsh climbing ten spots apiece to do so. Walter’s other year in this group also jumped by half a dozen positions; given how much time I’ve spent talking about how high the deviations were in the deadball era, seeing that era’s best pitcher gain ground when those deviations are moderated should not be a surprise.
Next group, 31-60:
| Rank | Pitcher | Year | Starts | Adj GS | GSDev |
|---|---|---|---|---|---|
| 31 | Bob Gibson | 1969 | 35 | 69.1 | 17.79 |
| 32 | Curt Schilling | 2002 | 36 | 67.0 | 17.74 |
| 33 | Randy Johnson | 2000 | 35 | 67.1 | 17.64 |
| 34 | Bob Feller | 1946 | 42 | 66.5 | 17.63 |
| 35 | Greg Maddux | 1994 | 25 | 73.5 | 17.56 |
| 36 | Tom Seaver | 1973 | 40 | 66.9 | 17.48 |
| 37 | Randy Johnson | 2002 | 36 | 66.6 | 17.46 |
| 38 | Lefty Gomez | 1937 | 36 | 68.6 | 17.40 |
| 39 | Carl Hubbell | 1936 | 36 | 69.1 | 17.40 |
| 40 | Johan Santana | 2004 | 36 | 66.2 | 17.30 |
| 41 | Clayton Kershaw | 2015 | 35 | 66.6 | 17.29 |
| 42 | Tom Seaver | 1971 | 35 | 68.9 | 17.28 |
| 43 | Hal Newhouser | 1945 | 39 | 67.2 | 17.12 |
| 44 | Roger Clemens | 1986 | 38 | 65.6 | 17.04 |
| 45 | Greg Maddux | 1997 | 36 | 66.8 | 16.99 |
| 46 | Juan Marichal | 1966 | 36 | 66.2 | 16.94 |
| 47 | Zack Greinke | 2009 | 33 | 66.6 | 16.93 |
| 48 | Mike Norris | 1980 | 33 | 65.0 | 16.92 |
| 49 | Roger Clemens | 1987 | 36 | 65.3 | 16.86 |
| 50 | Jake Arrieta | 2015 | 36 | 65.6 | 16.84 |
| 51 | Dazzy Vance | 1928 | 32 | 69.8 | 16.83 |
| 52 | Roger Clemens | 1998 | 33 | 67.4 | 16.79 |
| 53 | JR Richard | 1979 | 38 | 63.7 | 16.76 |
| 54 | Smoky Joe Wood | 1912 | 41 | 70.6 | 16.69 |
| 55 | Bob Feller | 1939 | 35 | 67.0 | 16.68 |
| 56 | Roger Clemens | 1991 | 35 | 66.3 | 16.67 |
| 57 | Juan Marichal | 1965 | 37 | 65.7 | 16.66 |
| 58 | Clayton Kershaw | 2013 | 37 | 64.5 | 16.61 |
| 59 | Jacob deGrom | 2018 | 32 | 67.4 | 16.59 |
| 60 | Vida Blue | 1971 | 40 | 66.2 | 16.58 |
Here we see a couple of very large drops from the initial list: Mike Norris 1980 from 26 to 48, JR Richard ’79 from 30 to 53. On the other side of things, the two Seaver years listed above had significant jumps, 8 and 11 positions respectively.
Also, as a further demonstration of the rather jumpy nature of the deviation averages (and of how tightly packed the ordinal rankings are), the correction drops Randy Johnson’s 2000 season by five positions, but moves his 2002 season up by four spots on the list. The changes in score aren’t enormous (0.16 and 0.13 GSDev, respectively), but it’s still quite the shift in league deviation for only two years having passed. (On an individual level, this also gives the Big Unit six seasons in the overall top 40, which is twice as many as any other pitcher.)
Closing out the list, here are seasons 61-100:
| Rank | Pitcher | Year | Starts | Adj GS | GSDev |
|---|---|---|---|---|---|
| 61 | Walter Johnson | 1915 | 39 | 69.3 | 16.53 |
| 62 | Corey Kluber | 2017 | 31 | 66.7 | 16.51 |
| 63 | Justin Verlander | 2011 | 38 | 64.6 | 16.49 |
| 64 | Tim Lincecum | 2009 | 32 | 66.2 | 16.40 |
| 65 | Roger Clemens | 1988 | 36 | 65.6 | 16.39 |
| 66 | Bret Saberhagen | 1989 | 35 | 65.0 | 16.31 |
| 67 | Greg Maddux | 1998 | 36 | 65.5 | 16.25 |
| 68 | Walter Johnson | 1918 | 29 | 73.4 | 16.25 |
| 69 | Ed Walsh | 1908 | 49 | 68.1 | 16.21 |
| 70 | Jason Schmidt | 2003 | 30 | 67.1 | 16.18 |
| 71 | Gaylord Perry | 1972 | 40 | 65.8 | 16.17 |
| 72 | Dolf Luque | 1923 | 37 | 67.4 | 16.15 |
| 73 | Justin Verlander | 2019 | 40 | 64.3 | 16.14 |
| 74 | Johan Santana | 2006 | 35 | 64.9 | 16.14 |
| 75 | Walter Johnson | 1910 | 42 | 69.7 | 16.13 |
| 76 | Christy Mathewson | 1905 | 40 | 71.3 | 16.12 |
| 77 | Jim Palmer | 1975 | 38 | 65.0 | 16.08 |
| 78 | Robin Roberts | 1953 | 41 | 63.9 | 16.06 |
| 79 | Bob Gibson | 1970 | 34 | 67.1 | 16.05 |
| 80 | Josh Beckett | 2007 | 34 | 64.9 | 16.04 |
| 81 | Roy Halladay | 2011 | 34 | 65.3 | 16.04 |
| 82 | Tom Seaver | 1977 | 33 | 65.1 | 15.99 |
| 83 | Lefty Grove | 1936 | 30 | 69.3 | 15.99 |
| 84 | Dizzy Dean | 1934 | 36 | 67.2 | 15.97 |
| 85 | Justin Verlander | 2012 | 37 | 63.8 | 15.96 |
| 86 | Greg Maddux | 1996 | 40 | 64.1 | 15.93 |
| 87 | Zack Greinke | 2015 | 34 | 65.0 | 15.92 |
| 88 | Don Drysdale | 1964 | 40 | 64.1 | 15.91 |
| 89 | Mordecai Brown | 1909 | 34 | 71.4 | 15.91 |
| 90 | Roger Clemens | 1990 | 33 | 65.5 | 15.89 |
| 91 | Kevin Appier | 1993 | 34 | 65.6 | 15.89 |
| 92 | Walter Johnson | 1914 | 40 | 68.9 | 15.86 |
| 93 | Lefty Grove | 1930 | 34 | 67.9 | 15.85 |
| 94 | Clayton Kershaw | 2014 | 29 | 67.4 | 15.85 |
| 95 | Bert Blyleven | 1973 | 40 | 64.6 | 15.84 |
| 96 | Clayton Kershaw | 2016 | 25 | 68.4 | 15.84 |
| 97 | John Tudor | 1985 | 41 | 62.9 | 15.83 |
| 98 | Red Faber | 1921 | 39 | 67.0 | 15.82 |
| 99 | Luis Tiant | 1968 | 32 | 67.5 | 15.80 |
| 100 | Greg Maddux | 1992 | 35 | 64.8 | 15.74 |
So, the obvious question here is, how much does the composition of the top 100 change? From the initial list, we lose a grand total of five seasons: Mario Soto 1982 (which drops from #70 to #109), Steve Rogers 1982 (84 to 122), Roy Halladay 2010 (94 to 105), Bucky Walters 1939 (97 to 111), and Nolan Ryan 1977 (100 to 124). The beneficiaries of these exits are two Walter Johnson seasons (1910 jumps from #101 to #75, 1914 from 122 to 92), Bob Gibson 1970 (103 to 79), Bert Blyleven 1973 (109 to 95), and Red Faber 1921 (111 to 98). Notably, if you remember the counts from the initially posted list, Walter Johnson joins Greg Maddux and Randy Johnson with six top-100 seasons, trailing only Roger Clemens’s seven. Gibson and Halladay were the only other pitchers in either group who had multiple seasons on the old list; Gibson now has three to Halladay’s one.
Finally, here is the composition of the list by decade, compared once again to bWAR’s top 100 seasons over the same period and with changes from the prior list included:
| Decade | bWAR | GSDev | Change |
|---|---|---|---|
| 1901-09 | 18 | 3 | 0 |
| 1910-19 | 22 | 9 | +2 |
| 1920-29 | 6 | 4 | +1 |
| 1930-39 | 9 | 7 | -1 |
| 1940-49 | 7 | 4 | 0 |
| 1950-59 | 1 | 1 | 0 |
| 1960-69 | 7 | 9 | 0 |
| 1970-79 | 12 | 11 | +1 |
| 1980-89 | 5 | 9 | -2 |
| 1990-99 | 7 | 17 | 0 |
| 2000-09 | 4 | 13 | 0 |
| 2010-19 | 2 | 13 | -1 |
That’s a small shift in the direction of how bWAR sees things (a couple more deadball seasons, a pair of early ’70s years, dropping one season from the 2010s and two from the ’80s). Overall, though, GSDev still has a much stronger affinity for modern pitchers than does bWAR.
Ultimately, I get two reminders from the results of this correction. First, ordinal rankings often get very tightly packed very quickly, especially when the base population is very large. There are over 1200 ace-level seasons (10-plus GSDev) in our sample to date; if you have that many options, you run into things like a 4% drop in score (Mario Soto’s 1982 fell from 16.18 GSDev to 15.58) knocking you from #70 to out of the top 100. Soto still had a great season (as did Red Faber in ’21, on the other end of things), and the differences between them are marginal at best regardless of which one appears on this list.
And second, think your research through carefully before you start writing up blog posts. It’s helpful in egg-proofing your face. Since I didn’t do that this time around, the top 100 careers list may be delayed a bit further as I both update the numbers and rewrite my initial drafts to reflect the updated rankings. But next time (whenever that proves to be), career numbers are coming.