Putting together a top prospect list is, I imagine, no small endeavor. Much like in any other ranking of athletes, there are innumerable factors to consider. How well does the player hit, run, and field? How is his strike zone discipline, or power, or contact ability? Is there a particular type of pitch or pitcher that he struggles with? If he’s a pitcher, is he durable enough to stick in the rotation? How are his secondary pitches? His control and command? How old is the player, and what level has he reached in the minors, and how has his production changed as he’s climbed?
Once you consider all of those factors and more for a thousand or so
players and combine them into a single list, a healthy majority of fans will care
about exactly one thing: Where did you rank my team’s guy(s)?
So, for today, we’re going to consider that factor only.
Absent any other information about the player, what does the raw ranking mean?
Last time, we looked at the list of the 23 #1 overall
prospects from 1990-2012, as chosen by Baseball America. They ran the gamut
from superstar to bust, but as a group they averaged 36.9 WAR. How
does that compare to other ranks? Here are the 10 ranking positions with the
highest average WAR over our sample:
Rank | Avg WAR
1 | 36.9
3 | 27.4
7 | 27.2
2 | 27.0
14 | 24.0
10 | 23.5
13 | 22.8
4 | 22.4
12 | 20.0
11 | 19.5
So #1 is ahead by a lot, and all of the highest-scoring ranking positions are in the top 15. But still, based on their performance in the sample, it sure looks like you’d rather be ranked
10-14 than 5-9, which is rather counterintuitive. The rankings get even stranger as you go further down. #42 prospects
have an average of 17.7 WAR, coming in solidly ahead of both #5 and #8. (#42 prospects
in the sample include Albert Pujols, Larry Walker, Nolan Arenado, and Adam
Wainwright – not too shabby!)
Well, that’s the average, a measure prone to being inflated
by outliers. How about the median?
Rank | Median WAR
1 | 35.7
3 | 20.2
14 | 19.9
7 | 19.3
2 | 18.9
12 | 18.3
11 | 17.3
10 | 17.1
5 | 15.2
13 | 14.5
Yes, #14 prospects actually moved UP two spots via the more
stable measure. The 14s are comparatively light on major stars for this
neighborhood (“only” Carlos Beltran, Zack Greinke, and Manny Machado), but include a
remarkable number of steady players. And while the top-heavy #42 prospects now
fall behind #8, #41 jumps ahead of both of them.
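To see why the average and the median can tell such different stories for a single ranking slot, here is a minimal Python sketch. The career totals in it are made up for illustration (they are not drawn from the Baseball America sample); the point is only that one superstar can drag the average far above what a typical player in the group produced.

```python
from statistics import mean, median

# Hypothetical career WAR totals for one ranking slot (invented numbers,
# not from the actual sample): mostly modest careers plus one superstar.
careers = [0.0, 1.5, 3.0, 4.0, 6.5, 90.0]

avg = mean(careers)    # pulled upward by the single 90-WAR career
med = median(careers)  # the middle of the group, unaffected by the outlier

print(f"average WAR: {avg:.1f}")
print(f"median WAR:  {med:.1f}")
```

A slot like #42, with a handful of stars and a lot of short careers, behaves much like this toy list: a strong average sitting on top of a much weaker median.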
Ultimately, no matter what statistical tool you use, we’re
still looking at a sample size of 23; some amount of noise is inevitable. Even attempting an amateurish smoothing function yielded only moderate
success; it still looks like you’d rather have #14 than #5, and #30 over #20. So
what are we to do? Well, if sample size is the concern… increase the sample
size.
If you read through my recent series on ranking the top 100
players at each position, you’ll hopefully remember that as you move further
down such a list, the gaps between players become increasingly narrow. The gap
from #10 to #20 may be noteworthy; the gap from #80 to #90 is barely there. If
we assume the same should be true of prospect rankings over a theoretical large
sample, it allows us to group nearby prospects together, and increase the size of the groups as we move down the rankings. So let’s try that and
see how things look:
Group | Avg WAR | Med WAR
1 | 36.9 | 35.7
2-5 | 23.4 | 16.7
6-10 | 20.6 | 12.9
11-15 | 20.7 | 16.5
16-25 | 11.8 | 5.7
26-50 | 11.3 | 4.9
51-75 | 8.2 | 1.6
76-100 | 7.0 | 0.8
That’s more like it! Numbers descending (fairly) steadily
from group to group, with comparatively little unexpected bouncing around. The
medians can be read pretty straightforwardly as follows: #1, likely star;
#2-15, likely solid player; #16-50, likely mediocre player; #51-100, likely
inconsequential.
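For anyone who wants to try the same grouping on their own data, the bucketing could be sketched roughly like this in Python. The group boundaries are the ones from the table above; the function names (`group_label`, `summarize`) and the sample records are hypothetical, invented purely for illustration.

```python
from statistics import mean, median

# Bucket boundaries copied from the table above (inclusive on both ends).
GROUPS = [(1, 1), (2, 5), (6, 10), (11, 15),
          (16, 25), (26, 50), (51, 75), (76, 100)]

def group_label(rank):
    """Return the label of the bucket containing a 1-100 ranking, e.g. '26-50'."""
    for low, high in GROUPS:
        if low <= rank <= high:
            return str(low) if low == high else f"{low}-{high}"
    raise ValueError(f"rank {rank} is outside 1-100")

def summarize(records):
    """Map each bucket label to (average WAR, median WAR).

    `records` is an iterable of (rank, career_war) pairs.
    """
    by_group = {}
    for rank, war in records:
        by_group.setdefault(group_label(rank), []).append(war)
    return {label: (mean(wars), median(wars)) for label, wars in by_group.items()}

# Invented (rank, career WAR) pairs, for illustration only.
sample = [(1, 40.0), (3, 10.0), (5, 30.0), (42, 60.0), (30, 2.0), (48, 1.0)]
summary = summarize(sample)

for label, (avg, med) in summary.items():
    print(f"{label:>6}: avg {avg:.1f}, median {med:.1f}")
```

Pooling neighboring ranks this way trades resolution for sample size: the 26-50 group draws on 25 ranking slots per year instead of one.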
We could, in theory, stop there – but the median isn’t the
whole story. As noted above, #42 alone produced four major stars in 23 years;
clearly not everyone ranked between 26 and 50 will post between 5 and 12 WAR before
moving on with their lives. So let’s break it down further by percentiles to
get a fuller sense of the odds.
Percentile | #1 | #2-5 | #6-10 | #11-15 | #16-25 | #26-50 | #51-75 | #76-100
90 | 76.3 | 58.5 | 57.2 | 50.8 | 31.8 | 34.5 | 27.0 | 21.9
80 | 56.7 | 39.3 | 37.9 | 36.8 | 20.3 | 19.9 | 15.9 | 12.5
70 | 51.1 | 32.3 | 26.8 | 28.2 | 15.1 | 12.6 | 9.2 | 7.3
60 | 47.2 | 20.7 | 17.7 | 21.5 | 9.7 | 8.4 | 4.8 | 2.9
50 | 35.7 | 16.7 | 12.9 | 16.5 | 5.7 | 4.9 | 1.6 | 0.8
40 | 23.5 | 10.6 | 8.6 | 8.7 | 2.6 | 1.8 | 0.2 | 0.0
30 | 14.7 | 7.4 | 5.6 | 6.5 | 0.2 | 0.1 | 0.0 | 0.0
20 | 9.0 | 3.2 | 2.7 | 1.1 | -0.1 | -0.2 | -0.3 | -0.5
10 | 1.3 | 0.1 | 0.0 | 0.0 | -1.2 | -0.9 | -1.1 | -1.1
I like this table quite a lot, frankly. The groups break
down remarkably cleanly: #1 (likely star), #2-15 (likely good player,
reasonable hope of stardom), #16-50 (likely usable, reasonable hope of good
player), #51-100 (likely barely a major leaguer, reasonable hope of usable).
Even with all of the caveats dealing with the age of the sample, the use of
only one source for the rankings, and the vagaries of bucketing as a technique
(#16 should have a lot more in common with #15 than with #50), I think this is
a usable guide for a fan trying to figure out how much confidence they should
have in their team’s shiny new hope, whether BA says he's #8, #28, or #98.
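A percentile breakdown like the one in the table above could be computed along these lines. This is only a sketch: the helper name `decile_table` and the input numbers are hypothetical, and the interpolation method is an assumption, since different percentile conventions give slightly different values on small groups.

```python
from statistics import quantiles

def decile_table(wars):
    """Return {percentile: WAR value} for the 10th through 90th percentiles."""
    # n=10 yields the nine cut points between deciles. The "inclusive" method
    # (an assumed convention) treats the data as the full population and
    # interpolates linearly between neighboring values.
    cuts = quantiles(wars, n=10, method="inclusive")
    return dict(zip(range(10, 100, 10), cuts))

# Invented career WAR totals for one group: 0.0, 1.0, ..., 10.0.
group_wars = [float(x) for x in range(0, 11)]
table = decile_table(group_wars)

# Print from the 90th percentile down, matching the layout of the table above.
for pct in sorted(table, reverse=True):
    print(f"{pct}th percentile: {table[pct]:.1f} WAR")
```

Running the same function once per ranking group and lining the results up side by side gives one column per group.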
But of course, we’re not going to stop there. Up next, we’ll
take a crack at the classic question: Is there such a thing as a
pitching prospect?