Sunday, January 26, 2014

Ranking tennis players: Method (part 1)

My last post concluded with the statement that I would be presenting a tennis ranking system based on the answers to two questions: How did you play, and who did you play? Before building such a system, however, it is first necessary to decide on the meanings of the questions themselves. I’ll tackle the queries in order.

First, how did you play? This inquiry could be interpreted in a variety of ways. Did you win the match? How many sets did you win and lose? How many service games? How often did you hold serve, or break serve? How many points did you win on your own serve, or your opponent’s?

My general inclination is to answer a question with as much information as possible. So, rather than going with simply “yes, the player won his match,” or “the player won his match 2 sets to 1,” it is more useful to say, “the player won his match 6-2, 6-7, 6-1.”

The most granular information that one could expect to find readily available would be points won and lost, preferably split up between serving and returning. And indeed, this type of information is accessible on the ATP’s website (which is generally very good, and is the main source I'm using for my tennis work) for numerous matches, and I have collected the available point data for the last five years’ worth of Grand Slams and ATP World Tour events. But the method I’m presenting here will be based on the more basic option of service games won and lost rather than serve and return points, for three reasons.

First, the data are more reliable. In examining the point-based statistics for each match, I have noticed cases in which the results as described on the ATP website are impossible – for instance, a player who won his match in straight sets despite having broken serve fewer times than his opponent. The simpler “6-2, 6-7, 6-1” form of match evaluation is tracked on the scoreboard throughout the match rather than requiring specifically dedicated point-by-point record keeping, and thus is more likely to be correct.
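The sort of sanity check described above can be sketched in a few lines of Python. This is only an illustration, not the procedure actually used for this project: the function name and the input layout (a list of per-set game counts plus each player's total breaks of serve) are assumptions made for the example. The logic rests on the fact that the winner of a set always breaks serve at least as often in that set as the loser does, so a straight-sets winner cannot finish with fewer breaks.

```python
def flags_impossible_breaks(set_scores, winner_breaks, loser_breaks):
    """Return True if a straight-sets win is reported with fewer breaks for the winner.

    set_scores: list of (winner_games, loser_games) tuples, one per set
                (hypothetical layout chosen for this sketch).
    winner_breaks, loser_breaks: total breaks of serve credited to each player.
    """
    # The match was won in straight sets if the winner took every set.
    straight_sets = all(won > lost for won, lost in set_scores)
    # Only the straight-sets case is checked; other results are never flagged.
    return straight_sets and winner_breaks < loser_breaks
```

The check is deliberately narrow. It says nothing about matches that went the distance, where a player can legitimately win with fewer breaks (for example, by losing one set badly and taking two tiebreaks).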

Second, the simpler data are available further back in history and for a wider selection of events. Even as recently as 2009, there are a few matches for which point-based information is not present on the ATP’s site. As you go back further (which I intend to continue doing over time), the more granular information disappears entirely. A service game-based evaluation method should be equally applicable to 2013 and 1983, thus (eventually) providing a means of comparison between current stars and past greats. Also, the ATP website does not include point-based information on matches from the Davis Cup, nor does the Davis Cup website offer such information in readily usable form. Since many top players participate in the Davis Cup on an annual basis, using the simpler description of the match allows the inclusion of an important part of their performance.

Third, the data are frankly much quicker and easier to gather. As mentioned, I have entered point data for the ATP World Tour matches from the last five years. I have also entered data from the lower-level Challenger Tour from those five years, allowing me to better evaluate the second-tier players on the circuit. There are about 150 Challenger events played every year; entering only the game-based data for a 32-man event takes roughly 5-7 minutes, while entering point-based data takes between 20 minutes and half an hour. Multiply that time difference (somewhere between 13 and 25 minutes per event) by 150, and you get roughly 30 to 60 hours per season; hopefully my motivation for entering the Challenger matches in games-only form becomes clear.

So my primary ranking method will take a player who wins a match 6-2, 6-7, 6-1 and evaluate his performance by saying, “He won 18 service games and lost 10.”
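For anyone who wants to see that conversion spelled out, here is a minimal Python sketch. It assumes the score is written from the player's point of view in the comma-separated form used in this post, counts a tiebreak as one game, and ignores any tiebreak annotation such as "(5)". The function name is made up for the example; my own data entry lives in spreadsheets, not in this code.

```python
import re


def games_won_lost(score):
    """Return (games_won, games_lost) for a score such as '6-2, 6-7, 6-1'."""
    won = lost = 0
    for set_score in score.split(","):
        # Read the two game counts at the start of the set score;
        # anything after them (e.g. a tiebreak annotation) is ignored.
        match = re.match(r"\s*(\d+)-(\d+)", set_score)
        if match is None:
            raise ValueError(f"unrecognized set score: {set_score!r}")
        won += int(match.group(1))
        lost += int(match.group(2))
    return won, lost


print(games_won_lost("6-2, 6-7, 6-1"))  # (18, 10)
```

Counting the tiebreak as a single game is what makes the lost second set show up as 6-7, so that it contributes 6 games won and 7 lost.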

On to the second question: who did you play? This question seems simple enough to answer; just look at the person on the other side of the net. But there is a significant issue with that, because the abilities of the opposing player are often heavily dependent on the surface under his feet. Beating Andy Murray on clay (a surface on which he’s never made it to a final in any tournament) is not necessarily easy, but it is relatively manageable; beating him on grass (on which he’s won five titles, including a Grand Slam and an Olympic gold medal) is another matter entirely.

At the moment, I do not intend to adjust for the surface on which matches are played. There would be significant sample size concerns with any such adjustment, particularly on less common surfaces such as grass, or the now-rarely-seen carpet. A large number of players will have only one match in a given year on grass (the first round at Wimbledon), making it impossible to assess their ability level on that surface with any accuracy whatsoever. (In a particularly vexing case, Rafael Nadal lost his first-round match at Wimbledon in 2013 to Steve Darcis, and Darcis promptly withdrew from the event before his next match. That leaves us with absolutely no way to compare either player to anyone but each other on grass without stretching our frame of reference across multiple years, which leads to any number of other issues.) The difficulty of building a surface adjustment into the rankings would also be significantly more than trivial. And the advantages would be moderate at best - while it would provide some benefit to correct for the difference between Murray's performance on grass and clay, those differences are likely to be quite small when factored into the draw adjustment for a player with 40-plus matches played in a year, and the odds of having a draw that's made significantly more favorable than it appears by opponent-surface confluence seem relatively remote, especially because players tend to spend as much time as possible on the surfaces on which they perform best.

Similarly, I will not be compensating for other factors that may influence the quality of the opponent's play, such as injuries; I have neither the time nor the informational resources necessary to make such adjustments in anything approaching systematic fashion.

There is also the question of which matches will be considered. The invaluable Results Archive section of the ATP website includes the main draws from every Grand Slam, World Tour, Challenger Tour, and ITF Futures Tour event of the last decade-plus (quite a bit farther back than that for the higher tours). If you examine the playing activity logs for individual players, you can also find their performance in the Davis Cup and qualifying matches throughout their careers, and the Davis Cup website also has the results for its own event available for numerous years in the past.

These rankings will include performance in the main draws of Grand Slams, ATP World Tour events, and ATP Challenger Tour events, along with the highest-level bracket of the Davis Cup (the World Group). They will not include the Futures Tour, due to the prodigious amount of time that would be required to record the hundreds of Futures events played every year, the massive increase in the player list that would severely overtax my already-strained spreadsheets, and the generally low ranking of the players who participate in those events. They also will not include qualifying matches for any event, because past results from these matches are not available anywhere that I am aware of except in the performance logs for individual players, and I am not inclined to take the time necessary to scour those logs for the requisite information.

So in looking at, say, a 6-2, 6-7, 6-1 win over Denis Istomin (the current #1 Uzbek player) on a clay court, the system that will be presented here would say, “The player under consideration won 18 games and lost 10 while facing Denis Istomin on a court that is exactly like any other tennis court. Unless the match occurred in qualifying, in which case I have no idea what you're talking about.” The limitations that will result are relatively obvious, and should be kept in mind moving forward.

Next on the agenda will be the process of taking the results from a player’s matches over the course of a full season and turning them into a cohesive and (hopefully) sensible rating.
