December 15, 2009 weblog
New Algorithm Ranks Sports Teams like Google's PageRank
"A few years ago, I got interested in the technology behind Google's search engine," Feng told PhysOrg.com. "After looking into the PageRank algorithm, I realized it had the same origin in random processes that I use in my research every day... I thought suitable modifications of the algorithm should apply to ranking sports teams."
As Feng explains, the Power Rank algorithm requires only two parameters: the game score and the home field advantage (in the NFL, the home field advantage is considered to be three points). "There is no human bias, no memory of last season, and no style points," he said.
The idea is that this neutral ranking method can provide more accurate results than other methods, such as the simple win-loss record. For example, the Power Rank method ranks New England in third place in the NFL (behind New Orleans and Indianapolis); while these top two teams both have 12-0 records, New England's record is 7-5. In contrast, Jacksonville, which also has a 7-5 record, is ranked 20th of 32 teams.
The Power Rank algorithm determines these rankings by treating the league as a network, in which the nodes are the teams and the links between nodes are the games played between those teams. When two teams play each other, a number is assigned to the link (arrow) between them, based on the score and location of the game. The number is an estimate of the probability that the team at the head of the arrow would beat the team at the tail in a future game at a neutral site. A larger margin of victory implies a higher number on the arrow from loser to winner. (There are two arrows between each pair of teams, pointing in opposite directions, and the numbers on the two arrows add up to 1.00, or 100%.)
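The article does not give the formula that turns a score into these link numbers. As a rough illustration only, the sketch below assumes a logistic curve applied to the home team's margin after subtracting the home field advantage. The function name edge_weights and the SCALE constant are hypothetical; only the three-point NFL home field advantage comes from the text.

```python
import math

HOME_ADVANTAGE = 3.0  # points; the NFL figure quoted in the article
SCALE = 7.0           # hypothetical: controls how quickly a margin saturates


def edge_weights(home_score, away_score, home_adv=HOME_ADVANTAGE, scale=SCALE):
    """Return (p_home, p_away), two numbers that add up to 1.

    p_home is the number placed on the arrow pointing at the home team,
    p_away the number on the arrow pointing at the away team. The logistic
    mapping is an assumption made for illustration, not the Power Rank formula.
    """
    # Remove the home field advantage to estimate a neutral-site margin.
    neutral_margin = (home_score - away_score) - home_adv
    p_home = 1.0 / (1.0 + math.exp(-neutral_margin / scale))
    return p_home, 1.0 - p_home


if __name__ == "__main__":
    # A 34-point home win puts a larger number on the arrow to the winner
    # than a 3-point home win does.
    print(edge_weights(55, 21))
    print(edge_weights(23, 20))
```

Under these assumptions, a home win by exactly the home field advantage counts as an even result, so each arrow gets 0.5, and the two numbers for any game always sum to 1, as the description requires.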
Then, based on all the games in the league, each team receives a value. The teams are ranked in order of their values, similar to how Google's PageRank algorithm ranks webpages. For example, in PageRank, a website becomes highly ranked when other highly ranked websites link to it; similarly, teams earn their rank by beating other highly ranked teams. The average value of the league is zero, so that teams with a positive (negative) value are better (worse) than the average team.
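To make the PageRank analogy concrete, here is a minimal sketch of a PageRank-style power iteration on a weighted team network. It is not Feng's published method: the function rank_values, the damping factor of 0.85 (the value commonly quoted for PageRank), and the choice to center the scores by subtracting the league average are all assumptions made for illustration, as are the toy link numbers.

```python
def rank_values(weights, damping=0.85, iterations=200):
    """PageRank-style scores for teams, shifted so the league average is zero.

    weights[a][b] is the number on the arrow from team a to team b, that is,
    the estimated chance that b beats a. Score flows along arrows toward the
    likely winner, so teams that beat strong teams accumulate more of it.
    """
    teams = sorted(weights)
    n = len(teams)
    score = {team: 1.0 / n for team in teams}
    for _ in range(iterations):
        new = {team: (1.0 - damping) / n for team in teams}
        for src in teams:
            out_total = sum(weights[src].values())
            if out_total == 0:
                # A team with no outgoing arrows spreads its score evenly.
                for dst in teams:
                    new[dst] += damping * score[src] / n
                continue
            for dst, w in weights[src].items():
                new[dst] += damping * score[src] * w / out_total
        score = new
    mean = 1.0 / n  # scores sum to 1, so this is the league average
    return {team: score[team] - mean for team in teams}


if __name__ == "__main__":
    # Hypothetical link numbers: A is a heavy favorite over B and C,
    # while B and C are evenly matched.
    toy = {
        "A": {"B": 0.1, "C": 0.1},
        "B": {"A": 0.9, "C": 0.5},
        "C": {"A": 0.9, "B": 0.5},
    }
    for team, value in sorted(rank_values(toy).items(), key=lambda kv: -kv[1]):
        print(f"{team}: {value:+.3f}")
```

In this toy league, A ends up with a positive value and B and C share the same negative value, which matches the idea that values above zero mark teams better than the league average.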
On the Power Rank website, Feng gives some examples to demonstrate how the algorithm works. In the NFL, suppose that San Francisco beats Chicago 55-21, Chicago beats New York 23-20, and New York beats San Francisco 35-32, with each team winning at home. If an algorithm only considered wins and losses, the three teams would be ranked equally since they each have one win.
However, the score of each game gives additional information, which the Power Rank algorithm uses. The results of the algorithm are shown in the figure above. With this basic information, the Power Rank algorithm has attempted to sort out the contradiction that San Francisco beat Chicago but also lost to New York, whom Chicago beat. Of course, each team's value is based on only two games, so the values are probably not very accurate until more games are played.
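The three-game example can be checked with a few lines of code. The sketch below counts wins, which leaves the teams tied, and then sums each team's point margins as one very simple way to see the extra information carried by the scores. The net margin is only an illustration and is not the Power Rank value; the function names are hypothetical.

```python
# (home team, home score, away team, away score), from the example in the text
GAMES = [
    ("San Francisco", 55, "Chicago", 21),
    ("Chicago", 23, "New York", 20),
    ("New York", 35, "San Francisco", 32),
]


def win_counts(games):
    """Count wins per team, ignoring the scores themselves."""
    wins = {}
    for home, home_pts, away, away_pts in games:
        wins.setdefault(home, 0)
        wins.setdefault(away, 0)
        if home_pts > away_pts:
            wins[home] += 1
        elif away_pts > home_pts:
            wins[away] += 1
    return wins


def net_margins(games):
    """Sum of points scored minus points allowed for each team."""
    margins = {}
    for home, home_pts, away, away_pts in games:
        margins[home] = margins.get(home, 0) + (home_pts - away_pts)
        margins[away] = margins.get(away, 0) + (away_pts - home_pts)
    return margins


if __name__ == "__main__":
    print(win_counts(GAMES))   # every team has exactly 1 win
    print(net_margins(GAMES))  # San Francisco +31, Chicago -31, New York 0
```

Win counts alone cannot separate the teams, while even this crude margin total puts San Francisco ahead of New York and New York ahead of Chicago.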
On the Power Rank website, Feng currently has a list of all 120 NCAA college football teams, as well as a list of all 32 NFL teams ranked by the algorithm. So far, he has not compared the algorithm's predictions to individual game results, but he plans to do so soon. In addition, Feng plans to apply the algorithm to international soccer before next year's World Cup.
Although it may be tempting to use the Power Rank algorithm to predict individual games, Feng explains that the purpose of the algorithm is primarily to rank teams, not to guess the future.
"It is impossible to predict the outcome of a single game," Feng said. "There are so many arbitrary factors to take into consideration, such as injuries. The Power Rank is truly about ranking teams based on how they have performed in the past. The ability to predict future outcomes based only on past scores serves to validate the rankings."
© 2009 PhysOrg.com