Making sports statistics more scientific
Whether it is the sprinter who finished first or the team that scored more points, it's usually easy to determine who won a sporting event. But finding the statistics that explain why an athlete or team wins is more difficult -- and major figures at the intersection of sports and numbers are determined to crack this problem.
Many statistics, such as the number of points scored by a point guard, a quarterback's passing yards, or a slugger's batting average, explain part of the picture, especially in team sports. But many of these numbers -- some of them sacred among sports fans -- don't directly address a player's contribution to winning. This was a primary topic of discussion last weekend at the Sloan Sports Analytics Conference in Boston.
Organized by students from the MIT Sloan School of Management and sponsored by several sports-related companies, including media outlet ESPN, the conference brought together over 2,200 people to discuss player evaluation and other factors important to the business of sports.
Many of the research presentations and panel discussions described efforts to remove subjective judgments from sports statistics -- and to define new statistics that more directly explain a player's value.
"We have huge piles of statistics now," said Bill James, Boston Red Sox official and baseball statistics pioneer, at a panel discussion about adding modern statistics to box scores. "What you have to do is reduce it to significant but small concepts."
New technology and analysis are only now making it possible to learn more about many fundamental events in several sports, events that traditional sports statistics rarely address.
"We're going to talk about stats that work and stats that don't work," said John Walsh, executive vice president of ESPN, who moderated the box score panel discussion.
The panel, which also included three other experts, cited several examples of statistics that didn't work: a receiver might drop a pass for one of several reasons -- but rarely are drops broken down into categories; an assist in basketball is a judgment call with room for different interpretations; and fielding percentage in baseball only generally describes a defensive player's ability.
In another session, Greg Moore, the director of baseball products for the sports graphics and visualization company Sportvision, described recent data-collection advances in baseball. Once all of its systems are fully deployed in Major League Baseball stadiums, the company plans to track the trajectory of each pitch thrown, the movement of all the players on the field and the speed of every swing and hit ball. The systems, already fully installed in some ballparks, will collect over a million data points at every game. Some of this data is publicly available.
The data will make it possible to say not just that a player hit a double or that he hit a hard line drive, but that the ball left the bat at a certain speed and launch angle and a certain number of degrees from the foul line. No scout or official scorer can contaminate those kinds of measures with subjectivity. On the other hand, a string of objective data is not inherently more useful than a flawed statistic, which may contain useful wisdom.
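As a rough illustration of how such measures could be derived from tracking data, the Python sketch below converts a batted ball's velocity into an exit speed, a launch angle and an angle from the foul line. The function name, the coordinate system (home plate at the origin, y-axis toward center field, x-axis toward the first-base side, z-axis up) and the choice of the left-field foul line as the reference are all assumptions made for this example, not details of Sportvision's systems.

```python
import math


def describe_batted_ball(vx, vy, vz):
    """Summarize a batted ball from its velocity components.

    Assumed (hypothetical) coordinates: y points from home plate toward
    center field, x points toward the first-base side, z points up.
    Returns (exit_speed, launch_angle_deg, degrees_from_left_field_line).
    """
    # Exit speed is the magnitude of the velocity vector.
    exit_speed = math.sqrt(vx * vx + vy * vy + vz * vz)

    # Launch angle is the elevation above the horizontal plane.
    horizontal_speed = math.hypot(vx, vy)
    launch_angle = math.degrees(math.atan2(vz, horizontal_speed))

    # Horizontal direction relative to the center-field axis:
    # negative toward left field, positive toward right field.
    spray_angle = math.degrees(math.atan2(vx, vy))

    # The foul lines sit 45 degrees either side of the center-field axis,
    # so the left-field line is at -45 degrees in this convention.
    degrees_from_left_line = spray_angle + 45.0

    return exit_speed, launch_angle, degrees_from_left_line


# Made-up velocity components, in any consistent unit of speed.
print(describe_batted_ball(10.0, 90.0, 40.0))
```

The result is the kind of three-number description the article mentions: how hard the ball was hit, how steeply it rose and where on the field it was headed.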
During the box-score panel discussion, Dean Oliver, ESPN's sports analytics director, said that collecting information this way opens a new frontier.
"It's an immense amount of data, but you have to know what to do with it," said Oliver.
The winner of the conference's research paper competition found one way to make new data useful. Using SportVU, a basketball database collected by the company STATS, a team from the University of Southern California's computer science department studied basketball rebounding from first principles. The data shows the movement of all the players and the ball, including rebounds, passes and other game events.
The research team showed empirically what had previously been accessible only through inference and experience. They showed that almost all rebounds have dropped below eight feet of elevation by the time they travel 14 feet from the hoop -- within easy reach of a basketball player. The researchers were also able to compare shot distance with rebound distance and to show where strategic changes might affect offensive rebounding success.
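A check of this kind can be sketched in a few lines of Python. The example below is not the researchers' method. It assumes each rebound is stored as a time-ordered list of (distance from hoop, height) samples in feet, and the function name and the sample trajectories are invented for illustration.

```python
def share_reachable_at_distance(trajectories, distance_ft=14.0, height_ft=8.0):
    """Among rebounds that travel at least distance_ft from the hoop,
    return the fraction that are below height_ft at the first sample
    at or beyond that distance. Returns None if no rebound gets that far.

    Each trajectory is a time-ordered list of
    (distance_from_hoop_ft, height_ft) samples; this layout is assumed.
    """
    reached = 0
    reachable = 0
    for samples in trajectories:
        for dist, height in samples:
            if dist >= distance_ft:
                reached += 1
                if height < height_ft:
                    reachable += 1
                break  # only the first crossing of the distance matters
    return reachable / reached if reached else None


# Invented sample data, not taken from SportVU.
example_rebounds = [
    [(2.0, 11.0), (8.0, 10.0), (14.5, 6.0)],  # past 14 ft, below 8 ft
    [(3.0, 12.0), (9.0, 11.0), (15.0, 9.0)],  # past 14 ft, still above 8 ft
    [(1.0, 10.0), (5.0, 7.0)],                # never reaches 14 ft
]
print(share_reachable_at_distance(example_rebounds))
```

Rebounds that never reach the distance threshold are left out of the denominator here; a different analysis might reasonably treat them another way.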
Rajiv Maheswaran, the researcher who presented the paper, compared the effort to find new insights about sports to astronomy. Once you start looking at the stars, he said, you make discoveries, which lead to new hypotheses and more research.