May 10, 2017

Why big-data analysis of police activity is inherently biased

by William Isaac and Andi Dixon, The Conversation

In early 2017, Chicago Mayor Rahm Emanuel announced a new initiative in the city's ongoing battle with violent crime. The most common solutions to this sort of problem involve hiring more police officers or working more closely with community members. But Emanuel declared that the Chicago Police Department would expand its use of software, enabling what is called "predictive policing," particularly in neighborhoods on the city's south side.

The Chicago police will use data and computer analysis to identify neighborhoods that are more likely to experience violent crime, assigning additional police patrols in those areas. In addition, the software will identify individual people who are expected to become – but have yet to be – victims or perpetrators of violent crimes. Officers may even be assigned to visit those people to warn them against committing a violent crime.

Any attempt to curb the alarming rate of homicides in Chicago is laudable. But the city's new effort seems to ignore evidence, including recent research from members of our policing study team at the Human Rights Data Analysis Group, that predictive policing tools reinforce, rather than reimagine, existing police practices. Their expanded use could lead to further targeting of communities or people of color.

Working with available data

At its core, any predictive model or algorithm is a combination of data and a statistical process that seeks to identify patterns in the numbers. This can include looking at police data in hopes of learning about crime trends or recidivism. But a useful outcome depends not only on good mathematical analysis: It also needs good data. That's where predictive policing often falls short.

Machine-learning algorithms learn to make predictions by analyzing patterns in an initial training data set and then looking for similar patterns in new data as they come in. If they learn the wrong signals from the data, their subsequent predictions will be unreliable.

This happened with a Google initiative called "Flu Trends," which was launched in 2008 in hopes of using information about people's online searches to spot disease outbreaks. Google's systems would monitor users' searches and identify locations where many people were researching various flu symptoms. In those places, the program would alert public health authorities that more people were about to come down with the flu.

But the project failed to account for the potential for periodic changes in Google's own search algorithm. In an early 2012 update, Google modified its search tool to suggest a diagnosis when users searched for terms like "cough" or "fever." On its own, this change increased the number of searches for flu-related terms. But Google Flu Trends interpreted the data as predicting a flu outbreak twice as big as federal public health officials expected and far larger than what actually happened.
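The mechanism can be illustrated with a toy calculation. The sketch below is not Google's model: it assumes a simple straight-line relationship between search volume and flu cases, and every number in it is invented for illustration. A model fitted while each case generates a fixed number of searches will double its estimate if an interface change doubles the searches per case, even though the number of sick people has not changed.

```python
def fit_cases_per_search(searches, cases):
    """Least-squares slope through the origin: cases ~ slope * searches."""
    numerator = sum(s * c for s, c in zip(searches, cases))
    denominator = sum(s * s for s in searches)
    return numerator / denominator

# Hypothetical training period: every flu case produces 5 searches.
train_searches = [500, 1000, 1500, 2000]
train_cases = [100, 200, 300, 400]
slope = fit_cases_per_search(train_searches, train_cases)

# Hypothetical later week with 300 real cases.
actual_cases = 300
searches_before_change = 1500  # still 5 searches per case
searches_after_change = 3000   # assumed: interface change doubles searches per case

predicted_before = slope * searches_before_change
predicted_after = slope * searches_after_change

print("actual cases:", actual_cases)
print("prediction before the change:", round(predicted_before))
print("prediction after the change:", round(predicted_after))
```

The model's arithmetic is the same in both cases. What changed is the meaning of the input it was given.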

Criminal justice data are biased

The failure of the Google Flu Trends system was a result of one kind of flawed data – information biased by factors other than what was being measured. It's much harder to identify bias in criminal justice prediction models. In part, this is because police data aren't collected uniformly, and in part it's because the data police do track reflect longstanding institutional biases along income, race and gender lines.

While police data often are described as representing "crime," that's not quite accurate. Crime itself is a largely hidden social phenomenon that happens anywhere a person violates a law. What are called "crime data" usually tabulate specific events that aren't necessarily lawbreaking – like a 911 call – or that are influenced by existing police priorities, like arrests of people suspected of particular types of crime, or reports of incidents seen when patrolling a particular neighborhood.

Neighborhoods with lots of police calls aren't necessarily the places where the most crime is happening. They are, rather, where the most police attention is – though where that attention focuses can often be biased by gender and racial factors.
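A toy simulation, not drawn from any real data set, can make the distinction concrete. It assumes two hypothetical neighborhoods with identical underlying offense rates, and assumes an offense enters the records only when police happen to be present. The rates, patrol shares and function name are all invented for illustration.

```python
import random

def count_recorded_offenses(true_rate, patrol_share, n_opportunities=10_000, seed=0):
    """Count offenses that end up in police records.

    Assumption: an offense is recorded only if it occurs AND police
    are present to observe it.
    """
    rng = random.Random(seed)
    recorded = 0
    for _ in range(n_opportunities):
        offense_occurs = rng.random() < true_rate
        police_present = rng.random() < patrol_share
        if offense_occurs and police_present:
            recorded += 1
    return recorded

TRUE_RATE = 0.10  # identical in both hypothetical neighborhoods

records_a = count_recorded_offenses(TRUE_RATE, patrol_share=0.8, seed=1)
records_b = count_recorded_offenses(TRUE_RATE, patrol_share=0.2, seed=2)

print("recorded in heavily patrolled neighborhood A:", records_a)
print("recorded in lightly patrolled neighborhood B:", records_b)
```

Under these assumptions, neighborhood A accumulates several times as many records as neighborhood B, although offenses are equally common in both.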

It's not possible to remove the bias

Some researchers have argued that systemic biases can be addressed by designing "neutral" machine-learning models that don't take into account sensitive variables like race or gender. But while that may seem possible in hypothetical situations, it doesn't appear to hold in real life.

Our recent study, by Human Rights Data Analysis Group's Kristian Lum and William Isaac, found that predictive policing vendor PredPol's purportedly race-neutral algorithm targeted black neighborhoods at roughly twice the rate of white neighborhoods when trained on historical drug crime data from Oakland, California. We found similar results when analyzing the data by income group, with low-income communities targeted at disproportionately higher rates compared to high-income neighborhoods.

But estimates – created from public health surveys and population models – suggest illicit drug use in Oakland is roughly equal across racial and income groups. If the algorithm were truly race-neutral, it would spread drug-fighting police attention evenly across the city.

Similar evidence of racial bias was found by ProPublica's investigative reporters when they looked at COMPAS, an algorithm predicting a person's risk of committing a crime, used in bail and sentencing decisions in Broward County, Florida, and elsewhere around the country. These systems learn only what they are presented with; if those data are biased, their learning can't help but be biased too.
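One way to see why is a simplified allocation rule, sketched below. It is not PredPol's algorithm, nor the method used in the Oakland study. It is a hypothetical rule that assigns each period's patrols in proportion to the previous period's recorded incidents, and it assumes that recorded incidents scale with both the true offense rate and the share of patrols present. The rule never looks at race or income, yet it cannot undo an imbalance that is already in the records.

```python
def run_allocation_loop(true_rates, initial_patrol_shares, rounds=10):
    """Repeatedly reallocate patrols in proportion to recorded incidents.

    Assumption: expected recorded incidents in an area equal
    true_rate * patrol_share for that area.
    """
    shares = list(initial_patrol_shares)
    history = [shares]
    for _ in range(rounds):
        recorded = [rate * share for rate, share in zip(true_rates, shares)]
        total = sum(recorded)
        shares = [r / total for r in recorded]
        history.append(shares)
    return history

# Two hypothetical areas with equal true offense rates,
# but a historical 60/40 split in patrol attention.
history = run_allocation_loop(
    true_rates=[0.10, 0.10],
    initial_patrol_shares=[0.6, 0.4],
)

for period, shares in enumerate(history):
    print(period, [round(s, 3) for s in shares])
```

With equal true rates, an even 50/50 split would match reality, but the historical 60/40 split simply reproduces itself in every period.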

Fixing this problem is not a matter of just doing more advanced mathematical or statistical calculations. Rather, it will require rethinking how police agencies collect and analyze data, and how they train their staff to use data on the job.

Understanding the biases to improve the data

Using predictive analytics in the real world is challenging, particularly when trying to craft government policies to minimize harm to vulnerable populations. We do not believe that police departments should stop using analytics or data-driven approaches to reducing crime. Rather, police should work to understand the biases and limitations inherent in their data.

In our view, police departments – and all agencies that use predictive algorithms – should make their systems transparent to public scrutiny. This should start with community members and police departments discussing policing priorities and measures of police performance. That way any software the police use can be programmed to reflect the community's values and concerns.

Ensuring transparency

It is not enough to claim or assume an algorithm is unbiased just because it is computerized and uses data: A lack of bias must be proven by evaluating the algorithm's performance itself. Police agencies should get independent experts or human rights groups to perform regular audits of the algorithms and the data they process. Much like the annual financial reviews large companies do, these examinations can ensure the input data are valid and are analyzed properly to avoid discrimination. If a company wants to claim its algorithm is proprietary and should be kept secret, it should still be required to offer robust testing environments so outside experts can examine its performance.
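A basic audit of this kind could start by comparing how often a system flags areas associated with different groups. The sketch below assumes a hypothetical list of areas, each labeled with a demographic group and with whether the algorithm flagged it for extra patrols. The field names and figures are invented, and a real audit would need far more than this single ratio.

```python
from collections import defaultdict

def targeting_rates(areas):
    """Return the share of areas flagged by the algorithm, per group."""
    flagged = defaultdict(int)
    total = defaultdict(int)
    for area in areas:
        total[area["group"]] += 1
        if area["flagged"]:
            flagged[area["group"]] += 1
    return {group: flagged[group] / total[group] for group in total}

def disparity_ratio(rates, group, reference_group):
    """How many times more often `group` is flagged than `reference_group`."""
    return rates[group] / rates[reference_group]

# Hypothetical audit sample: 10 areas per group.
sample_areas = (
    [{"group": "A", "flagged": True}] * 4
    + [{"group": "A", "flagged": False}] * 6
    + [{"group": "B", "flagged": True}] * 2
    + [{"group": "B", "flagged": False}] * 8
)

rates = targeting_rates(sample_areas)
ratio = disparity_ratio(rates, group="A", reference_group="B")

print("targeting rates:", rates)
print("disparity ratio A vs. B:", ratio)
```

A ratio well above 1 does not by itself prove discrimination, but it tells auditors where to look more closely, for instance by comparing the flagged rates against independent estimates of the underlying behavior.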

Further, police departments that use algorithms to make predictions about individuals, like Chicago's Strategic Subject List does, should have policies similar to a new European Union regulation requiring human-understandable explanations of computer algorithms' decisions. And no agency or company should be allowed to discriminate against people who have been identified by predictive policing.

Used correctly, predictive policing can help address the complex factors underlying crime trends. For example, rather than stepping up patrols, Toronto and other cities in Canada are using predictive modeling to connect residents to local social services. By improving the quality of data cities collect, and analyzing the information with more transparent and inclusive processes, cities can build safer communities, rather than cracking down harder on areas that are already struggling.

Provided by The Conversation

This article was originally published on The Conversation.

Citation: Why big-data analysis of police activity is inherently biased (2017, May 10) retrieved 30 June 2024 from https://phys.org/news/2017-05-big-data-analysis-police-inherently-biased.html
