Discrimination, lack of diversity, and societal risks of data mining highlighted in Big Data special issue
A special issue of Big Data presents a series of insightful articles that focus on Big Data and Social and Technical Trade-Offs. Despite the dramatic growth in big data affecting many areas of research, industry, and society, there are risks associated with the design and use of data-driven systems. Among these are issues of discrimination, diversity, and bias, which are discussed in the papers in the special issue organized by Guest Editors Solon Barocas, Princeton University; danah boyd, Microsoft Research and Data & Society Research Institute; Sorelle Friedler, Haverford College; and Hanna Wallach, Microsoft Research and University of Massachusetts Amherst.
Coauthors Bettina Berendt, University of Leuven, Belgium and Soren Preibusch, Microsoft Research, Cambridge, U.K., focus on how and why discrimination can be a problem with big data and decisions made by humans based on data-mining. In the article "Toward Accountable Discrimination-Aware Data Mining: The Importance of Keeping the Human in the Loop—and Under the Looking Glass," the researchers present the results of a large-scale experiment in which human subjects described their reasoning for deciding whether or not a loan request should be granted. The authors offer strategies for making decision-making "discrimination-aware in an accountable way".
In the article entitled "Diversity in Big Data: A Review," coauthors Marina Drosou and Evaggelia Pitoura, University of Ioannina, Greece; HV Jagadish, University of Michigan; and Julia Stoyanovich, Drexel University, emphasize the risks big data may pose to society and individuals if it fails to account for diversity and potential discrimination. The authors discuss connections between diversity and fairness in big data systems research using specific applications such as the use of big data in matchmaking, crowdsourcing, and search and content recommendation.
Gina Neff, PhD, University of Oxford, U.K.; and Anissa Tanweer, Brittany Fiore-Gartland, PhD, and Laura Osburn, PhD, University of Washington, Seattle, discuss the many challenges that confront data scientists and how those challenges involve ethical issues in model development. In the article entitled "Critique and Contribute: A Practice-Based Framework for Improving Critical Data Studies and Data Science," the researchers highlight the importance of common criticisms of current data science and present a strategy for addressing them, showing how to incorporate ethical issues in data science applications.
"This issue is a landmark for all data scientists," says Big Data Editor-in-Chief Vasant Dhar, Professor at the Stern School of Business and the Center for Data Science at New York University. "It is important for data scientists to be aware of the broader social and ethical aspects of their work, beginning with a critical analysis of the origins of data and issues such as the potential for creating further bias and discrimination in models derived from the data. It is important to understand how and why data were created, what they include and exclude, and when they might lead to systems that create undesirable consequences for society." He continues, "In this age of big data and automated decision-making systems, where there is often no human in the loop, it is important that we design such systems to take into account ethical and other social considerations associated with their use."