February 25, 2016

Massive data analysis helps uncover black women's experiences

It is often said that history is written by the victors. But it's probably more true to say it is written by the people who have the opportunity to write.

One example of this is the study of black women, their lives and their experiences. Documents recording the lives of black women are often historically obscure, hidden away in vast library collections and titled or cataloged in unintentionally misleading ways. Other historical documents don't mention black women directly but may still offer clues. Until recently, researchers had no good way of recovering this "lost history" from either of these categories of documents.

Ruby Mendenhall, an associate professor of sociology, African American studies and urban and regional planning at the University of Illinois at Urbana-Champaign, is leading a collaboration of social scientists, humanities scholars and digital researchers. The group hopes to harness the power of high-performance computing to find and understand the historical experiences of black women by searching two massive databases of written works from the 18th through 20th centuries. The team is also developing a common toolbox that can help other digital humanities projects.

"With a Big Data approach we get a chance to make use of hundreds of thousands of texts—journals, books, periodicals," Mendenhall says. "The number is greater than what you would normally be able to look at during an entire career."

Powering up

Mendenhall's team realized that to search tens or even hundreds of thousands of books, articles and letters, they'd need considerably more computing power than is available on a typical university computer cluster. They consulted with colleagues on campus who were members of the National Science Foundation (NSF)-supported Extreme Science and Engineering Discovery Environment (XSEDE), the most advanced collection of integrated digital resources and services in the world. Those colleagues helped them identify the Blacklight supercomputer at the Pittsburgh Supercomputing Center (PSC) as a good fit for their project.

Blacklight (now retired) allowed the researchers to analyze 20,000 documents from the HathiTrust and JSTOR databases that were known to contain information about black women and to create a computational model based on this corpus of documents. They are now using this model to study all 800,000 documents in both databases.

Words translated into numbers, graphics

To make sense of the huge datasets, the investigators turned to two sets of computational techniques: topic modeling and data visualization.

Topic modeling looks at how often certain keywords appear in connection with other terms. For example, a book that contains the word "negro"—at the time considered the most respectful term to describe black men and women—the word "vote" and the word "women" might offer clues about black women's participation in the women's suffrage movement. Mike Black, formerly at the University of Illinois and currently at the University of Massachusetts, headed the team's topic modeling project.
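The idea of keywords appearing "in connection with" each other can be sketched in a few lines of Python. This is only a minimal illustration, not the team's actual pipeline: it counts how many documents contain each pair of keywords, which is far simpler than a full topic model. The sample texts, the KEYWORDS set and the function names are all hypothetical.

```python
import re
from collections import Counter
from itertools import combinations

# Hypothetical keywords of interest, taken from the example in the text.
KEYWORDS = {"negro", "vote", "women"}

def tokenize(text):
    """Lowercase the text and split it into alphabetic word tokens."""
    return re.findall(r"[a-z]+", text.lower())

def cooccurrence_counts(documents, keywords=KEYWORDS):
    """Count, for each pair of keywords, how many documents contain both."""
    counts = Counter()
    for text in documents:
        # Keywords that occur at least once in this document, in a stable order.
        present = sorted(keywords & set(tokenize(text)))
        for pair in combinations(present, 2):
            counts[pair] += 1
    return counts

# Made-up sample texts, for illustration only.
docs = [
    "Negro women organized to win the vote.",
    "The women met to discuss the vote.",
    "A report on crop prices.",
]

pair_counts = cooccurrence_counts(docs)
for pair, n in sorted(pair_counts.items()):
    print(pair, n)
```

Pairs that co-occur in many documents are candidates for a shared theme. A real topic model would go further and infer groups of related words statistically rather than from a fixed keyword list.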

"We're hoping, in the next stage, to ramp up and check these topics against the larger corpus of works," Mendenhall adds.

Mark Van Moer, an XSEDE staff member at the University of Illinois's National Center for Supercomputing Applications, worked as the team's visualization specialist.

As part of the project, he built ways of displaying results that help make more intuitive sense of the data. For instance, a "tree map" displays keywords in boxes sized according to each word's frequency, whereas a "network graph" charts how often keywords appear close to each other, offering insight into how those words are being used and what they mean in context. Yet another visualization technique plots key terms in histographs that allow users to track the emergence and prominence of a given topic over time.
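The numbers behind these three displays could be computed along the following lines. This is a rough sketch under stated assumptions, not the project's visualization code: the function names, the five-token proximity window and the sample records are all hypothetical, and the drawing itself would be left to a plotting library.

```python
import re
from collections import Counter

def tokenize(text):
    """Lowercase the text and split it into alphabetic word tokens."""
    return re.findall(r"[a-z]+", text.lower())

def keyword_frequencies(texts, keywords):
    """Tree map input: total occurrences of each keyword across all texts."""
    freq = Counter()
    for text in texts:
        freq.update(t for t in tokenize(text) if t in keywords)
    return freq

def proximity_edges(texts, keywords, window=5):
    """Network graph input: how often two different keywords occur
    within `window` tokens of each other (assumed window size)."""
    edges = Counter()
    for text in texts:
        hits = [(i, t) for i, t in enumerate(tokenize(text)) if t in keywords]
        for a in range(len(hits)):
            for b in range(a + 1, len(hits)):
                (i, w1), (j, w2) = hits[a], hits[b]
                if j - i > window:
                    break  # hits are in position order, so later ones are farther away
                if w1 != w2:
                    edges[tuple(sorted((w1, w2)))] += 1
    return edges

def mentions_by_year(dated_texts, keyword):
    """Timeline input: number of texts per year that mention the keyword."""
    years = Counter()
    for year, text in dated_texts:
        if keyword in tokenize(text):
            years[year] += 1
    return years

# Made-up (year, text) records, for illustration only.
records = [
    (1919, "The club women spoke of the vote and the club."),
    (1920, "Women cast a vote."),
    (1920, "The club held a meeting."),
]
texts = [text for _, text in records]
keywords = {"club", "women", "vote"}

freq = keyword_frequencies(texts, keywords)         # box sizes for a tree map
edges = proximity_edges(texts, keywords, window=5)  # edge weights for a network graph
timeline = mentions_by_year(records, "vote")        # bar heights for a plot over time
print(freq, edges, timeline, sep="\n")
```

Each function returns a plain counter, so the same results can feed any of the three displays without recomputing the text processing.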

Making sense of the numbers

One aspect of the research involved explorations of the post-World War I Black Women's Club and the New Negro movement. A keyword search revealed that many of the documents that referenced one topic also referenced the other, confirming Mendenhall's prediction that these historical activities were linked. The finding raises interesting questions about how the two movements, which historians knew were contemporaneous, may have interacted. The Illinois researchers hope to begin answering these questions in their ongoing work at PSC, as well as their proposed work on Bridges, an NSF-funded supercomputer coming online later this year.

"The beauty of computation and Big Data lies in how it complements the traditional close reading," says Nicole Brown, a postdoctoral fellow in Mendenhall's group who is interpreting the computational results in light of black feminist theory. "The two methods complement each other to give you a full picture of what's going on."

Van Moer adds that working with social science and humanities researchers "has been a real eye opener in a lot of ways. In the previous seven years, I pretty much worked with physical scientists. Humanities and social science researchers have to be worried about not just what the numbers mean at a surface level. They have a whole theory behind how you go about interpreting things as they relate to the larger society—that's really an interesting aspect of the project for me."

Another group goal is to create a set of computational tools that researchers in many fields will be able to use to search various texts for topics of interest and to understand how those topics interrelate. Topic modeling and visualization methods can be modules in a larger toolbox for digital humanities research.

"We're generally interested in black women and their life experience," Mendenhall says. "But we also see this as a tool that social scientists and people in the humanities can use to study many topics."

More information: Pittsburgh Supercomputing Center: www.psc.edu

PSC's new Bridges system: psc.edu/index.php/resources-fo … ng-resources/bridges

Provided by National Science Foundation

Citation: Massive data analysis helps uncover black women's experiences (2016, February 25) retrieved 24 April 2024 from https://phys.org/news/2016-02-massive-analysis-uncover-black-women.html
