July 23, 2018

Alarming error common in survey analyses

It is difficult to overstate the importance of survey data: They tell us who we are and, in the hands of policymakers, what to do.

It had long been apparent to Brady West, an expert on survey methodology at the University of Michigan, Ann Arbor, that the benefits of survey data coexisted with a lack of training in how to interpret them correctly, especially when it came to secondary analyses—researchers reanalyzing survey data that had been collected by a previous study.

"In my consulting work for organizations and businesses, people would come in and say, 'Well, here's my estimate of how often something occurs in a population,' such as the rate of a disease or the preferences for a political party. And they'd want to know how to interpret that. I would respond, 'Have you accounted for weighting in the survey data you're using—or, did you account for the sample design?' And I would say, probably 90 percent of the time, they'd look at me and have no idea what I was talking about. They had never learned about the fundamental principles of working with survey data in their standard Intro to Stats classes."

As a survey methodologist, West wondered whether his experience was indicative of a systemic problem. There wasn't much in the academic literature to answer the question, so he and his colleagues, Joseph Sakshaug and Guy Aurelien, sampled 250 papers, reports and presentations—all available online, all conducting secondary analyses of survey data—to see if these analytic errors were, indeed, common.

"It was quite shocking," says West. "Only about half of these analyses claimed to account for weighting, the impact of sample designs on variance estimates was widely misunderstood and there was no sign of improvement in these problems over time." But possibly worst of all, these problems were just as prevalent in the peer-reviewed literature in their sample as they were in technical reports and conference presentations. "That's what was really most shocking to me," says West. "The peer-review process was not catching these errors."

An alarming example of what can happen when you compute an estimate but ignore the survey weighting can be found in the 2010 National Survey of College Graduates (NSCG). "This is a large national survey of college graduates, and they literally say in their documentation that they're oversampling individuals with science and engineering degrees," says West. "If you take account of the weighting, which corrects for this oversampling, about 30 percent of people are getting science and engineering degrees; if you forget about the weighting, you extrapolate the oversample to the entire population, and suddenly 55 percent of people have science and engineering degrees."
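The effect West describes can be reproduced with a small synthetic example. The sketch below does not use NSCG data: the population size, the sample sizes, and the helper name `weighted_share` are all assumptions chosen for illustration. It sets up a population in which 30 percent hold science and engineering (S&E) degrees, oversamples that group, and compares the unweighted sample share with the share weighted by the inverse of each person's selection probability.

```python
# Synthetic illustration only; none of these numbers come from the NSCG.

def weighted_share(flags, weights):
    """Weighted proportion of records whose flag is True."""
    total = sum(weights)
    hits = sum(w for f, w in zip(flags, weights) if f)
    return hits / total

# Hypothetical population: 30,000 S&E graduates and 70,000 others (30% S&E).
pop_se, pop_other = 30_000, 70_000

# Hypothetical design: S&E graduates are deliberately oversampled.
n_se, n_other = 1_100, 900

# Each sampled person gets a weight equal to 1 / selection probability,
# i.e. the number of people in the population that person represents.
w_se = pop_se / n_se
w_other = pop_other / n_other

flags = [True] * n_se + [False] * n_other
weights = [w_se] * n_se + [w_other] * n_other

unweighted = sum(flags) / len(flags)
weighted = weighted_share(flags, weights)

print(f"unweighted S&E share: {unweighted:.2f}")
print(f"weighted S&E share:   {weighted:.2f}")
```

With these made-up inputs the unweighted share comes out at 0.55 and the weighted share at 0.30, the same kind of gap West reports. Real surveys usually ship a ready-made weight variable, so an analyst would normally use that column rather than rebuild weights by hand.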

Ironically, better sampling of under-studied populations may be exacerbating the problem. "There's a lot of interest in under-represented populations, such as Hispanics," says West. "So, a lot of national surveys oversample these groups and others to create a big enough sample for researchers to adequately study. But when Average Joe Researcher grabs all the data—not just the data from the subpopulation they're interested in, but everybody, whites, African Americans, and Hispanics—and then they try to analyze all that data collectively, that's when oversampling can have a horrible effect on the overall picture if that feature of the sample design is not accounted for correctly in estimation."

There are many easy-to-use software tools that can account for the sampling and weighting complexities associated with survey data, but the fact that they are not being used speaks to the underlying problem.
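One of the complexities such tools handle is the effect of unequal weights on precision. As a rough sketch, and not a substitute for dedicated survey software, Kish's approximation of the effective sample size can be computed by hand. The weights below are hypothetical, and the function name `kish_effective_n` is introduced only for this illustration. The approximation reflects unequal weighting alone and ignores stratification and clustering.

```python
def kish_effective_n(weights):
    """Kish's approximate effective sample size: (sum w)^2 / sum(w^2)."""
    total = sum(weights)
    total_sq = sum(w * w for w in weights)
    return total * total / total_sq

# Hypothetical weights: 1,100 oversampled respondents with small weights
# and 900 other respondents with large weights.
weights = [30_000 / 1_100] * 1_100 + [70_000 / 900] * 900

n = len(weights)
n_eff = kish_effective_n(weights)
deff = n / n_eff  # approximate design effect due to unequal weighting

print(f"nominal n = {n}, effective n = {n_eff:.0f}, design effect = {deff:.2f}")
```

When all weights are equal the function returns the nominal sample size; unequal weights give a smaller value, which is why standard errors computed as if the data were a simple random sample tend to be too optimistic.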

"This problem originates in the fact that the people publishing these articles just aren't told about any of this in their training," says West. "We've known about the importance of survey weighting for nearly a century—but somehow how to deal with weighted survey data hasn't penetrated the statistics classes that researchers take at the undergraduate or graduate level. We spend a fortune on doing national surveys—and who knows how much misinterpreting that data is costing us."

To solve that problem, West is helping design a MOOC (massive open online course) at the University of Michigan that introduces statistics using the programming language Python. Weighting and correct survey analysis will be taught in the very first course of that specialization. "We're really focusing on making sure that before you jump into any analyses of survey data, you have a really firm understanding of how the data were collected and where they came from."

More information: JSM talk: http://ww2.amstat.org/meetings/jsm/2018/onlineprogram/AbstractDetails.cfm?abstractid=326973

Study link: http://journals.plos.org/plosone/article?id=10.1371/journal.pone.0158120

Provided by American Statistical Association

Citation: Alarming error common in survey analyses (2018, July 23) retrieved 17 July 2024 from https://phys.org/news/2018-07-alarming-error-common-survey-analyses.html

