January 22, 2014

Crowdsourcing a living map of world health

What if, by collecting data from mobile medical apps on cell phones around the world, we could map significant problems and see the flu coming like a giant whirling hurricane? A team of engineers, biologists and medical researchers at the University of California, San Diego wants to leverage the widespread use of smartphone technology and cloud computing to build maps of large-scale health problems or environmental damage, such as the concentration of heavy metals in drinking water. The idea is based on the principle that health, including infectious disease and environmental pollution, is a trackable geospatial event. The team is working towards developing a tricorder that could monitor both individual and environmental health. In phase one, citizen sensors will test their drinking water using a simple test strip device that automatically sends the test results to a central data server for analysis while telling the tester whether the water is safe to drink.

They also hope their crowdsourced research will inspire support from individual investors through Indiegogo, in the first partnership between UC San Diego and a crowdfunding platform. Backers can, for example, pay for the deployment of a sensor to an individual in one of five developing countries, or buy one to monitor their own water plus one for an individual in another country. The project is led by Dr. Eliah Aronoff-Spencer of UC San Diego School of Medicine and Qualcomm Institute research scientist Albert Yu-Min Lin.

Their quest to use the sensors to create a living map of world health, including infectious disease and environmental pollution, was featured in a Jan. 14 story by Fast Company. We sat down with Andrew Huynh, a Ph.D. student in the Department of Computer Science and Engineering who is leading development of the team's data storage and analysis platform, to discuss the role of machine learning in the project. In machine learning, computers learn how to do something (how to distinguish a burial site from a pile of rocks in satellite images, for example) from the continuous input and analysis of data. When humans are in the loop, as with crowdsourcing, Huynh says the picture can become hazy with inaccurate information. Part of his job is to train the computer to distinguish accurate data from erroneous data.

Q: What is your role as "lead data scientist" on this project?

As lead data scientist, I helped architect and develop our open health stack, a cloud analytics and storage platform that will be used to detect large-scale trends in data from sensors, individuals, and the environment. I also manage a small team of undergraduate researchers who work on various components as a way to gain knowledge of the latest best practices in programming.

Q: How did you end up in this line of research?

I went into my Ph.D. program as part of the Valley of the Khans project, which was supported by the National Geographic Society and aimed at finding the tomb of Genghis Khan in Mongolia. The project involved a nondestructive archaeological survey utilizing modern digital tools from a variety of disciplines, including digital imagery, computer vision, nondestructive surveying, and on-site digital archaeology. I focused on combining machine learning applied to satellite images with human computation, as a way to use human perception to help solve difficult problems.

Q: You raise an interesting idea there. Using human perception to solve difficult problems? Do you mean that human perception adds something digital images and computers miss?

Human perception is based on years upon years of complex pattern recognition and learning that we continue to do throughout our lives. There are many things that are still very difficult for computers but very easy for humans, such as recognizing objects in images or understanding the intent behind a block of text. The concept of harnessing human perception, "human-based computation", uses the strengths of both computers and humans to achieve a symbiotic human-computer relationship to solve difficult problems.

Q: Why is crowdsourcing research a big topic these days? What are the challenges in considering the reliability of the data?

Researchers and businesses are beginning to see how crowdsourcing is a powerful tool to solve problems or obtain services, ideas, and content that would normally be provided by a traditional employee. The idea is that a collective of people, whether experts or non-experts, will give an aggregated result that is equal to or better than that of an individual expert. However, because these results are input by people, who are prone to accidents or misperception, rather than by a deterministic machine, the quality of the data can vary from problem to problem, often with no quantitative verification process. Separating the signal from the noise in these huge datasets is a massive problem.
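One common way to turn many individual answers into a single aggregated result is a simple majority vote. The sketch below is illustrative only: the interview does not say which aggregation method the team uses, and the function name `majority_vote` and the sample reports are hypothetical.

```python
from collections import Counter

def majority_vote(labels):
    """Return the most common label, or None when the top two labels tie."""
    top_two = Counter(labels).most_common(2)
    if not top_two:
        return None  # no reports at all
    if len(top_two) == 2 and top_two[0][1] == top_two[1][1]:
        return None  # no clear majority
    return top_two[0][0]

# Hypothetical reports from five volunteers testing the same water source.
reports = ["safe", "safe", "unsafe", "safe", "unsafe"]
consensus = majority_vote(reports)
print(consensus)
```

Returning `None` on a tie is a design choice made for this sketch: it forces the caller to collect more reports rather than guess.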

Q: Is it your job as lead data scientist on this project to overcome this problem with the data? You're essentially teaching the computer how to distinguish good data from bad data, right? How do you do that?

Absolutely. Once our sensors are deployed, we'll be getting data from all over the world, allowing us to experiment with different methods to distinguish good data from bad data when it comes from hundreds if not thousands of different sources. We're starting small and then slowly ramping up so we can deal efficiently with each problem as we see it.
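The interview does not name the methods the team plans to try. As one hedged illustration of what "distinguishing good data from bad" can look like, the sketch below flags readings that sit far from the median of their group, using the modified z-score based on the median absolute deviation (MAD). The function name `flag_outliers`, the threshold of 3.5, and the sample readings are assumptions made for the example.

```python
from statistics import median

def flag_outliers(readings, threshold=3.5):
    """Return one boolean per reading: True if it looks like an outlier.

    Uses the modified z-score, 0.6745 * |x - median| / MAD, which is
    less sensitive to extreme values than a mean-based z-score.
    """
    if not readings:
        return []
    center = median(readings)
    mad = median([abs(x - center) for x in readings])
    if mad == 0:
        # More than half the readings are identical; flag anything that differs.
        return [x != center for x in readings]
    return [0.6745 * abs(x - center) / mad > threshold for x in readings]

# Hypothetical heavy-metal readings (arbitrary units) from one neighborhood.
readings = [4.8, 5.1, 5.0, 4.9, 5.2, 50.0]
flags = flag_outliers(readings)
print(flags)
```

A flagged reading is not necessarily wrong; it may be a faulty test strip or a real contamination event, so a flag like this is better treated as a prompt for a second measurement than as grounds for discarding data.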

Q: This shift towards a world in which individuals are sensors to be tracked and mapped raises considerable privacy questions. How do you think about privacy as a computer scientist trying to amass enough data to make inferences about trends in health and environmental pollution?

Privacy is a concern in every aspect of this field, especially when you're talking about someone's individual health data. How do we anonymize this data so we can't even trace it back to the country of origin and yet still get enough information to be statistically significant? Also, in machine learning, we need raw data straight from the original source to train the machine. The process of anonymizing data removes subtle differences that are important in the machine learning process. Anonymizing may tweak specific numbers or rip out whole sections of important information. Illnesses, for example, are often geographically anchored. If we rip that information out, how do we learn anything? So it's an ongoing problem.
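The trade-off described here can be made concrete with a small sketch. One simple way to blur location is to snap coordinates to a coarse grid; the function name `coarsen`, the one-degree cell size, and the sample coordinates are assumptions for illustration, and rounding like this is not by itself a formal anonymization guarantee.

```python
import math

def coarsen(lat, lon, cell_deg=1.0):
    """Snap a latitude/longitude pair to the south-west corner of its grid cell."""
    return (math.floor(lat / cell_deg) * cell_deg,
            math.floor(lon / cell_deg) * cell_deg)

# Two hypothetical sensor locations a few kilometers apart.
site_a = coarsen(32.88, -117.23)
site_b = coarsen(32.71, -117.16)
print(site_a, site_b)
```

Both sites land in the same cell, so an individual report is harder to trace, but any difference in water quality between the two sites is no longer visible in the coarsened data. A larger `cell_deg` hides more and reveals less, which is the tension the answer above describes.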

Provided by University of California - San Diego

Citation: Crowdsourcing a living map of world health (2014, January 22) retrieved 21 September 2024 from https://phys.org/news/2014-01-crowdsourcing-world-health.html

