February 5, 2021

Machine learning generates realistic genomes for imaginary humans

by Estonian Research Council

Thanks to novel algorithms and advances in computer technology, machines can now learn complex models and generate high-quality synthetic data, such as photo-realistic images or even résumés of imaginary humans. A study recently published in the international journal PLOS Genetics uses machine learning to mine existing biobanks and generate chunks of human genomes that do not belong to real humans but have the characteristics of real genomes.

"Existing genomic databases are an invaluable resource for biomedical research, but they are either not publicly accessible or shielded behind long and exhausting application procedures due to valid ethical concerns. This creates a major scientific barrier for researchers. Machine-generated genomes, or artificial genomes as we call them, can help us overcome the issue within a safe ethical framework," said Burak Yelmen, first author of the study and Junior Research Fellow of Modern Population Genetics at the University of Tartu.

The multidisciplinary team performed multiple analyses to assess the quality of the generated genomes compared to real ones. "Surprisingly, these genomes emerging from random noise mimic the complexities that we can observe within real human populations and, for most properties, they are not distinguishable from other genomes from the biobank we used to train our algorithm, except for one detail: they do not belong to any gene donor," said Dr. Luca Pagani, one of the senior authors of the study and a Mobilitas Pluss fellow.

The study also assesses how close the artificial genomes are to real genomes, to test whether the privacy of the original samples is preserved. "Although detecting privacy leaks among thousands of genomes could appear as looking for a needle in a haystack, combining multiple statistical measures allowed us to check all models carefully. Excitingly, the detailed exploration of complex leakage patterns can lead to improvements in generative model evaluation and design, and will fuel back the machine learning field," said Dr. Flora Jay, the coordinator of the study and a CNRS researcher at the Interdisciplinary computer science laboratory (LRI/LISN, Université Paris-Saclay, French National Centre for Scientific Research).
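As a rough illustration of what a proximity check of this kind can look like, the Python sketch below measures how far each artificial haplotype lies from its closest real one and compares that with how far real haplotypes lie from each other. It is a minimal sketch under stated assumptions, not the procedure used in the study: the choice of Hamming distance, the function names and the toy 0/1 matrices are all hypothetical and chosen only for illustration.

```python
import numpy as np

def hamming_matrix(a, b):
    """Pairwise Hamming distances between the rows of two 0/1 matrices."""
    a = np.asarray(a, dtype=np.int64)
    b = np.asarray(b, dtype=np.int64)
    # For binary rows, mismatches = (1 in a, 0 in b) + (0 in a, 1 in b).
    return a @ (1 - b).T + (1 - a) @ b.T

def nearest_real_distance(synthetic, real):
    """Distance from each synthetic haplotype to its closest real haplotype."""
    return hamming_matrix(synthetic, real).min(axis=1)

def nearest_other_real_distance(real):
    """Distance from each real haplotype to its closest *other* real haplotype."""
    d = hamming_matrix(real, real).astype(float)
    np.fill_diagonal(d, np.inf)  # ignore the zero distance of a row to itself
    return d.min(axis=1)

# Toy data (hypothetical): rows are haplotypes, columns are variant sites.
real = np.array([
    [0, 0, 0, 0, 0, 0],
    [1, 1, 1, 0, 0, 0],
    [0, 0, 0, 1, 1, 1],
])
synthetic = np.array([
    [1, 1, 1, 0, 0, 0],  # exact copy of a real row: a potential privacy leak
    [0, 1, 0, 1, 0, 1],  # a novel combination
])

syn_to_real = nearest_real_distance(synthetic, real)
real_to_real = nearest_other_real_distance(real)

# A synthetic haplotype at distance 0 reproduces a real sample exactly.
copies = np.flatnonzero(syn_to_real == 0)
print("synthetic -> nearest real:", syn_to_real)
print("real -> nearest other real:", real_to_real)
print("exact copies at synthetic rows:", copies)
```

In a check like this, artificial haplotypes that sit much closer to a real sample than real samples sit to one another would be the ones to inspect first. A single distance measure is only a starting point, which is consistent with the researchers' remark that several statistical measures had to be combined.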

All in all, machine learning approaches had already provided faces, biographies and many other features to a handful of imaginary humans; now we know more about their biology. These imaginary humans with realistic genomes could serve as proxies for the real genomes that are not publicly available or that require long application procedures or collaborations, removing an important accessibility barrier in genomic research, in particular for underrepresented populations.

More information: Burak Yelmen et al, Creating artificial human genomes using generative neural networks, PLOS Genetics (2021). DOI: 10.1371/journal.pgen.1009303

Journal information: PLOS Genetics

Provided by Estonian Research Council

Citation: Machine learning generates realistic genomes for imaginary humans (2021, February 5) retrieved 20 April 2024 from https://phys.org/news/2021-02-machine-realistic-genomes-imaginary-humans.html

