Mapping of cells and proteins improved with combined help of gamers and AI
Building on a map that shows hundreds of thousands of microscopic images of human cells, an international research team is working with the gaming community and with artificial intelligence to gain a more granular understanding of patterns of proteins arranged within cells.
The advances were reported by a collaboration between KTH Royal Institute of Technology, CCP Games and Massively Multiplayer Online Science.
The study is published in the September issue of Nature Biotechnology. The researchers report that gamers, or "citizen scientists," boosted the AI system used for predicting protein localization on a subcellular level. The combination of crowdsourcing and AI led to improved classification of subcellular protein patterns and the first-time identification of 10 new members of the family of cellular structures known as "rods and rings," says Emma Lundberg, a researcher from KTH who leads the Cell Atlas, part of the Human Protein Atlas, at the Science for Life joint research center.
Lundberg says the data is being actively integrated into the publicly available Human Protein Atlas database and will be a resource for researchers worldwide who are working toward a greater understanding of human cells, proteins and disease development.
The researchers partnered with Massively Multiplayer Online Science and CCP Games to integrate analysis of protein localization from the Human Protein Atlas images directly into EVE Online, a popular massively multiplayer online game. The resulting mini-game, called "Project Discovery," featured Lundberg's avatar, making her one of the first living scientists to be featured in an online game. It was played by more than 300,000 citizen scientists within EVE Online, who together generated more than 33 million image classifications of protein subcellular localization, an achievement hailed as a milestone in citizen science.
The citizen scientists' performance was compared with that of the Localization Cellular Annotation Tool (Loc-CAT), an AI system for predicting protein subcellular localization from images. Loc-CAT is the first generalized tool for annotating proteins with multiple localizations from images, and it generalizes across a large number of cell types, making it useful for future studies of cells and their behavior.
Although Loc-CAT outperformed Project Discovery players on many of the common protein classes, aggregated player data from EVE Online identified rare classes more accurately and could annotate new patterns for which no training data was available. The researchers then used transfer learning to combine the players' annotations with the machine learning approach, which significantly boosted Loc-CAT's performance.
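To make the idea of "aggregated player data" concrete, here is a minimal Python sketch of one simple way to combine many players' classifications of the same image into a consensus annotation. It is an illustration only: the article does not describe the study's actual aggregation method, and the function name, the 50 percent agreement threshold, the image and player IDs, and the label names are all assumptions made for this example.

```python
from collections import Counter, defaultdict


def aggregate_votes(classifications, min_fraction=0.5):
    """Combine per-player labels into a consensus label list per image.

    classifications: iterable of (image_id, player_id, labels) tuples, where
    labels is a set of localization labels chosen by that player.
    min_fraction: hypothetical threshold; a label is kept when at least this
    share of the players who saw the image selected it.
    """
    votes = defaultdict(Counter)   # image_id -> label -> number of players
    n_players = Counter()          # image_id -> number of classifications

    for image_id, _player_id, labels in classifications:
        n_players[image_id] += 1
        votes[image_id].update(set(labels))

    consensus = {}
    for image_id, counts in votes.items():
        total = n_players[image_id]
        # Keep every label that enough players agreed on (multi-label output).
        consensus[image_id] = sorted(
            label for label, n in counts.items() if n / total >= min_fraction
        )
    return consensus


# Made-up example data, not taken from the study.
example = [
    ("img_001", "p1", {"nucleus"}),
    ("img_001", "p2", {"nucleus", "rods and rings"}),
    ("img_001", "p3", {"rods and rings"}),
    ("img_002", "p1", {"cytosol"}),
    ("img_002", "p2", {"mitochondria"}),
    ("img_002", "p3", {"cytosol"}),
]

consensus = aggregate_votes(example)
print(consensus)
```

Because each image can keep more than one label, a scheme like this can represent proteins found in multiple locations. Consensus labels of this kind could in principle serve as additional training data for a classifier, which is the general spirit of the combination described above.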
"I believe that the integration of scientific tasks into established computer games will be a commonly used approach in the future to harness the brain processing power of humans, and that intricate designs of citizen science games feeding directly into machine learning models has the power to rapidly leverage the output of large-scale science efforts," Lundberg says. "We are grateful to all the citizen scientists who participated in this project, and for the discoveries they made."
Despite the success of this work, there is still considerable room for improvement. The researchers announced the Human Protein Atlas 2018 Challenge on Kaggle (www.kaggle.com/competitions), which starts September 17. The challenge will involve image analysis to classify subcellular protein patterns in human cells. Thousands of dollars in prizes will be up for grabs, and participants' contributions will help drive the field of protein biology forward.