June 22, 2017

Researchers review the state-of-the-art text mining technologies for chemistry

by Centro Nacional de Investigaciones Oncológicas (CNIO)

In a recent Chemical Reviews article, Spanish researchers have published the first exhaustive review of the state-of-the-art methodologies underlying chemical search engines, named entity recognition and text mining systems.

The rapidly growing field of big data applications in biomedical research, together with the use of machine learning and artificial intelligence technologies for text data mining, has resulted in promising tools. The authors write, "This review is organised to serve as a practical guide to researchers entering in this field but also to help them to envision the next steps in this emerging data science field."

"Through the release of Gold Standard datasets and the organisation of several community challenge benchmark events, the Biological Text Mining Unit has played a critical role in the development and evaluation of current chemical text mining systems, as highlighted in this article," explains Martin Krallinger, head of the unit and co-first author of the review.

A huge amount of unstructured data

A considerable fraction of biomedically relevant data is available only in unstructured form. This includes the rapidly growing scientific literature, medicinal chemistry patents, electronic health records and clinical trial documents. In fact, every year, over 20,000 new compounds are published in medicinal and biological chemistry journals.

Transforming unstructured biomedical research data into structured databases that can be processed more efficiently by machines or queried by humans is critical for a range of heterogeneous applications. These include the identification of new drug targets and of chemical probes to validate or discard those potential targets, the repurposing of approved drugs, the identification of adverse drug events, and the retrieval of systems biology information associated with chemical-disease or chemical-gene networks.

As the basis of many therapeutic strategies, chemical compounds are an entity type of critical relevance for biomedical research. "The construction of large chemical knowledge bases, integrating chemical information with biological and clinical data, is crucial to identify and validate new therapeutic targets for unmet medical needs as well as to speed up the drug discovery process," says Julen Oyarzabal, director of Translational Sciences at CIMA and co-leader of this report.

More information: Martin Krallinger et al, Information Retrieval and Text Mining Technologies for Chemistry, Chemical Reviews (2017). DOI: 10.1021/acs.chemrev.6b00851

Journal information: Chemical Reviews

Provided by Centro Nacional de Investigaciones Oncológicas (CNIO)

Citation: Researchers review the state-of-the-art text mining technologies for chemistry (2017, June 22) retrieved 1 July 2024 from https://phys.org/news/2017-06-state-of-the-art-text-technologies-chemistry.html
