New study reveals improved way to interpret high-throughput biological data
This study has developed a unique bioinformatics approach for identifying associations between molecules from a range of vast data sources. The approach can be applied to studies that aim to measure metabolism in tissues under varying conditions, e.g. genetics, diets and environment.
Unlike current methods, which apply statistical analysis to data sets as a whole, the proposed workflow breaks the initial data into smaller groups determined by known molecular interactions. Statistical methods can then be applied to these groups, giving more accurate results than if the analysis had been applied to the whole dataset.
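The group-then-test idea can be illustrated with a minimal sketch. It is not the study's actual implementation. It assumes Pearson correlation as the statistical method and a fixed cut-off of 0.9, neither of which is specified here, and all names (groupwise_associations, the toy gene and metabolite identifiers, the group labels) are hypothetical.

```python
from itertools import product
from math import sqrt


def pearson(x, y):
    """Pearson correlation of two equal-length numeric sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = sqrt(sum((a - mx) ** 2 for a in x))
    sy = sqrt(sum((b - my) ** 2 for b in y))
    # A constant profile has no defined correlation; report 0.0 instead.
    return cov / (sx * sy) if sx and sy else 0.0


def groupwise_associations(groups, genes, metabolites, threshold=0.9):
    """Test gene-metabolite pairs only within each predefined group.

    groups      -- group name -> member ids (assumed to come from
                   known molecular interactions)
    genes       -- gene id -> expression values across samples
    metabolites -- metabolite id -> abundance values across samples
    threshold   -- illustrative cut-off on |r| (an assumption)
    """
    hits = []
    for name, members in groups.items():
        group_genes = [m for m in members if m in genes]
        group_mets = [m for m in members if m in metabolites]
        for gene, met in product(group_genes, group_mets):
            r = pearson(genes[gene], metabolites[met])
            if abs(r) >= threshold:
                hits.append((name, gene, met, round(r, 3)))
    return hits


# Toy data: four samples, two hypothetical interaction-derived groups.
groups = {
    "lipid_group": ["geneA", "geneB", "met1"],
    "other_group": ["geneC", "met2"],
}
genes = {
    "geneA": [1.0, 2.0, 3.0, 4.0],
    "geneB": [2.0, 1.0, 2.0, 1.0],
    "geneC": [4.0, 3.0, 2.0, 1.0],
}
metabolites = {
    "met1": [2.0, 4.0, 6.0, 8.0],
    "met2": [1.0, 2.0, 3.0, 4.0],
}

hits = groupwise_associations(groups, genes, metabolites)
for hit in hits:
    print(hit)
```

In this sketch, gene-metabolite pairs that do not share a group are never tested, so far fewer comparisons are made than in a whole-dataset analysis.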
In an example mouse nutritional study, this technique has been shown to improve the detection of genes related to lipid metabolism by 15 per cent, increasing our understanding of biochemical fluctuations.
Identifying associations between metabolites, small molecules produced during metabolism, and genes is crucial to understanding processes in the cell. However, uncovering these relationships is a complex task, especially when integrating data that concern various types of molecules. Adding to this complexity is the vast quantity of data available for analysis, a result of the development of new experimental high-throughput techniques.
Initially, the molecular workflow will be applied to research into the benefits of broccoli for prostate cancer, in collaboration with the Institute of Food Research. It will also be applied to studying the health benefits of flavonoids, plant metabolites found in a variety of fruits and vegetables, in collaboration with the University of East Anglia.
By improving our capability to integrate data from various sources and identify links between metabolites and genes, this workflow will provide a more detailed diagnosis of cellular metabolism and gene expression in biological processes.
Co-author, Wiktor Jurkowski, Integrative Genomics Group Leader at TGAC, said: "Knowledge gathered in molecular networks can be harnessed to improve data integration and interpretation.
"Our approach, integrating transcriptomics and metabolomics data, will help interpret signals measured by omics techniques to extend our knowledge of processes under specific biological conditions. It will therefore benefit biologists in interpreting data, creating better hypotheses and pinpointing the genes and metabolites involved to unravel the mechanism of interest.
"This is a proof-of-concept study and we are currently working towards improving the group generation strategy for sparse areas of the interactome and less annotated species. We are applying this and other molecular network approaches to data generated in collaborative projects across Norwich Research Park."