Spec2Bed: Innovative Metabolite Grouping to Learn Structure and Function from Metabolomics Profiles

Project: PhD

Project Details


Plant specialized metabolites are molecules that regulate important biochemical processes such as growth and protection against stress. In crops, they determine key characteristics such as taste, and can benefit human health after consumption. To discover plant metabolites, mass spectrometry-based metabolomics profiles are routinely generated. However, deriving the correct molecular structures and functions from these profiles is extremely inefficient: it requires extensive manual labour and expertise, and is often simply impossible at present. Consequently, typically ~95% of specialized metabolites remain unknown. Here, I propose an innovative machine learning approach to overcome these limitations. Existing computational metabolite annotation workflows focus on structural annotation of single molecules. However, many unknown molecules are not identical to any known molecule. Nevertheless, in many instances these unknowns partly resemble known structures, since they share metabolic pathways or are involved in similar biological processes, which provides an inroad to their annotation. This structural and functional relatedness becomes visible in metabolomics profiles as similarity in spectral “fingerprints”. To target the 95% of chemical dark matter, I will combine approaches from two previously unrelated fields, natural language processing and metabolomics, to group metabolites based on spectral fingerprint similarity. The PhD student will lay the foundations for an annotation platform that is more comprehensive and scalable than those currently available, enabling drastically improved classification of large numbers of as-yet-unknown metabolites that partially resemble known structures. The PhD student will also create an effective visualization of multi-sample comparisons to highlight differential chemistry, as well as potentially novel chemistry representing new biochemical scaffolds.
Furthermore, my approach will enable the prediction of the origins and functions of specialized metabolites. I expect this to increase the share of metabolites in metabolomics profiles that carry structural and functional annotations from 5% to >50%. To validate my approach, the PhD student will investigate the grouping of diverse plant chemistry as well as food-derived and drug-derived metabolites in human biofluids, and will link food-derived metabolites in human plasma, urine, and faeces to the intake of specific plant-based foods. Ultimately, my approach will provide the research community with a much-needed metabolite annotation tool, for example as a foundation for new chemistry-guided plant breeding strategies to produce healthy and nutritious crops.
Effective start/end date: 15/09/21 → …

