Multi-omics approaches for biosynthetic pathway prediction in plants

Hernando Suarez

Research output: Thesis, internal PhD, WU


The field of plant natural product (NP) discovery has changed substantially since the isolation of morphine from opium poppy in 1806. The largest of these changes came in the last decade with the integration of computational genomics into the field, which resulted in the development of a plethora of computational methods that aid in the discovery of new plant NPs and the biosynthetic pathways, metabolites and enzymes associated with them. As more computational methods continue to be developed, and more plant genomes are sequenced, the NP discovery field is ripe with obstacles and opportunities uniquely suited to multi-omics solutions.

In chapter 1, we explore the history of the plant NP discovery field and highlight some of the strategies used throughout. We also discuss the state of the field and some of the obstacles and opportunities that computational genomics introduced, mainly regarding the identification of plant biosynthetic gene clusters (BGCs), analysis of coexpression networks and multi-omics integration.

In chapter 2, we present a computational tool for the automated identification, annotation and expression analysis of plant BGCs: plantiSMASH. Here, we show how BGC identification can guide plant NP discovery by mining all publicly available chromosome-level plant genome assemblies and recovering all BGCs experimentally characterized at the time, along with a wide range of putative novel ones. Furthermore, we develop a coexpression analysis module, which facilitates the integration of transcriptomic analysis with any predicted BGC.

In chapter 3, we leverage BGCs and BGC-like genomic structures to study the evolution of triterpene biosynthetic pathways in plants. Here, we query 13 Brassicaceae plant genomes to identify all oxidosqualene cyclases (OSCs) and the genomic regions flanking them. We use these regions to perform a series of phylogenetic analyses that compare the evolution of the biosynthetic genes with that of the associated Brassicaceae species, and uncover the most likely evolutionary events that led to the assembly and diversification of Brassicaceae triterpene BGCs.

In chapter 4, we introduce CADE-HEroN: a workflow for comparative analysis of the coexpression networks of multiple species to guide the discovery of plant specialized metabolic (SM) pathways. We use this workflow to study the SM pathways associated with the phosphate starvation response in Arabidopsis thaliana, tomato and rice. This resulted in the identification of many genes of known and unknown function that behave in a conserved manner under phosphate starvation across the three species.

In chapter 5, we describe the development of an integrative multi-omics approach for plant SM pathway prediction: MEANtools. This computational tool integrates data from paired transcriptomic-metabolomic datasets to predict potential metabolic pathways, along with the reactions, metabolites and genes involved in them. In this manner, MEANtools can help scientists generate testable hypotheses about biosynthetic pathways, which we showcase by applying our pipeline to a recently published paired transcriptomic-metabolomic dataset.

We conclude this thesis in chapter 6 by discussing how the plant NP discovery and SM research fields have benefitted from cross-pollination with adjacent fields, and how to take better advantage of this and the many other opportunities discussed throughout the thesis.

Original language: English
Qualification: Doctor of Philosophy
Awarding Institution:
  • Wageningen University
Supervisors:
  • de Ridder, Dick, Promotor
  • Medema, Marnix, Co-promotor
Award date: 8 Feb 2021
Place of Publication: Wageningen
Print ISBN: 9789463956697
Publication status: Published - 8 Feb 2021

