TY - GEN
T1 - Automated Machine Learning to Predict the Precursors of Plant Specialized Metabolites
AU - Capela, João
AU - Cheixo, João
AU - de Ridder, Dick
AU - Rocha, Miguel
AU - Dias, Oscar
PY - 2025/4/24
Y1 - 2025/4/24
AB - Plants produce specialized metabolites for, among other purposes, protection against biotic and abiotic stresses. Due to their diversity and bioactivity, these compounds have profound implications for the world economy, especially for the pharmaceutical and agrotechnology sectors. Despite their importance, their biosynthesis is far from being understood. The automatic prediction of the precursors of these compounds, derived from primary metabolism, is relevant to expediting pathway discovery. Leveraging DeepMol’s automated machine learning engine, we find that regularized linear classifiers provide optimal, accurate, and accountable models for this task. They perform significantly better than state-of-the-art models while chemically explaining their predictions. The pipeline and models are available in the repository https://github.com/jcapels/SMPrecursorPredictor.
KW - biosynthesis
KW - machine learning
KW - plant specialized metabolites
U2 - 10.1007/978-3-031-87873-2_16
DO - 10.1007/978-3-031-87873-2_16
M3 - Conference paper
AN - SCOPUS:105004255483
SN - 9783031878725
T3 - Lecture Notes in Networks and Systems (LNNS)
SP - 153
EP - 162
BT - Practical Applications of Computational Biology and Bioinformatics, 18th International Conference, PACBB 2024
A2 - Cuadrado, Sara
A2 - Fdez-Riverola, Florentino
A2 - Alonso, Ángel Canal
A2 - Rocha, Miguel
A2 - Mohamad, Mohd Saberi
A2 - Gil-González, Ana Belén
PB - Springer
CY - Cham
T2 - 18th International Conference on Practical Applications of Computational Biology and Bioinformatics, PACBB 2024
Y2 - 26 June 2024 through 28 June 2024
ER -