A computational framework to explore large-scale biosynthetic diversity

Jorge C. Navarro-Muñoz, Nelly Selem-Mojica, Michael W. Mullowney, Satria A. Kautsar, James H. Tryon, Elizabeth I. Parkinson, Emmanuel L.C. De Los Santos, Marley Yeong, Pablo Cruz-Morales, Sahar Abubucker, Arne Roeters, Wouter Lokhorst, Antonio Fernandez-Guerra, Luciana Teresa Dias Cappelini, Anthony W. Goering, Regan J. Thomson, William W. Metcalf, Neil L. Kelleher, Francisco Barona-Gomez, Marnix H. Medema*

*Corresponding author for this work

Research output: Contribution to journal › Article › Academic › peer-review

1 Citation (Scopus)

Abstract

Genome mining has become a key technology to exploit natural product diversity. Although initially performed on a single-genome basis, the process is now being scaled up to mine entire genera, strain collections and microbiomes. However, no bioinformatic framework is currently available for effectively analyzing datasets of this size and complexity. In the present study, a streamlined computational workflow is provided, consisting of two new software tools: the ‘biosynthetic gene similarity clustering and prospecting engine’ (BiG-SCAPE), which facilitates fast and interactive sequence similarity network analysis of biosynthetic gene clusters and gene cluster families; and the ‘core analysis of syntenic orthologues to prioritize natural product gene clusters’ (CORASON), which elucidates phylogenetic relationships within and across these families. BiG-SCAPE is validated by correlating its output to metabolomic data across 363 actinobacterial strains and the discovery potential of CORASON is demonstrated by comprehensively mapping biosynthetic diversity across a range of detoxin/rimosamide-related gene cluster families, culminating in the characterization of seven detoxin analogues.

Original language: English
Pages (from-to): 60-68
Journal: Nature Chemical Biology
Volume: 16
DOIs: https://doi.org/10.1038/s41589-019-0400-9
Publication status: Published - 25 Nov 2019

Fingerprint

Multigene Family
Biological Products
Cluster Analysis
Genome
Metabolomics
Workflow
Microbiota
Computational Biology
Genes
Software
Technology

Cite this

Navarro-Muñoz, Jorge C. ; Selem-Mojica, Nelly ; Mullowney, Michael W. ; Kautsar, Satria A. ; Tryon, James H. ; Parkinson, Elizabeth I. ; De Los Santos, Emmanuel L.C. ; Yeong, Marley ; Cruz-Morales, Pablo ; Abubucker, Sahar ; Roeters, Arne ; Lokhorst, Wouter ; Fernandez-Guerra, Antonio ; Cappelini, Luciana Teresa Dias ; Goering, Anthony W. ; Thomson, Regan J. ; Metcalf, William W. ; Kelleher, Neil L. ; Barona-Gomez, Francisco ; Medema, Marnix H. / A computational framework to explore large-scale biosynthetic diversity. In: Nature Chemical Biology. 2019 ; Vol. 16. pp. 60-68.
@article{cc1dbaf16b3942008c676ff504f6d696,
title = "A computational framework to explore large-scale biosynthetic diversity",
abstract = "Genome mining has become a key technology to exploit natural product diversity. Although initially performed on a single-genome basis, the process is now being scaled up to mine entire genera, strain collections and microbiomes. However, no bioinformatic framework is currently available for effectively analyzing datasets of this size and complexity. In the present study, a streamlined computational workflow is provided, consisting of two new software tools: the ‘biosynthetic gene similarity clustering and prospecting engine’ (BiG-SCAPE), which facilitates fast and interactive sequence similarity network analysis of biosynthetic gene clusters and gene cluster families; and the ‘core analysis of syntenic orthologues to prioritize natural product gene clusters’ (CORASON), which elucidates phylogenetic relationships within and across these families. BiG-SCAPE is validated by correlating its output to metabolomic data across 363 actinobacterial strains and the discovery potential of CORASON is demonstrated by comprehensively mapping biosynthetic diversity across a range of detoxin/rimosamide-related gene cluster families, culminating in the characterization of seven detoxin analogues.",
author = "Navarro-Mu{\~n}oz, {Jorge C.} and Nelly Selem-Mojica and Mullowney, {Michael W.} and Kautsar, {Satria A.} and Tryon, {James H.} and Parkinson, {Elizabeth I.} and {De Los Santos}, {Emmanuel L.C.} and Marley Yeong and Pablo Cruz-Morales and Sahar Abubucker and Arne Roeters and Wouter Lokhorst and Antonio Fernandez-Guerra and Cappelini, {Luciana Teresa Dias} and Goering, {Anthony W.} and Thomson, {Regan J.} and Metcalf, {William W.} and Kelleher, {Neil L.} and Francisco Barona-Gomez and Medema, {Marnix H.}",
year = "2019",
month = "11",
day = "25",
doi = "10.1038/s41589-019-0400-9",
language = "English",
volume = "16",
pages = "60--68",
journal = "Nature Chemical Biology",
issn = "1552-4450",
publisher = "Nature Publishing Group",
}

Navarro-Muñoz, JC, Selem-Mojica, N, Mullowney, MW, Kautsar, SA, Tryon, JH, Parkinson, EI, De Los Santos, ELC, Yeong, M, Cruz-Morales, P, Abubucker, S, Roeters, A, Lokhorst, W, Fernandez-Guerra, A, Cappelini, LTD, Goering, AW, Thomson, RJ, Metcalf, WW, Kelleher, NL, Barona-Gomez, F & Medema, MH 2019, 'A computational framework to explore large-scale biosynthetic diversity', Nature Chemical Biology, vol. 16, pp. 60-68. https://doi.org/10.1038/s41589-019-0400-9

A computational framework to explore large-scale biosynthetic diversity. / Navarro-Muñoz, Jorge C.; Selem-Mojica, Nelly; Mullowney, Michael W.; Kautsar, Satria A.; Tryon, James H.; Parkinson, Elizabeth I.; De Los Santos, Emmanuel L.C.; Yeong, Marley; Cruz-Morales, Pablo; Abubucker, Sahar; Roeters, Arne; Lokhorst, Wouter; Fernandez-Guerra, Antonio; Cappelini, Luciana Teresa Dias; Goering, Anthony W.; Thomson, Regan J.; Metcalf, William W.; Kelleher, Neil L.; Barona-Gomez, Francisco; Medema, Marnix H.

In: Nature Chemical Biology, Vol. 16, 25.11.2019, p. 60-68.


TY  - JOUR
T1  - A computational framework to explore large-scale biosynthetic diversity
AU  - Navarro-Muñoz, Jorge C.
AU  - Selem-Mojica, Nelly
AU  - Mullowney, Michael W.
AU  - Kautsar, Satria A.
AU  - Tryon, James H.
AU  - Parkinson, Elizabeth I.
AU  - De Los Santos, Emmanuel L.C.
AU  - Yeong, Marley
AU  - Cruz-Morales, Pablo
AU  - Abubucker, Sahar
AU  - Roeters, Arne
AU  - Lokhorst, Wouter
AU  - Fernandez-Guerra, Antonio
AU  - Cappelini, Luciana Teresa Dias
AU  - Goering, Anthony W.
AU  - Thomson, Regan J.
AU  - Metcalf, William W.
AU  - Kelleher, Neil L.
AU  - Barona-Gomez, Francisco
AU  - Medema, Marnix H.
PY  - 2019/11/25
Y1  - 2019/11/25
N2  - Genome mining has become a key technology to exploit natural product diversity. Although initially performed on a single-genome basis, the process is now being scaled up to mine entire genera, strain collections and microbiomes. However, no bioinformatic framework is currently available for effectively analyzing datasets of this size and complexity. In the present study, a streamlined computational workflow is provided, consisting of two new software tools: the ‘biosynthetic gene similarity clustering and prospecting engine’ (BiG-SCAPE), which facilitates fast and interactive sequence similarity network analysis of biosynthetic gene clusters and gene cluster families; and the ‘core analysis of syntenic orthologues to prioritize natural product gene clusters’ (CORASON), which elucidates phylogenetic relationships within and across these families. BiG-SCAPE is validated by correlating its output to metabolomic data across 363 actinobacterial strains and the discovery potential of CORASON is demonstrated by comprehensively mapping biosynthetic diversity across a range of detoxin/rimosamide-related gene cluster families, culminating in the characterization of seven detoxin analogues.
AB  - Genome mining has become a key technology to exploit natural product diversity. Although initially performed on a single-genome basis, the process is now being scaled up to mine entire genera, strain collections and microbiomes. However, no bioinformatic framework is currently available for effectively analyzing datasets of this size and complexity. In the present study, a streamlined computational workflow is provided, consisting of two new software tools: the ‘biosynthetic gene similarity clustering and prospecting engine’ (BiG-SCAPE), which facilitates fast and interactive sequence similarity network analysis of biosynthetic gene clusters and gene cluster families; and the ‘core analysis of syntenic orthologues to prioritize natural product gene clusters’ (CORASON), which elucidates phylogenetic relationships within and across these families. BiG-SCAPE is validated by correlating its output to metabolomic data across 363 actinobacterial strains and the discovery potential of CORASON is demonstrated by comprehensively mapping biosynthetic diversity across a range of detoxin/rimosamide-related gene cluster families, culminating in the characterization of seven detoxin analogues.
U2  - 10.1038/s41589-019-0400-9
DO  - 10.1038/s41589-019-0400-9
M3  - Article
VL  - 16
SP  - 60
EP  - 68
JO  - Nature Chemical Biology
JF  - Nature Chemical Biology
SN  - 1552-4450
ER  - 

Navarro-Muñoz JC, Selem-Mojica N, Mullowney MW, Kautsar SA, Tryon JH, Parkinson EI et al. A computational framework to explore large-scale biosynthetic diversity. Nature Chemical Biology. 2019 Nov 25;16:60-68. https://doi.org/10.1038/s41589-019-0400-9