MIBiG 2.0: a repository for biosynthetic gene clusters of known function

S.A. Kautsar, Kai Blin, Simon Shaw, J.C. Navarro Munoz, Barbara Terlouw, J.J.J. van der Hooft, Jeffrey A. Van Santen, V. Tracanna, Hernando Suarez Duran, V. Pascal Andreu, Nelly Selem Mojica, Mohammad Alanjary, Serina Robinson, George Lund, Samuel C. Epstein, Ashley C. Sisto, Louise K. Charkoudian, Jérôme Collemare, Roger G. Linington, Tilmann Weber & M.H. Medema

Research output: Contribution to journal > Article > Academic > peer-review

2 Citations (Scopus)

Abstract

Fueled by the explosion of (meta)genomic data, genome mining of specialized metabolites has become a major technology for drug discovery and studying microbiome ecology. In these efforts, computational tools like antiSMASH have played a central role through the analysis of Biosynthetic Gene Clusters (BGCs). Thousands of candidate BGCs from microbial genomes have been identified and stored in public databases. Interpreting the function and novelty of these predicted BGCs requires comparison with a well-documented set of BGCs of known function. The MIBiG (Minimum Information about a Biosynthetic Gene Cluster) Data Standard and Repository was established in 2015 to enable curation and storage of known BGCs. Here, we present MIBiG 2.0, which encompasses major updates to the schema, the data, and the online repository itself. Over the past five years, 851 new BGCs have been added. Additionally, we performed extensive manual data curation of all entries to improve the annotation quality of our repository. We also redesigned the data schema to ensure the compliance of future annotations. Finally, we improved the user experience by adding new features such as query searches and a statistics page, and enabled direct link-outs to chemical structure databases. The repository is accessible online at https://mibig.secondarymetabolites.org/.
Original language: English
Pages (from-to): D454-D458
Number of pages: 4
Journal: Nucleic acids research
Volume: 48
Issue number: D1
Early online date: 15 Oct 2019
DOIs: https://doi.org/10.1093/nar/gkz882
Publication status: Published - 8 Jan 2020

Fingerprint

Multigene Family
Microbial Genome
Chemical Databases
Data Mining
Explosions
Microbiota
Drug Discovery
Ecology
Compliance
Genome
Databases
Technology

Cite this

Kautsar, S.A. ; Blin, Kai ; Shaw, Simon ; Navarro Munoz, J.C. ; Terlouw, Barbara ; van der Hooft, J.J.J. ; Van Santen, Jeffrey A. ; Tracanna, V. ; Suarez Duran, Hernando ; Pascal Andreu, V. ; Selem Mojica, Nelly ; Alanjary, Mohammad ; Robinson, Serina ; Lund, George ; Epstein, Samuel C. ; Sisto, Ashley C. ; Charkoudian, Louise K. ; Collemare, Jérôme ; Linington, Roger G. ; Weber, Tilmann ; Medema, M.H. / MIBiG 2.0: a repository for biosynthetic gene clusters of known function. In: Nucleic acids research. 2020 ; Vol. 48, No. D1. pp. D454-D458.
@article{f81aad7a8645452a878cecf4de2e37f1,
title = "MIBiG 2.0: a repository for biosynthetic gene clusters of known function",
abstract = "Fueled by the explosion of (meta)genomic data, genome mining of specialized metabolites has become a major technology for drug discovery and studying microbiome ecology. In these efforts, computational tools like antiSMASH have played a central role through the analysis of Biosynthetic Gene Clusters (BGCs). Thousands of candidate BGCs from microbial genomes have been identified and stored in public databases. Interpreting the function and novelty of these predicted BGCs requires comparison with a well-documented set of BGCs of known function. The MIBiG (Minimum Information about a Biosynthetic Gene Cluster) Data Standard and Repository was established in 2015 to enable curation and storage of known BGCs. Here, we present MIBiG 2.0, which encompasses major updates to the schema, the data, and the online repository itself. Over the past five years, 851 new BGCs have been added. Additionally, we performed extensive manual data curation of all entries to improve the annotation quality of our repository. We also redesigned the data schema to ensure the compliance of future annotations. Finally, we improved the user experience by adding new features such as query searches and a statistics page, and enabled direct link-outs to chemical structure databases. The repository is accessible online at https://mibig.secondarymetabolites.org/.",
author = "S.A. Kautsar and Kai Blin and Simon Shaw and {Navarro Munoz}, J.C. and Barbara Terlouw and {van der Hooft}, J.J.J. and {Van Santen}, {Jeffrey A.} and V. Tracanna and {Suarez Duran}, Hernando and {Pascal Andreu}, V. and {Selem Mojica}, Nelly and Mohammad Alanjary and Serina Robinson and George Lund and Epstein, {Samuel C.} and Sisto, {Ashley C.} and Charkoudian, {Louise K.} and J{\'e}r{\^o}me Collemare and Linington, {Roger G.} and Tilmann Weber and M.H. Medema",
year = "2020",
month = "1",
day = "8",
doi = "10.1093/nar/gkz882",
language = "English",
volume = "48",
pages = "D454--D458",
journal = "Nucleic acids research",
issn = "0305-1048",
publisher = "Oxford University Press",
number = "D1"
}

Kautsar, SA, Blin, K, Shaw, S, Navarro Munoz, JC, Terlouw, B, van der Hooft, JJJ, Van Santen, JA, Tracanna, V, Suarez Duran, H, Pascal Andreu, V, Selem Mojica, N, Alanjary, M, Robinson, S, Lund, G, Epstein, SC, Sisto, AC, Charkoudian, LK, Collemare, J, Linington, RG, Weber, T & Medema, MH 2020, 'MIBiG 2.0: a repository for biosynthetic gene clusters of known function', Nucleic acids research, vol. 48, no. D1, pp. D454-D458. https://doi.org/10.1093/nar/gkz882

MIBiG 2.0: a repository for biosynthetic gene clusters of known function. / Kautsar, S.A.; Blin, Kai; Shaw, Simon; Navarro Munoz, J.C.; Terlouw, Barbara; van der Hooft, J.J.J.; Van Santen, Jeffrey A.; Tracanna, V.; Suarez Duran, Hernando; Pascal Andreu, V.; Selem Mojica, Nelly; Alanjary, Mohammad; Robinson, Serina; Lund, George; Epstein, Samuel C.; Sisto, Ashley C.; Charkoudian, Louise K.; Collemare, Jérôme; Linington, Roger G.; Weber, Tilmann; Medema, M.H.

In: Nucleic acids research, Vol. 48, No. D1, 08.01.2020, p. D454-D458.


TY  - JOUR
T1  - MIBiG 2.0: a repository for biosynthetic gene clusters of known function
AU  - Kautsar, S.A.
AU  - Blin, Kai
AU  - Shaw, Simon
AU  - Navarro Munoz, J.C.
AU  - Terlouw, Barbara
AU  - van der Hooft, J.J.J.
AU  - Van Santen, Jeffrey A.
AU  - Tracanna, V.
AU  - Suarez Duran, Hernando
AU  - Pascal Andreu, V.
AU  - Selem Mojica, Nelly
AU  - Alanjary, Mohammad
AU  - Robinson, Serina
AU  - Lund, George
AU  - Epstein, Samuel C.
AU  - Sisto, Ashley C.
AU  - Charkoudian, Louise K.
AU  - Collemare, Jérôme
AU  - Linington, Roger G.
AU  - Weber, Tilmann
AU  - Medema, M.H.
PY  - 2020/1/8
Y1  - 2020/1/8
AB - Fueled by the explosion of (meta)genomic data, genome mining of specialized metabolites has become a major technology for drug discovery and studying microbiome ecology. In these efforts, computational tools like antiSMASH have played a central role through the analysis of Biosynthetic Gene Clusters (BGCs). Thousands of candidate BGCs from microbial genomes have been identified and stored in public databases. Interpreting the function and novelty of these predicted BGCs requires comparison with a well-documented set of BGCs of known function. The MIBiG (Minimum Information about a Biosynthetic Gene Cluster) Data Standard and Repository was established in 2015 to enable curation and storage of known BGCs. Here, we present MIBiG 2.0, which encompasses major updates to the schema, the data, and the online repository itself. Over the past five years, 851 new BGCs have been added. Additionally, we performed extensive manual data curation of all entries to improve the annotation quality of our repository. We also redesigned the data schema to ensure the compliance of future annotations. Finally, we improved the user experience by adding new features such as query searches and a statistics page, and enabled direct link-outs to chemical structure databases. The repository is accessible online at https://mibig.secondarymetabolites.org/.

U2  - 10.1093/nar/gkz882
DO  - 10.1093/nar/gkz882
M3  - Article
VL  - 48
SP  - D454
EP  - D458
JO  - Nucleic acids research
JF  - Nucleic acids research
SN  - 0305-1048
IS  - D1
ER  -