
Strategies for community-sourced biocuration in bioinformatics: a case study on MIBiG 4.0

Research output: Contribution to journal › Article › Academic › peer-review

Abstract

Biocuration is essential to transform molecular sequence data into standardized, machine-readable resources. Such curated datasets enable comparative analysis, predictive modeling, and data integration across bioinformatics platforms. While professional biocuration is resource-intensive and usually limited to institutional settings, community-driven approaches can mobilize large-scale annotation of specialized datasets and are more resilient to disruptions in scientific funding. Here, we present a model for community-powered curation applied to the Minimum Information about a Biosynthetic Gene Cluster (MIBiG) repository. Through a framework of workflows for metadata capture, annotation validation, and contributor coordination, the MIBiG 4.0 initiative recruited 267 scientists across 178 institutions from 33 countries, volunteering an estimated 4000 h of work. These efforts expanded the MIBiG repository by 22% and enhanced its usability for downstream molecular data analyses, including comparative genomics, natural product discovery, and machine learning applications. We provide strategies and actionable lessons for adopting this model, supporting the sustainability of curated bioinformatics resources central to nucleic acid research and related fields.

Original language: English
Journal: Briefings in Bioinformatics
Volume: 26
Issue number: 6
Publication status: Published - 1 Nov 2025

Keywords

  • biocuration
  • data standards
  • genome mining
  • open science
  • secondary metabolites
