Building a community-based open harmonised reference data repository for global crop mapping

Hendrik Boogaard*, Arun Kumar Pratihast, Juan Carlos Laso Bayas, Santosh Karanam, Steffen Fritz, Kristof Van Tricht, Jeroen Degerickx, Sven Gilliams

*Corresponding author for this work

Research output: Contribution to journal › Article › Academic › peer-review

2 Citations (Scopus)


Reference data are key to producing reliable crop type and cropland maps. Although research projects, national and international programs, and local initiatives constantly gather crop-related reference data, finding, collecting, and harmonising data from different sources is challenging. Furthermore, ethical, legal, and consent-related restrictions associated with data sharing are a common dilemma for international research projects. We address these dilemmas by building a community-based, open, harmonised reference data repository at global extent, ready for model training or product validation. Our repository contains data from different sources such as the Group on Earth Observations Global Agricultural Monitoring Initiative (GEOGLAM) Joint Experiment for Crop Assessment and Monitoring (JECAM) sites, the Radiant MLHub, the Future Harvest (CGIAR) centers, the National Aeronautics and Space Administration Food Security and Agriculture Program (NASA Harvest), the International Institute for Applied Systems Analysis (IIASA) citizen science platforms (LACO-Wiki and Geo-Wiki), as well as from individual project contributions. Data from 2016 onwards were collected, harmonised, and annotated. The spatial, temporal, and thematic quality of the data sets was assessed by applying rules developed in this research. Currently, the repository holds around 75 million harmonised observations with standardized metadata, a large share of which is available to the public. The repository, funded by ESA through the WorldCereal project, can be used either to calibrate deep learning algorithms for image classification or to validate products generated from Earth Observation data, such as global cropland extent and maize and wheat maps. We recommend continuing and institutionalizing this reference data initiative, e.g. through GEOGLAM, and encouraging the community to publish land cover and crop type data following open science and open data principles.

Original language: English
Article number: e0287731
Journal: PLoS ONE
Issue number: 7 July
Publication status: Published - Jul 2023

