Digitising the longest-running terrestrial insect database in the world

G.M. Gerrits, Roel van Klink

Research output: Contribution to conference › Poster › Academic

Abstract

The analysis of historical biodiversity data is crucial to understanding the changes the biosphere has undergone over the past decades and even centuries. Much of this historical data is locked up in field notes, lab books and museum collections. To make the best use of such data, the error rate during the digitisation process should be kept as low as possible, lest an additional source of error (and thus uncertainty) enter the data before analysis.

The Wijster dataset, consisting of ground beetles (Carabidae) collected by Biological Station Wijster (Drenthe, NL), is the longest-running time series of terrestrial invertebrates in the world and has been ongoing since 1959. This huge dataset of close to 1 million specimens has never been fully digitised and is not openly accessible.

With financial support from NLBIF, we have set out to digitise the first part of the dataset (1959-1967), working out best practices and laying down benchmarks for the digitisation of the entire dataset. The goal of our project is to bring together all available (meta)data, create a reference dataset with an extremely low error rate (<0.01%), and explore the possibilities of automating the further digitisation process using self-learning algorithms. Our poster presents our workflow for the near-perfect digitisation of historical data.
Original language: English
Publication status: Published - 3 May 2024
Event: EOSC Empowering Biodiversity Research III Conference (EBR III) - Naturalis Biodiversity Center, Leiden, Netherlands
Duration: 25 Mar 2024 to 26 Mar 2024

Conference/symposium

Conference/symposium: EOSC Empowering Biodiversity Research III Conference (EBR III)
Country/Territory: Netherlands
City: Leiden
Period: 25/03/24 to 26/03/24
