Using a data lake in animal sciences

D. Schokker*, I.N. Athanasiadis, B. Visser, R.F. Veerkamp, C. Kamphuis

*Corresponding author for this work

Research output: Chapter in Book/Report/Conference proceeding › Conference paper › Academic › peer-review

Abstract

In the livestock domain, Big Data is becoming more common and is increasingly part of researchers' mindset. With the growing availability of large amounts of data of varying nature, the challenge is how to store, combine, and analyse these data efficiently. In this study, we explored the possibility of using a data lake for storing and analysing sensor data, with an animal experiment as the use case, to improve scalability and interoperability. The use case was an experiment within Breed4Food (a public-private partnership) in which the gait score of 200 turkeys was determined. Traditionally, a gait score is assigned to each animal by a highly skilled person who visually inspects the animal while it walks. In addition, a set of sensor data streams was recorded for each animal, specifically from inertial measurement units (IMUs), a 3D video camera, and a force plate, with the ambition to explore how effective these data streams are as predictors of the gait score. The resulting sensor output, i.e. the raw data, was successfully stored in its original format in the data lake. Subsequently, for each sensor output we performed extract, transform, and load (ETL) activities by executing custom-made scripts that generate tab- or comma-separated files. Lastly, Apache Spark made it possible to process the data in parallel, allowing for fast computing. In conclusion, we managed to set up a data lake, load animal experimental data, and run preliminary analyses. The data lake allowed both data loading and analyses to be scaled up easily, which is desirable for dynamic analysis pipelines, especially when more data are collected in the future.
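The abstract does not show the custom ETL scripts, so the following is only a minimal sketch of what one extract, transform, and load step could look like. The raw `key=value` line format, the field names (`animal`, `t`, `ax`), and the function names are all hypothetical and chosen purely for illustration; they are not taken from the study.

```python
import csv
import io

# Hypothetical raw IMU dump; the real sensor formats are not described in the abstract.
RAW = """# hypothetical IMU dump
animal=T001;t=0;ax=0.12
animal=T001;t=10;ax=0.15
animal=T002;t=0;ax=-0.03
"""

COLUMNS = ["animal_id", "t_ms", "acc_x"]


def extract(raw_text):
    """Extract: parse 'key=value;key=value' lines into dicts, skipping comments and blanks."""
    records = []
    for line in raw_text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue
        records.append(dict(field.split("=", 1) for field in line.split(";")))
    return records


def transform(records):
    """Transform: rename fields and cast strings to numeric types."""
    return [
        {"animal_id": r["animal"], "t_ms": int(r["t"]), "acc_x": float(r["ax"])}
        for r in records
    ]


def load(rows, delimiter=","):
    """Load: serialise rows as delimiter-separated text (comma by default, tab if requested)."""
    buffer = io.StringIO()
    writer = csv.DictWriter(
        buffer, fieldnames=COLUMNS, delimiter=delimiter, lineterminator="\n"
    )
    writer.writeheader()
    writer.writerows(rows)
    return buffer.getvalue()


csv_text = load(transform(extract(RAW)))
tsv_text = load(transform(extract(RAW)), delimiter="\t")
print(csv_text)
```

Writing the output as plain delimiter-separated text keeps it readable by most downstream tools. In PySpark, for instance, such files can be read with `spark.read.csv`, although how the authors configured Spark is not stated in the abstract.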

Original language: English
Title of host publication: Precision Livestock Farming 2019
Subtitle of host publication: Papers Presented at the 9th European Conference on Precision Livestock Farming, ECPLF 2019
Editors: Bernadette O'Brien, Deirdre Hennessy, Laurence Shalloo
Pages: 140-144
Number of pages: 5
ISBN (Electronic): 9781841706542
Publication status: Published - Aug 2019
Event: 9th European Conference on Precision Livestock Farming, ECPLF 2019 - Cork, Ireland
Duration: 26 Aug 2019 to 29 Aug 2019

Publication series

Name: Precision Livestock Farming 2019 - Papers Presented at the 9th European Conference on Precision Livestock Farming, ECPLF 2019

Conference

Conference: 9th European Conference on Precision Livestock Farming, ECPLF 2019
Country/Territory: Ireland
City: Cork
Period: 26/08/19 to 29/08/19

Keywords

  • Animal experiment
  • Data lake
  • Scalability
  • Sensor data
