Prediction of metabolic status of dairy cows in early lactation with on-farm cow data and machine learning algorithms

Wei Xu, Ariette T.M. van Knegsel, Jacques J.M. Vervoort, Rupert M. Bruckmaier, Renny J. van Hoeij, Bas Kemp, Edoardo Saccenti

Research output: Contribution to journal › Article › Academic › peer-review

Abstract

Metabolic status of dairy cows in early lactation can be evaluated using the concentrations of plasma β-hydroxybutyrate (BHB), free fatty acids (FFA), glucose, insulin, and insulin-like growth factor 1 (IGF-1). These plasma metabolites and metabolic hormones, however, are difficult to measure on farm. Instead, easily obtained on-farm cow data, such as milk production traits, have the potential to predict metabolic status. Here we aimed (1) to investigate whether metabolic status of individual cows in early lactation could be clustered based on their plasma values and (2) to evaluate machine learning algorithms to predict metabolic status using on-farm cow data. Through lactation wk 1 to 7, plasma metabolites and metabolic hormones of 334 cows were measured weekly and used to cluster each cow into 1 of 3 clusters per week. The cluster with the greatest plasma BHB and FFA and the lowest plasma glucose, insulin, and IGF-1 was defined as poor metabolic status; the cluster with the lowest plasma BHB and FFA and the greatest plasma glucose, insulin, and IGF-1 was defined as good metabolic status; and the intermediate cluster was defined as average metabolic status. Most dairy cows were classified as having average or good metabolic status, and a limited number of cows had poor metabolic status (10–50 cows per lactation week). On-farm cow data, including dry period length, parity, milk production traits, and body weight, were used to predict good or average metabolic status with 8 machine learning algorithms. Random Forest (error rate ranging from 12.4 to 22.6%) and Support Vector Machine (SVM; error rate ranging from 12.4 to 20.9%) were the top 2 best-performing algorithms to predict metabolic status using on-farm cow data. Random Forest had a higher sensitivity (range: 67.8–82.9% during wk 1 to 7) and negative predictive value (range: 89.5–93.8%) but lower specificity (range: 76.7–88.5%) and positive predictive value (range: 58.1–78.4%) than SVM. 
In Random Forest, milk yield, fat yield, protein percentage, and lactose yield had important roles in prediction, but their rank of importance differed across lactation weeks. In conclusion, dairy cows could be clustered for metabolic status based on plasma metabolites and metabolic hormones. Moreover, on-farm cow data can predict cows in good or average metabolic status, with Random Forest and SVM performing best of all algorithms.
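The performance measures named in the abstract (error rate, sensitivity, specificity, positive and negative predictive value) all derive from a binary confusion matrix. The following is a minimal sketch of those standard definitions, not the authors' code. The abstract does not state which class (good or average metabolic status) was treated as positive, so the caller chooses. The class name `ConfusionCounts`, the function name `classification_metrics`, and the example counts are all hypothetical and were invented for illustration.

```python
from dataclasses import dataclass
from typing import Dict


@dataclass(frozen=True)
class ConfusionCounts:
    """Cell counts of a binary confusion matrix (hypothetical helper type)."""
    tp: int  # positives predicted as positive
    fp: int  # negatives predicted as positive
    tn: int  # negatives predicted as negative
    fn: int  # positives predicted as negative


def classification_metrics(c: ConfusionCounts) -> Dict[str, float]:
    """Return the standard metrics as fractions in [0, 1].

    Assumes every denominator is non-zero, i.e. both classes occur
    and both classes are predicted at least once.
    """
    total = c.tp + c.fp + c.tn + c.fn
    return {
        "error_rate": (c.fp + c.fn) / total,   # share of misclassified cows
        "sensitivity": c.tp / (c.tp + c.fn),   # true positive rate
        "specificity": c.tn / (c.tn + c.fp),   # true negative rate
        "ppv": c.tp / (c.tp + c.fp),           # positive predictive value
        "npv": c.tn / (c.tn + c.fn),           # negative predictive value
    }


# Invented counts for illustration only; these are NOT data from the study.
example = ConfusionCounts(tp=40, fp=10, tn=130, fn=20)
metrics = classification_metrics(example)
```

With the invented counts above, the error rate is 30/200 = 0.15 and the sensitivity is 40/60, roughly 0.667. Reporting all five measures together, as the abstract does, shows trade-offs that the error rate alone hides, for example a classifier with higher sensitivity but lower specificity than another.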

Original language: English
Pages (from-to): 10186-10201
Journal: Journal of Dairy Science
Volume: 102
Issue number: 11
Early online date: 30 Aug 2019
DOI: 10.3168/jds.2018-15791
Publication status: Published - Nov 2019

Fingerprint

artificial intelligence
early lactation
Lactation
dairy cows
cows
farms
prediction
Somatomedins
Nonesterified Fatty Acids
Milk
free fatty acids
Hormones
Insulin
Glucose
metabolites

Keywords

  • cattle
  • cluster analysis
  • energy metabolism
  • Random Forest

Cite this

@article{6973136ef11c4cd5bc1bab88c7c746c3,
title = "Prediction of metabolic status of dairy cows in early lactation with on-farm cow data and machine learning algorithms",
abstract = "Metabolic status of dairy cows in early lactation can be evaluated using the concentrations of plasma β-hydroxybutyrate (BHB), free fatty acids (FFA), glucose, insulin, and insulin-like growth factor 1 (IGF-1). These plasma metabolites and metabolic hormones, however, are difficult to measure on farm. Instead, easily obtained on-farm cow data, such as milk production traits, have the potential to predict metabolic status. Here we aimed (1) to investigate whether metabolic status of individual cows in early lactation could be clustered based on their plasma values and (2) to evaluate machine learning algorithms to predict metabolic status using on-farm cow data. Through lactation wk 1 to 7, plasma metabolites and metabolic hormones of 334 cows were measured weekly and used to cluster each cow into 1 of 3 clusters per week. The cluster with the greatest plasma BHB and FFA and the lowest plasma glucose, insulin, and IGF-1 was defined as poor metabolic status; the cluster with the lowest plasma BHB and FFA and the greatest plasma glucose, insulin, and IGF-1 was defined as good metabolic status; and the intermediate cluster was defined as average metabolic status. Most dairy cows were classified as having average or good metabolic status, and a limited number of cows had poor metabolic status (10–50 cows per lactation week). On-farm cow data, including dry period length, parity, milk production traits, and body weight, were used to predict good or average metabolic status with 8 machine learning algorithms. Random Forest (error rate ranging from 12.4 to 22.6{\%}) and Support Vector Machine (SVM; error rate ranging from 12.4 to 20.9{\%}) were the top 2 best-performing algorithms to predict metabolic status using on-farm cow data. 
Random Forest had a higher sensitivity (range: 67.8–82.9{\%} during wk 1 to 7) and negative predictive value (range: 89.5–93.8{\%}) but lower specificity (range: 76.7–88.5{\%}) and positive predictive value (range: 58.1–78.4{\%}) than SVM. In Random Forest, milk yield, fat yield, protein percentage, and lactose yield had important roles in prediction, but their rank of importance differed across lactation weeks. In conclusion, dairy cows could be clustered for metabolic status based on plasma metabolites and metabolic hormones. Moreover, on-farm cow data can predict cows in good or average metabolic status, with Random Forest and SVM performing best of all algorithms.",
keywords = "cattle, cluster analysis, energy metabolism, Random Forest",
author = "Wei Xu and {van Knegsel}, {Ariette T.M.} and Vervoort, {Jacques J.M.} and Bruckmaier, {Rupert M.} and {van Hoeij}, {Renny J.} and Bas Kemp and Edoardo Saccenti",
year = "2019",
month = "11",
doi = "10.3168/jds.2018-15791",
language = "English",
volume = "102",
pages = "10186--10201",
journal = "Journal of Dairy Science",
issn = "0022-0302",
publisher = "American Dairy Science Association",
number = "11",

}

Prediction of metabolic status of dairy cows in early lactation with on-farm cow data and machine learning algorithms. / Xu, Wei; van Knegsel, Ariette T.M.; Vervoort, Jacques J.M.; Bruckmaier, Rupert M.; van Hoeij, Renny J.; Kemp, Bas; Saccenti, Edoardo.

In: Journal of Dairy Science, Vol. 102, No. 11, 11.2019, p. 10186-10201.

