Machine learning to further improve the decision which boar ejaculates to process into artificial insemination doses

Claudia Kamphuis*, Pascal Duenk, Roel Franciscus Veerkamp, Bram Visser, Gurnoor Singh, Annette Nigsch, Rudi Maria De Mol, Marleen Leonarda Wilhelmina Johanna Broekhuijse

*Corresponding author for this work

Research output: Contribution to journal › Article › Academic › peer-review


Current artificial insemination (AI) laboratory practices assess the semen quality of each boar ejaculate to decide which ones to process into AI doses. This decision is aided by two motility parameters, used worldwide, that become available through computer-assisted semen analysis (CASA). This decision process, however, still results in AI doses with variable and sometimes suboptimal fertility outcomes (e.g., small litter size). The hypothesis was that the decision on which ejaculates to process into AI doses can be improved by adding more data from CASA systems and data from other sources, in combination with a data-driven model. Available data consisted of ejaculates that passed the initial decision and were thus processed into AI doses and used to inseminate sows. Data were divided into a training set (6793 records) and a validation set (1191 records) for model development, and an independent test set (1434 records) for performance assessment. Gradient Boosting Machine (GBM) models were developed to predict four fertility phenotypes of interest (gestation length, total number born, number born alive, and number of stillborn piglets). Each fertility phenotype was considered both as a numeric and as a binary outcome parameter, totaling eight different fertility phenotypes. Data used to further improve the decision process originated from four sources: 1) CASA information, 2) boar ejaculate information, 3) breeding value estimations, and 4) weather information. These data were used to create seven prediction sets, where each new set added parameters to the ones included in the previous set. The GBM models predicted fertility phenotypes with low correlations (for numeric phenotypes) and low area under the curve values (for binary phenotypes) on the test data. Hence, the results demonstrated that combining more data with GBM did not enable further improvement of the AI dose quality checks, leading to the rejection of our hypothesis.
However, our study revealed parameters affecting boar ejaculate fertility that are not used in today's decision process. These parameters (listed in the top 10 in at least four GBM models) included one parameter associated with boar ejaculate information, two with breeding value estimations, five with CASA information, and one with weather information. These parameters should therefore be investigated further for their potential value when assessing the quality of boar ejaculates in the routine daily processing of AI doses.
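The bookkeeping described in the abstract can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the authors' implementation: it does not fit a GBM, all parameter names are hypothetical, and only four increments are shown because the abstract does not say how the seven prediction sets were composed. It shows how cumulative prediction sets can be built, how the two evaluation metrics (correlation for numeric phenotypes, area under the curve for binary phenotypes) can be computed on test-set predictions, and how parameters appearing in the top 10 of at least four models can be counted.

```python
from collections import Counter
from itertools import accumulate

# Hypothetical parameter names, grouped by the four data sources named in
# the abstract. The real parameters and the composition of the seven
# prediction sets are not given there; four increments are used here.
INCREMENTS = [
    ["casa_motility", "casa_progressive_motility"],  # CASA information
    ["ejaculate_volume", "boar_age"],                # boar ejaculate information
    ["ebv_total_born"],                              # breeding value estimations
    ["temperature", "humidity"],                     # weather information
]


def cumulative_sets(increments):
    """Build prediction sets where each set extends the previous one."""
    return list(accumulate(increments, lambda acc, new: acc + new))


def pearson(x, y):
    """Pearson correlation between observed and predicted numeric values."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / (var_x * var_y) ** 0.5


def auc(labels, scores):
    """Area under the ROC curve via pairwise comparison.

    Equals the probability that a randomly chosen positive record receives
    a higher score than a randomly chosen negative one; ties count half.
    """
    pos = [s for l, s in zip(labels, scores) if l == 1]
    neg = [s for l, s in zip(labels, scores) if l == 0]
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


def recurrent_parameters(top10_lists, min_models=4):
    """Return parameters that appear in the top 10 of at least min_models models."""
    counts = Counter(p for top in top10_lists for p in set(top))
    return sorted(p for p, c in counts.items() if c >= min_models)


if __name__ == "__main__":
    for i, params in enumerate(cumulative_sets(INCREMENTS), start=1):
        print(f"set {i}: {len(params)} parameters")
```

The pairwise AUC is quadratic in the number of records, which is acceptable for a test set of this size (1434 records); a rank-based formulation would be the usual choice for larger data.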

Original language: English
Pages (from-to): 112-121
Number of pages: 10
Publication status: Published - Mar 2020


  • Boar semen
  • Fertility phenotypes
  • Machine learning
  • Prediction model
