Abstract
Background  Most studies on genomic prediction with reference populations that include multiple lines or breeds have used linear models. Data heterogeneity due to using multiple populations may conflict with model assumptions used in linear regression methods.
Methods  In an attempt to alleviate potential discrepancies between assumptions of linear models and multipopulation data, two types of alternative models were used: (1) a multitrait genomic best linear unbiased prediction (GBLUP) model that modelled trait by line combinations as separate but correlated traits and (2) nonlinear models based on kernel learning. These models were compared to conventional linear models for genomic prediction for two lines of brown layer hens (B1 and B2) and one line of white hens (W1). The three lines each had 1004 to 1023 training and 238 to 240 validation animals. Prediction accuracy was evaluated by estimating the correlation between observed phenotypes and predicted breeding values.
Results  When the training dataset included only data from the evaluated line, nonlinear models yielded at best an accuracy similar to that of linear models. In some cases, adding a distantly related line to the training dataset slightly decreased the performance of the linear models, while the accuracy of nonlinear models generally did not change. When only information from a closely related line was used for training, linear models and nonlinear radial basis function (RBF) kernel models performed similarly. The multitrait GBLUP model took advantage of the estimated genetic correlations between the lines. Combining linear and nonlinear models improved the accuracy of multiline genomic prediction.
Conclusions  Linear models and nonlinear RBF models performed very similarly for genomic prediction, despite the expectation that nonlinear models could deal better with the heterogeneous multipopulation data. This heterogeneity of the data can be overcome by modelling trait by line combinations as separate but correlated traits, which avoids the occasional occurrence of large negative accuracies when the evaluated line was not included in the training dataset. Furthermore, when using a multiline training dataset, nonlinear models provided information on the genotype data that was complementary to the linear models, which indicates that the underlying data distributions of the three studied lines were indeed heterogeneous.
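The comparison described under Methods can be illustrated with a minimal sketch, which is not the authors' implementation. It assumes simulated genotypes and phenotypes, uses kernel ridge regression as a stand-in for both the linear (GBLUP-like) and the RBF kernel models, and measures accuracy as the correlation between observed phenotypes and predictions in a validation set. All function names, the sample sizes, the bandwidth and the regularisation value `lam` are hypothetical choices made for illustration.

```python
import numpy as np

def linear_kernel(X_a, X_b):
    # Relationship-style linear kernel from centred genotypes: K = X_a X_b' / p
    return X_a @ X_b.T / X_a.shape[1]

def rbf_kernel(X_a, X_b, bandwidth):
    # Gaussian (RBF) kernel: K_ij = exp(-||x_i - x_j||^2 / (p * bandwidth))
    sq_dist = (
        (X_a ** 2).sum(axis=1)[:, None]
        + (X_b ** 2).sum(axis=1)[None, :]
        - 2.0 * X_a @ X_b.T
    )
    sq_dist = np.maximum(sq_dist, 0.0)  # guard against tiny negative round-off
    return np.exp(-sq_dist / (X_a.shape[1] * bandwidth))

def kernel_ridge_predict(K_train, K_val_train, y_train, lam):
    # Solve (K + lam * I) alpha = y - mean(y), then predict validation animals
    mu = y_train.mean()
    alpha = np.linalg.solve(K_train + lam * np.eye(len(y_train)), y_train - mu)
    return mu + K_val_train @ alpha

def accuracy(y_observed, y_predicted):
    # Prediction accuracy as the Pearson correlation of observed and predicted values
    return np.corrcoef(y_observed, y_predicted)[0, 1]

# Simulated data (assumption): 200 training and 50 validation animals, 500 markers
rng = np.random.default_rng(0)
n_train, n_val, n_markers = 200, 50, 500
genotypes = rng.binomial(2, 0.3, size=(n_train + n_val, n_markers)).astype(float)
marker_effects = rng.normal(0.0, 0.1, size=n_markers)

X_train, X_val = genotypes[:n_train], genotypes[n_train:]
column_means = X_train.mean(axis=0)  # centre with training-set means only
X_train, X_val = X_train - column_means, X_val - column_means

signal = np.vstack([X_train, X_val]) @ marker_effects
phenotypes = signal + rng.normal(0.0, 1.0, size=n_train + n_val)
y_train, y_val = phenotypes[:n_train], phenotypes[n_train:]

lam = 1.0  # hypothetical regularisation; in practice tuned or derived from variance components
pred_linear = kernel_ridge_predict(
    linear_kernel(X_train, X_train), linear_kernel(X_val, X_train), y_train, lam
)
pred_rbf = kernel_ridge_predict(
    rbf_kernel(X_train, X_train, 1.0), rbf_kernel(X_val, X_train, 1.0), y_train, lam
)

acc_linear = accuracy(y_val, pred_linear)
acc_rbf = accuracy(y_val, pred_rbf)
print(f"linear kernel accuracy: {acc_linear:.3f}")
print(f"RBF kernel accuracy:    {acc_rbf:.3f}")
```

In a multiline setting, `X_train` would stack animals from one or more lines while `X_val` holds animals from the evaluated line only; the sketch does not model line-specific effects or the multitrait GBLUP variant.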
Original language  English 

Article number  75 
Number of pages  11 
Journal  Genetics, Selection, Evolution 
Volume  46 
Publication status  Published  2014 
Keywords
 dairy cattle breeds
 dimensionality reduction
 gaussian kernel
 accuracy
 traits
 values
 validation
 selection
 pedigree
 plant
Fingerprint Dive into the research topics of 'Genomic prediction based on data from three layer lines using nonlinear regression models'. Together they form a unique fingerprint.
Projects

AF16022 Breed4Food II (BO63001009, BO47001021, BO22.04025001, BO22.04011001, BO22.02011001)
1/01/14 → 31/12/21
Project: EZproject

