Modelling of genotype by environment interaction and prediction of complex traits across multiple environments as a synthesis of crop growth modelling, genetics and statistics

Research output: Thesis; internal PhD, WU; Academic

Abstract

The main objective of plant breeders is to create and identify genotypes that are well adapted to the target population of environments (TPE). The TPE corresponds to the future growing conditions in which the varieties produced by a breeding program will be grown. All possible genotypes that could be considered as selection candidates for a specific TPE are said to belong to the target population of genotypes (TPG). Genotypes commonly differ in their sensitivity to environmental gradients, in which case genotype by environment interaction (GxE) is observed. GxE can lead to changes in genotypic ranking, complicating the breeding process. The main aim of this thesis was to investigate statistical models, and the combination of statistical and crop growth models, to improve phenotype prediction across multiple environments. One aspect that determines the quality of phenotype prediction is the set of genotypes used to train the prediction model, especially when the TPG is structured. We proposed a method that uniformly covers the genetic space of the TPG, leading to higher prediction accuracy than random sampling. We obtained positive results for wheat, maize and rice. A second aspect that influences the accuracy of phenotype predictions is the choice of environments used to train the prediction model, which should capture the heterogeneity in the TPE. When accounting for heterogeneity in environmental quality, it is important to distinguish repeatable, well-predictable elements of the environmental conditions from those that are poorly predictable. We proposed statistical methods based on the AMMI model and on mixed models to identify groups of environments that show repeatable GxE, illustrating our ideas with multi-environment wheat data from North-Western Europe. The importance of training set construction strategies and multi-environment genomic prediction models was also demonstrated for barley data.
If breeders are interested in identifying the genetic basis of the target traits, a higher SNP density is advantageous. In this thesis, we used exome sequence data of the EU-Whealbi-barley germplasm, a unique set of genotypes with diverse origins, growth habits and breeding histories. For these diverse data, we assessed the effects of QTLs and haplotypes across multiple environments for awn length, grain weight, heading date and plant height. Our results show that the EU-Whealbi-barley collection possesses a large diversity of promising alleles regulating the four traits we analysed. The last major topic addressed in this thesis is the combination of statistical-genetic models and crop growth models (APSIM) as a strategy to assess which traits and phenotyping schemes can improve the prediction accuracy of a target trait like yield. We assess the potential of the combined modelling approach to characterize a sample of the TPG and TPE, and illustrate how trait correlations are modified by environmental conditions and by the genetic architecture of the sample of the TPE. We discuss the topics mentioned above from a didactic perspective, proposing a list of subjects that should be covered in a GxE course for plant breeders. Finally, we discuss challenges and opportunities presented by the characterization of the TPE and TPG when using simulations based on statistical and crop growth models.
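
For readers unfamiliar with the AMMI model mentioned in the abstract, the following is a minimal sketch of the textbook decomposition, not the thesis's own implementation. It assumes a complete genotype-by-environment table of means, fits additive genotype and environment main effects, and then takes a singular value decomposition of the interaction residuals. The function names, the number of retained terms and the toy yield values are all hypothetical and chosen only for illustration.

```python
import numpy as np


def ammi_decompose(y, n_terms=2):
    """Textbook AMMI fit on a complete genotypes x environments table of means.

    Returns the grand mean, genotype main effects, environment main effects,
    and the first n_terms multiplicative interaction terms (genotype scores,
    singular values, environment scores).
    """
    y = np.asarray(y, dtype=float)
    mu = y.mean()                      # grand mean
    g = y.mean(axis=1) - mu            # genotype main effects
    e = y.mean(axis=0) - mu            # environment main effects
    # Double-centred residuals hold the GxE interaction (plus error).
    resid = y - mu - g[:, None] - e[None, :]
    u, s, vt = np.linalg.svd(resid, full_matrices=False)
    k = min(n_terms, len(s))
    return mu, g, e, u[:, :k], s[:k], vt[:k].T


def ammi_fitted(mu, g, e, u, s, v):
    """Rebuild fitted values from the additive and multiplicative parts."""
    return mu + g[:, None] + e[None, :] + (u * s) @ v.T


# Made-up yields for 4 genotypes (rows) in 3 environments (columns).
toy_yield = np.array([
    [5.1, 6.0, 4.2],
    [4.8, 6.5, 3.9],
    [5.5, 5.2, 4.9],
    [4.9, 6.1, 4.0],
])
mu, g, e, u, s, v = ammi_decompose(toy_yield, n_terms=2)
print("environment scores on the first interaction axis:", v[:, 0])
```

In this kind of analysis, environments with similar scores on the leading interaction axes tend to rank genotypes similarly, which is one informal way to think about groups of environments with comparable GxE. Whether such groups are repeatable across years is a separate question that this sketch does not address.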

Original language: English
Qualification: Doctor of Philosophy
Awarding Institution
  • Wageningen University
Supervisors/Advisors
  • van Eeuwijk, Fred, Promotor
  • Malosetti, M., Co-promotor
Award date: 15 Nov 2017
Place of Publication: Wageningen
Publisher: Wageningen University
Print ISBNs: 9789463436694
DOIs: 10.18174/421321
Publication status: Published - 2017

Fingerprint

statistics
synthesis
prediction
genotype
crops
crop models
growth models
phenotype
plant breeders
barley
breeding
wheat
environmental factors
environmental quality
growth habit
Western European region
heading
statistical models
sampling
quantitative trait loci

Keywords

  • crops
  • applied statistics
  • genotype environment interaction
  • complex loci
  • quantitative genetics

Cite this

@phdthesis{3d55965c795347a2b5f200c270418470,
title = "Modelling of genotype by environment interaction and prediction of complex traits across multiple environments as a synthesis of crop growth modelling, genetics and statistics",
keywords = "crops, applied statistics, genotype environment interaction, complex loci, quantitative genetics, gewassen, toegepaste statistiek, genotype-milieu interactie, complexe loci, kwantitatieve genetica",
author = "Daniela Bustos-Korts",
note = "WU thesis 6807 Includes bibliographical references. - With summary in English",
year = "2017",
doi = "10.18174/421321",
language = "English",
isbn = "9789463436694",
publisher = "Wageningen University",
school = "Wageningen University",

}

TY - THES

T1 - Modelling of genotype by environment interaction and prediction of complex traits across multiple environments as a synthesis of crop growth modelling, genetics and statistics

AU - Bustos-Korts, Daniela

N1 - WU thesis 6807 Includes bibliographical references. - With summary in English

PY - 2017

Y1 - 2017

KW - crops

KW - applied statistics

KW - genotype environment interaction

KW - complex loci

KW - quantitative genetics

KW - gewassen

KW - toegepaste statistiek

KW - genotype-milieu interactie

KW - complexe loci

KW - kwantitatieve genetica

U2 - 10.18174/421321

DO - 10.18174/421321

M3 - internal PhD, WU

SN - 9789463436694

PB - Wageningen University

CY - Wageningen

ER -