The identification of allelic variation in potato

Research output: Thesis › internal PhD, WU

Abstract

Identifying haplotypes in tetraploid potato can improve genetic studies and facilitate marker-assisted selection. For many years, only bi-allelic molecular markers were used in genetic studies, and they undoubtedly improved our understanding of the inheritance of important agronomic traits. However, these studies are complicated by the lack of knowledge about linkage between these markers (SNPs) and thus about their underlying haplotype structure. Geneticists were unable to reconstruct haplotypes mainly because of the higher ploidy level of cultivated potato (2n = 4x = 48): a single potato variety contains four copies of each chromosome (tetraploid). This thesis describes methods that allow haplotype reconstruction in tetraploid potato, either from sequencing data of a single variety or from SNP information across multiple varieties. We applied these methods to genotypic data of potato varieties and used the reconstructed haplotypes to detect which alleles influence traits such as plant maturity, tuber shape and flesh color.

The starting point of this thesis was a genetic study of the inheritance of potato tuber shape and eye depth. In Chapter 2 we identified a strong marker-trait association for tuber shape on potato chromosome 10 (Ro locus), which co-localises with a major-effect QTL for eye depth. Subsequent fine mapping in a diploid full-sib potato population (C × E) narrowed the associated region from 3.1 Mb to 280 kb. This region contains a repeat cluster of peroxidase genes.

In Chapter 3 we began developing methods for haplotype reconstruction. We introduced a novel method that uses short-read DNA sequencing data to reconstruct haplotypes. A previous study genotyped ~800 potato genes in 83 tetraploid varieties using Illumina short reads. This information was used as input for our haplotype reconstruction pipeline and allowed us to generate haplotype blocks averaging 413 bp in tetraploid potato and to estimate the haplotype diversity in potato. In addition, we performed a simulation study, which showed that our approach was more accurate than competing approaches.

A disadvantage of haplotype reconstruction with sequencing data is that only short-range haplotypes can be reconstructed. To facilitate the construction of long-range haplotypes, we developed in Chapter 4 a method that estimates haplotypes on the basis of genetic information across multiple samples. This was achieved by first reconstructing the linkage phase between SNP pairs and then joining these linkage phases into full-length haplotypes. We validated this method using pre-existing haplotypes of the StGWD1 gene. This validation study indicated that haplotype reconstruction is highly accurate. In addition, we applied our method to genotypic data of potato. The results show that haplotype diversity in potato is extensive, but that a few common haplotypes are responsible for the majority of allelic variation.

In Chapter 5 we used these haplotypes to explore their application in a haplotype-based genome-wide association study (GWAS). Conventionally, GWAS is performed only with bi-allelic SNP markers, but knowledge of haplotype specificity is required to interpret the resulting marker-trait associations. Here we performed haplotype-based GWAS and compared it with the results of single-marker GWAS. We linked specific alleles to potato traits such as plant maturity, tuber shape, flesh color and tuber uniformity.

In Chapter 6 we report the development of Poly-Imputer. This tool performs haplotype imputation and is based on the intuition that, if most or all segregating alleles are known, it becomes trivial to assign four of these haplotypes to any individual. As input, we used a library of reference haplotypes and the dosage calls of each variety. Applying this tool allowed us to phase SNPs in the progeny of a full-sib population and, more importantly, to refine and improve haplotype solutions reconstructed from sequencing data and haplotypes based on dosage data.
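The intuition behind haplotype imputation from a reference library and dosage calls can be illustrated with a small sketch. This is not the Poly-Imputer implementation, which the abstract does not describe; it is a minimal brute-force illustration under stated assumptions: alleles are coded 0/1, a dosage call is the number of alternative alleles (0 to 4) at each SNP, and the function name `impute_haplotypes` and the toy library `H1` to `H4` are hypothetical.

```python
from itertools import combinations_with_replacement

PLOIDY = 4  # tetraploid potato: four haplotypes per individual


def impute_haplotypes(library, dosages, ploidy=PLOIDY):
    """Return every multiset of `ploidy` library haplotypes whose summed
    alternative-allele counts equal the observed dosage at every SNP.

    library: dict mapping haplotype name -> tuple of 0/1 alleles (assumed coding)
    dosages: tuple of alternative-allele counts per SNP for one individual
    """
    names = sorted(library)
    target = list(dosages)
    solutions = []
    # Exhaustive enumeration is an illustrative choice, not the thesis method.
    for combo in combinations_with_replacement(names, ploidy):
        summed = [sum(library[h][i] for h in combo) for i in range(len(target))]
        if summed == target:
            solutions.append(combo)
    return solutions


# Hypothetical toy library of four reference haplotypes over three SNPs.
library = {
    "H1": (0, 0, 0),
    "H2": (1, 0, 1),
    "H3": (0, 1, 1),
    "H4": (1, 1, 0),
}

# Hypothetical dosage calls for one variety at the same three SNPs.
dosages = (2, 1, 3)

for solution in impute_haplotypes(library, dosages):
    print(solution)
```

In this toy case only one combination of four library haplotypes reproduces the dosage calls, so the assignment is unambiguous. With an incomplete library or few SNPs, several combinations or none may match, which is why the intuition depends on most or all segregating alleles being known.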

Chapter 7 determines the haplotype diversity at the StCDF1 gene, a key regulator of the tuberization response in potato. We performed haplotype assembly for the second exon of this gene, followed by manual assignment of haplotypes using sequencing reads and genetic relations. We could demonstrate a significant phenotypic effect for only one StCDF1 allele.

In the final chapter, we discuss the findings of the previous six chapters. In conclusion, this thesis is a significant step towards routine investigation of haplotype diversity in tetraploid potato. We hope that the methods and tools provided in this thesis will facilitate the use of haplotypes in marker-assisted selection and increase our understanding of allele-phenotype interactions in potato.

Original language: English
Qualification: Doctor of Philosophy
Awarding Institution
  • Wageningen University
Supervisors/Advisors
  • Visser, Richard, Promotor
  • van Eck, Herman, Co-promotor
Award date: 21 Nov 2018
Place of Publication: Wageningen
Publisher: Wageningen University
Print ISBNs: 9789463435130
DOIs: 10.18174/459655
Publication status: Published - 2018

Fingerprint

haplotypes
potatoes
tetraploidy
tubers
alleles
linkage (genetics)
methodology
genes
marker-assisted selection
inheritance (genetics)
eyes

Cite this

Willemsen, Johan. / The identification of allelic variation in potato. Wageningen : Wageningen University, 2018. 206 p.
@phdthesis{776fad82389d467291cddcd743ad31ed,
title = "The identification of allelic variation in potato",
author = "Johan Willemsen",
note = "WU thesis 7095 Includes bibliographical references. - With summary in English",
year = "2018",
doi = "10.18174/459655",
language = "English",
isbn = "9789463435130",
publisher = "Wageningen University",
school = "Wageningen University",

}

Willemsen, J 2018, 'The identification of allelic variation in potato', Doctor of Philosophy, Wageningen University, Wageningen. https://doi.org/10.18174/459655

TY - THES

T1 - The identification of allelic variation in potato

AU - Willemsen, Johan

N1 - WU thesis 7095 Includes bibliographical references. - With summary in English

PY - 2018

Y1 - 2018

U2 - 10.18174/459655

DO - 10.18174/459655

M3 - internal PhD, WU

SN - 9789463435130

PB - Wageningen University

CY - Wageningen

ER -