Evaluation of LD decay and various LD-decay estimators in simulated and SNP-array data of tetraploid potato

Peter G. Vos, M.J. Paulo, Roeland E. Voorrips, Richard G.F. Visser, Herman J. van Eck*, Fred A. van Eeuwijk

*Corresponding author for this work

Research output: Contribution to journal › Article › Academic › peer-review

31 Citations (Scopus)

Abstract

Key message: The number of SNPs required for QTL discovery is justified by the distance at which linkage disequilibrium has decayed. Simulations and real potato SNP data showed how to estimate and interpret LD decay.

Abstract: The magnitude of linkage disequilibrium (LD) and its decay with genetic distance determine the resolution of association mapping and are useful for assessing the desired number of SNPs on arrays. To study LD and LD decay in tetraploid potato, we simulated autotetraploid genotypes and used them to explore the dependence on (1) the number of haplotypes in the population (the amount of genetic variation) and (2) the percentage of haplotype-specific SNPs (hs-SNPs). Several estimators for short-range LD were explored, such as the average r², the median r², and other percentiles of r² (80th, 90th, and 95th). For LD decay, we looked at LD½,90, the distance at which the short-range LD is halved when the 90th percentile of r² at short range is used as the estimator of LD. Simulations showed that the performance of the various estimators of LD decay depended strongly on the number of haplotypes, although the real value of LD decay was not much influenced by this number. The estimator LD½,90 was chosen to evaluate LD decay in 537 tetraploid varieties. LD½,90 values were 1.5 Mb for varieties released before 1945 and 0.6 Mb for varieties released after 2005. LD½,90 values within three different subpopulations ranged from 0.7 to 0.9 Mb. LD½,90 was 2.5 Mb for introgressed regions, indicating large haplotype blocks. In pericentromeric heterochromatin, LD decay was negligible. This study demonstrates that several related factors influencing LD decay could be disentangled, that no universal approach can be suggested, and that the estimation of LD decay has to be performed with great care and knowledge of the sampled material.
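The definition of LD½,90 given in the abstract (the distance at which the 90th percentile of r² falls to half its short-range value) can be illustrated with a minimal Python sketch. This is not the authors' procedure: the function name `ld_half_decay_distance`, the short-range cutoff `short_range_mb`, the fixed-width distance bins, and the synthetic data are all assumptions made for illustration, since the abstract does not specify how distances were binned or how the decay curve was fitted.

```python
import numpy as np


def ld_half_decay_distance(dist_mb, r2, short_range_mb=0.01,
                           bin_width_mb=0.1, pct=90):
    """Rough LD-half-decay estimate from pairwise SNP data.

    dist_mb : physical distance between each SNP pair (Mb)
    r2      : squared correlation between the same SNP pairs
    All default parameter values are illustrative assumptions.
    """
    dist = np.asarray(dist_mb, dtype=float)
    r2 = np.asarray(r2, dtype=float)

    # Short-range LD: chosen percentile of r2 among very close pairs.
    short = r2[dist <= short_range_mb]
    if short.size == 0:
        raise ValueError("no SNP pairs within the short-range cutoff")
    target = np.percentile(short, pct) / 2.0

    # Walk outward through distance bins and report the midpoint of the
    # first bin whose percentile of r2 has dropped to the target or below.
    edges = np.arange(0.0, dist.max() + bin_width_mb, bin_width_mb)
    for lo, hi in zip(edges[:-1], edges[1:]):
        in_bin = r2[(dist >= lo) & (dist < hi)]
        if in_bin.size and np.percentile(in_bin, pct) <= target:
            return (lo + hi) / 2.0
    return None  # LD never halves within the observed distances


# Synthetic, noise-free example (not real potato data): r2 decays
# exponentially with distance.
demo_dist = np.linspace(0.0, 5.0, 5001)
demo_r2 = np.exp(-demo_dist)
print(ld_half_decay_distance(demo_dist, demo_r2))
```

The result is only resolved to the bin width, so a narrower `bin_width_mb` gives a finer estimate at the cost of fewer SNP pairs per bin. A return value of `None` corresponds to cases such as the pericentromeric heterochromatin mentioned in the abstract, where LD does not halve over the distances observed.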

Original language: English
Pages (from-to): 123-135
Journal: Theoretical and Applied Genetics
Volume: 130
Issue number: 1
DOI: 10.1007/s00122-016-2798-8
Publication status: Published - 2017

Fingerprint

Tetraploidy
Linkage Disequilibrium
Solanum tuberosum
Single Nucleotide Polymorphism
deterioration
potatoes
Haplotypes

Cite this

@article{a7a453d012984b3db95c5e165af09e7c,
title = "Evaluation of LD decay and various LD-decay estimators in simulated and SNP-array data of tetraploid potato",
abstract = "Key message: The number of SNPs required for QTL discovery is justified by the distance at which linkage disequilibrium has decayed. Simulations and real potato SNP data showed how to estimate and interpret LD decay. Abstract: The magnitude of linkage disequilibrium (LD) and its decay with genetic distance determine the resolution of association mapping and are useful for assessing the desired number of SNPs on arrays. To study LD and LD decay in tetraploid potato, we simulated autotetraploid genotypes and used them to explore the dependence on (1) the number of haplotypes in the population (the amount of genetic variation) and (2) the percentage of haplotype-specific SNPs (hs-SNPs). Several estimators for short-range LD were explored, such as the average r², the median r², and other percentiles of r² (80th, 90th, and 95th). For LD decay, we looked at LD½,90, the distance at which the short-range LD is halved when the 90th percentile of r² at short range is used as the estimator of LD. Simulations showed that the performance of the various estimators of LD decay depended strongly on the number of haplotypes, although the real value of LD decay was not much influenced by this number. The estimator LD½,90 was chosen to evaluate LD decay in 537 tetraploid varieties. LD½,90 values were 1.5 Mb for varieties released before 1945 and 0.6 Mb for varieties released after 2005. LD½,90 values within three different subpopulations ranged from 0.7 to 0.9 Mb. LD½,90 was 2.5 Mb for introgressed regions, indicating large haplotype blocks. In pericentromeric heterochromatin, LD decay was negligible. This study demonstrates that several related factors influencing LD decay could be disentangled, that no universal approach can be suggested, and that the estimation of LD decay has to be performed with great care and knowledge of the sampled material.",
author = "Vos, {Peter G.} and Paulo, {M.J.} and Voorrips, {Roeland E.} and Visser, {Richard G.F.} and {van Eck}, {Herman J.} and {van Eeuwijk}, {Fred A.}",
year = "2017",
doi = "10.1007/s00122-016-2798-8",
language = "English",
volume = "130",
pages = "123--135",
journal = "Theoretical and Applied Genetics",
issn = "0040-5752",
publisher = "Springer Verlag",
number = "1",

}

Evaluation of LD decay and various LD-decay estimators in simulated and SNP-array data of tetraploid potato. / Vos, Peter G.; Paulo, M.J.; Voorrips, Roeland E.; Visser, Richard G.F.; van Eck, Herman J.; van Eeuwijk, Fred A.

In: Theoretical and Applied Genetics, Vol. 130, No. 1, 2017, p. 123-135.

TY - JOUR

T1 - Evaluation of LD decay and various LD-decay estimators in simulated and SNP-array data of tetraploid potato

AU - Vos, Peter G.

AU - Paulo, M.J.

AU - Voorrips, Roeland E.

AU - Visser, Richard G.F.

AU - van Eck, Herman J.

AU - van Eeuwijk, Fred A.

PY - 2017

Y1 - 2017

AB - Key message: The number of SNPs required for QTL discovery is justified by the distance at which linkage disequilibrium has decayed. Simulations and real potato SNP data showed how to estimate and interpret LD decay. Abstract: The magnitude of linkage disequilibrium (LD) and its decay with genetic distance determine the resolution of association mapping and are useful for assessing the desired number of SNPs on arrays. To study LD and LD decay in tetraploid potato, we simulated autotetraploid genotypes and used them to explore the dependence on (1) the number of haplotypes in the population (the amount of genetic variation) and (2) the percentage of haplotype-specific SNPs (hs-SNPs). Several estimators for short-range LD were explored, such as the average r², the median r², and other percentiles of r² (80th, 90th, and 95th). For LD decay, we looked at LD½,90, the distance at which the short-range LD is halved when the 90th percentile of r² at short range is used as the estimator of LD. Simulations showed that the performance of the various estimators of LD decay depended strongly on the number of haplotypes, although the real value of LD decay was not much influenced by this number. The estimator LD½,90 was chosen to evaluate LD decay in 537 tetraploid varieties. LD½,90 values were 1.5 Mb for varieties released before 1945 and 0.6 Mb for varieties released after 2005. LD½,90 values within three different subpopulations ranged from 0.7 to 0.9 Mb. LD½,90 was 2.5 Mb for introgressed regions, indicating large haplotype blocks. In pericentromeric heterochromatin, LD decay was negligible. This study demonstrates that several related factors influencing LD decay could be disentangled, that no universal approach can be suggested, and that the estimation of LD decay has to be performed with great care and knowledge of the sampled material.

U2 - 10.1007/s00122-016-2798-8

DO - 10.1007/s00122-016-2798-8

M3 - Article

VL - 130

SP - 123

EP - 135

JO - Theoretical and Applied Genetics

JF - Theoretical and Applied Genetics

SN - 0040-5752

IS - 1

ER -