How to handle high subgenome sequence similarity in allopolyploid Fragaria x ananassa: linkage disequilibrium based variant filtering

Tim Koorevaar*, Johan H. Willemsen, Dominic Hildebrand, Richard G.F. Visser, Paul Arens, Chris Maliepaard

*Corresponding author for this work

Research output: Contribution to journal › Article › Academic › peer-review

Abstract

Background: The allo-octoploid Fragaria x ananassa follows disomic inheritance, yet the high sequence similarity among its subgenomes can lead to misalignment of short sequencing reads (150 bp). This misalignment increases the number of erroneous variants produced during variant calling. To associate traits with the correct subgenome, these erroneous variants must be filtered out. Classifying variants as correct (type 1) or erroneous (type 2, homoeologous variants; type 3, multi-locus variants) improves the reliability of downstream analyses. Results: Our analysis reveals that while erroneous variant types often display skewed average allele balances (AAB) for heterozygous calls, this measure alone is insufficient. To further reduce the number of erroneous variants, we employed a Linkage Disequilibrium (LD) based filtering method that correlates highly (99%) with an approach that uses a genetic map from a biparental population. Combining the LD-based and average allele balance filters resulted in the lowest switch error rate (0.037). Notably, our best filtering approach decreased phasing switch error rates by 44% and preserved 72% of the original dataset. Conclusions: The results indicate that erroneous variants caused by subgenome similarity can be identified effectively without extensive genotyping of mapping populations. The LD-based filtering method improved phasing accuracy, which in turn improves the traceability of important alleles in the germplasm and paves the way for a better understanding of trait associations in F. x ananassa.
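The abstract names the average allele balance (AAB) of heterozygous calls without defining it. As an illustration only, the following minimal sketch assumes that the allele balance of a call is the fraction of reads supporting the alternative allele, and that AAB is the mean of that fraction over all heterozygous calls at a variant. The function names, the input layout and the 0.3 to 0.7 acceptance window are hypothetical and are not taken from the article.

```python
from statistics import mean


def average_allele_balance(het_calls):
    """Mean alt-read fraction over heterozygous calls at one variant.

    het_calls: list of (ref_depth, alt_depth) tuples, one per heterozygous
    sample (hypothetical input layout). Returns None if no call has reads.
    """
    balances = [alt / (ref + alt) for ref, alt in het_calls if ref + alt > 0]
    return mean(balances) if balances else None


def passes_aab_filter(het_calls, low=0.3, high=0.7):
    """Keep a variant only if its AAB lies inside [low, high].

    The window is an assumed, illustrative threshold: a true heterozygous
    call under disomic inheritance is expected near 0.5, while reads
    misaligned from another subgenome tend to skew the balance.
    """
    aab = average_allele_balance(het_calls)
    return aab is not None and low <= aab <= high


if __name__ == "__main__":
    balanced = [(10, 10), (5, 15)]   # AAB = 0.625
    skewed = [(30, 10), (27, 9)]     # AAB = 0.25
    print(passes_aab_filter(balanced), passes_aab_filter(skewed))
```

As the abstract notes, a filter of this kind is insufficient on its own, which is why the authors combine it with LD-based filtering.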

Original language: English
Article number: 1150
Number of pages: 15
Journal: BMC Genomics
Volume: 25
Issue number: 1
Publication status: Published - 28 Nov 2024

Keywords

  • Allopolyploid
  • Average allele balance
  • Fragaria x ananassa
  • Linkage disequilibrium
  • Sequence similarity
  • Strawberry
  • Switch error rate
  • WGS
  • Whole genome sequencing
