Genetic information from phased SNP array data can improve assemblies of whole genome sequences

S.C.L. Vanderzande, C.P. Peace

Research output: Chapter in Book/Report/Conference proceeding › Conference paper › Academic › peer-review

2 Citations (Scopus)

Abstract

Whole genome sequence (WGS) assemblies for horticultural crops are a valuable resource for improving our understanding of key horticultural traits. The last decade has seen a rise in the availability, quality, and use of WGSs, yet issues with aligning contigs and resolving haplotypes remain. SNP arrays have also become commonly available, and large data sets of high-quality, phased SNP genotypic data have been generated. These data sets contain information on linkages among SNPs, allele presence in germplasm individuals, and allele germplasm origins. They could therefore be a valuable additional resource for improving WGS assemblies, but they are not yet fully exploited. To evaluate the quality of the haplotype-resolved WGS of ‘Gala’ and to demonstrate how SNP array data can contribute to WGS assemblies, phased ‘Gala’ SNP array data were compared to the ‘Gala’ WGSs. Genomic positions for the 8K SNP array SNPs were determined for each of the reported haplomes of ‘Gala’. Then, SNP genotypes of the ‘Gala’ SNP array data were compared with those of the ‘Gala’ WGS, and parental origin was assigned to SNP alleles in each haplome. Each ‘Gala’ haplome was expected to contain exclusively maternal or exclusively paternal haplotypes, yet all haplome homologs of each chromosome were composed of both. Multiple SNP genotype differences were also observed, with either one of the expected parental alleles missing or an additional allele present in a haplome. These results indicate that some ‘Gala’ WGS contigs had been misassembled and that maternal and paternal haplotypes had not originally been resolved at the chromosome level. We propose that available high-quality phased SNP array data, pedigree records, and extended shared haplotypes among individuals, arising from application of inheritance-based genetics principles, be employed to improve WGS assemblies.
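The comparison described in the abstract can be illustrated with a minimal Python sketch. This is not the authors' pipeline: the function names, the dictionary layout, and the toy SNP identifiers and alleles are all hypothetical, and the sketch assumes that each SNP has already been placed on a haplome and that the array data are phased into a maternal and a paternal allele per SNP.

```python
from collections import Counter

def assign_origin(haplome_allele, maternal_allele, paternal_allele):
    """Label one haplome allele by the phased parental allele it matches."""
    if maternal_allele == paternal_allele:
        # Homozygous SNP: a match carries no information about parental origin.
        return "uninformative" if haplome_allele == maternal_allele else "unexpected"
    if haplome_allele == maternal_allele:
        return "maternal"
    if haplome_allele == paternal_allele:
        return "paternal"
    return "unexpected"  # allele seen in the assembly but in neither parental haplotype

def summarize_haplome(haplome_alleles, phased_array):
    """Count origin labels over all array SNPs for one haplome of one chromosome.

    haplome_alleles: dict of snp_id -> allele found in the assembly (hypothetical layout)
    phased_array:    dict of snp_id -> (maternal_allele, paternal_allele)
    """
    counts = Counter()
    for snp_id, (maternal, paternal) in phased_array.items():
        allele = haplome_alleles.get(snp_id)
        if allele is None:
            counts["missing"] += 1  # SNP could not be found on this haplome
        else:
            counts[assign_origin(allele, maternal, paternal)] += 1
    return counts

def is_mixed(counts):
    """True when a haplome carries both maternal and paternal alleles."""
    return counts["maternal"] > 0 and counts["paternal"] > 0

# Toy data, invented for illustration only (not taken from the paper).
phased_array = {
    "snp1": ("A", "G"),
    "snp2": ("C", "T"),
    "snp3": ("G", "G"),
    "snp4": ("T", "C"),
}
haplome_a = {"snp1": "A", "snp2": "T", "snp3": "G"}

counts = summarize_haplome(haplome_a, phased_array)
print(dict(counts), "mixed:", is_mixed(counts))
```

Under these assumptions, a correctly resolved haplome would yield only one of the two parental labels across a chromosome, so a haplome flagged by `is_mixed` is a candidate for the kind of phase switch or misassembly the abstract reports.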
Original language: English
Title of host publication: XXXI International Horticultural Congress (IHC2022)
Subtitle of host publication: International Symposium on Breeding and Effective Use of Biotechnology and Molecular Tools in Horticultural Crops
Editors: V. Bus, M. Causse
Publisher: ISHS
Pages: 81-88
Number of pages: 8
ISBN (Print): 9789462613614
Publication status: Published - 30 Mar 2023
Externally published: Yes

Publication series

Name: Acta Horticulturae
Number: 12
Volume: 1362
ISSN (Print): 0567-7572
ISSN (Electronic): 2406-6168
