Mining for Single Nucleotide Polymorphisms in Pig genome sequence data

H.H.D. Kerstens, S. Kollers, A. Kommandath, M. del Rosario, B.W. Dibbits, S.M. Kinders, R.P.M.A. Crooijmans, M.A.M. Groenen

Research output: Contribution to journal › Article › Academic › peer-review

16 Citations (Scopus)

Abstract

Background - Single nucleotide polymorphisms (SNPs) are ideal genetic markers due to their high abundance and the highly automated way in which SNPs are detected and SNP assays are performed. The number of SNPs identified in the pig thus far is still limited.

Results - A total of 4.8 million whole genome shotgun sequences, obtained from the NCBI trace repository with center name "SDJVP" and project name "Sino-Danish Pig Genome Project", were analysed for the presence of SNPs. Available BAC and BAC-end sequences and their naming and mapping information, all obtained from the Sanger Institute FTP site, served as a rough assembly of a reference genome. In 1.2 Gb of pig genome sequence, we identified 98,151 SNPs in which one of the sequences in the alignment represented the polymorphism and 6,374 SNPs in which two sequences represented an identical polymorphism. To benchmark the SNP identification method, 163 SNPs in which the polymorphism was represented twice in the sequence alignment were selected and tested on a panel of three purebred boar lines and wild boar. Of these 163 in silico identified SNPs, 134 were shown to be polymorphic in our animal panel.

Conclusion - This SNP identification method, which mines for SNPs in publicly available porcine shotgun sequence repositories, provides thousands of high-quality SNPs. Benchmarking in an animal panel showed that more than 80% of the predicted SNPs represented true genetic variation.
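The figures quoted in the abstract can be checked with a few lines of arithmetic. The sketch below is illustrative only and is not part of the authors' pipeline: the variable and function names are hypothetical, and adding the two SNP classes into one total assumes they are disjoint, which the abstract implies but does not state outright.

```python
# Figures quoted in the abstract.
SINGLE_SUPPORT_SNPS = 98_151   # polymorphism seen in one aligned sequence
DOUBLE_SUPPORT_SNPS = 6_374    # identical polymorphism seen in two sequences
SEQUENCE_SCANNED_BP = 1.2e9    # 1.2 Gb of pig genome sequence
TESTED_SNPS = 163              # double-support SNPs tested on the animal panel
CONFIRMED_SNPS = 134           # tested SNPs found to be polymorphic


def validation_rate(confirmed: int, tested: int) -> float:
    """Fraction of tested in silico SNPs confirmed in the animal panel."""
    return confirmed / tested


def mean_spacing_bp(total_bp: float, n_snps: int) -> float:
    """Average number of base pairs of scanned sequence per candidate SNP."""
    return total_bp / n_snps


# Assumption: the two classes do not overlap, so they can be summed.
total_candidates = SINGLE_SUPPORT_SNPS + DOUBLE_SUPPORT_SNPS
rate = validation_rate(CONFIRMED_SNPS, TESTED_SNPS)
spacing = mean_spacing_bp(SEQUENCE_SCANNED_BP, total_candidates)

print(f"Candidate SNPs: {total_candidates:,}")
print(f"Validation rate: {rate:.1%}")
print(f"Mean spacing: ~{spacing / 1000:.1f} kb per candidate SNP")
```

The validation rate of about 82% is consistent with the "more than 80%" stated in the conclusion. Note that it applies only to the 163 tested SNPs, all of which came from the class supported by two sequences.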
Original language: English
Article number: 4
Number of pages: 9
Journal: BMC Genomics
Volume: 10
DOI: https://doi.org/10.1186/1471-2164-10-4
Publication status: Published - 2009

Fingerprint

Single Nucleotide Polymorphism
Swine
Genome
Benchmarking
Sequence Alignment
Firearms
Names
Sus scrofa
Genetic Markers
Computer Simulation

Keywords

  • multiple-sclerosis
  • snps
  • identification
  • association
  • consortium
  • diversity
  • discovery
  • variants
  • linkage
  • region

Cite this

Kerstens, H.H.D. ; Kollers, S. ; Kommandath, A. ; del Rosario, M. ; Dibbits, B.W. ; Kinders, S.M. ; Crooijmans, R.P.M.A. ; Groenen, M.A.M. / Mining for Single Nucleotide Polymorphisms in Pig genome sequence data. In: BMC Genomics. 2009 ; Vol. 10.
@article{d73f663d06184970814d7e959be3d9be,
title = "Mining for Single Nucleotide Polymorphisms in Pig genome sequence data",
abstract = "Background - Single nucleotide polymorphisms (SNPs) are ideal genetic markers due to their high abundance and the highly automated way in which SNPs are detected and SNP assays are performed. The number of SNPs identified in the pig thus far is still limited. Results - A total of 4.8 million whole genome shotgun sequences obtained from the NCBI trace-repository with center name {"}SDJVP{"}, and project name {"}Sino-Danish Pig Genome Project{"} were analysed for the presence of SNPs. Available BAC and BAC-end sequences and their naming and mapping information, all obtained from SangerInstitute FTP site, served as a rough assembly of a reference genome. In 1.2 Gb of pig genome sequence, we identified 98,151 SNPs in which one of the sequences in the alignment represented the polymorphism and 6,374 SNPs in which two sequences represent an identical polymorphism. To benchmark the SNP identification method, 163 SNPs, in which the polymorphism was represented twice in the sequence alignment, were selected and tested on a panel of three purebred boar lines and wild boar. Of these 163 in silico identified SNPs, 134 were shown to be polymorphic in our animal panel. Conclusion - This SNP identification method, which mines for SNPs in publicly available porcine shotgun sequences repositories, provides thousands of high quality SNPs. Benchmarking in an animal panel showed that more than 80{\%} of the predicted SNPs represented true genetic variation.",
keywords = "multiple-sclerosis, snps, identification, association, consortium, diversity, discovery, variants, linkage, region",
author = "H.H.D. Kerstens and S. Kollers and A. Kommandath and {del Rosario}, M. and B.W. Dibbits and S.M. Kinders and R.P.M.A. Crooijmans and M.A.M. Groenen",
year = "2009",
doi = "10.1186/1471-2164-10-4",
language = "English",
volume = "10",
journal = "BMC Genomics",
issn = "1471-2164",
publisher = "Springer Verlag",

}

Mining for Single Nucleotide Polymorphisms in Pig genome sequence data. / Kerstens, H.H.D.; Kollers, S.; Kommandath, A.; del Rosario, M.; Dibbits, B.W.; Kinders, S.M.; Crooijmans, R.P.M.A.; Groenen, M.A.M.

In: BMC Genomics, Vol. 10, 4, 2009.

TY - JOUR

T1 - Mining for Single Nucleotide Polymorphisms in Pig genome sequence data

AU - Kerstens, H.H.D.

AU - Kollers, S.

AU - Kommandath, A.

AU - del Rosario, M.

AU - Dibbits, B.W.

AU - Kinders, S.M.

AU - Crooijmans, R.P.M.A.

AU - Groenen, M.A.M.

PY - 2009

Y1 - 2009

AB - Background - Single nucleotide polymorphisms (SNPs) are ideal genetic markers due to their high abundance and the highly automated way in which SNPs are detected and SNP assays are performed. The number of SNPs identified in the pig thus far is still limited. Results - A total of 4.8 million whole genome shotgun sequences obtained from the NCBI trace-repository with center name "SDJVP", and project name "Sino-Danish Pig Genome Project" were analysed for the presence of SNPs. Available BAC and BAC-end sequences and their naming and mapping information, all obtained from SangerInstitute FTP site, served as a rough assembly of a reference genome. In 1.2 Gb of pig genome sequence, we identified 98,151 SNPs in which one of the sequences in the alignment represented the polymorphism and 6,374 SNPs in which two sequences represent an identical polymorphism. To benchmark the SNP identification method, 163 SNPs, in which the polymorphism was represented twice in the sequence alignment, were selected and tested on a panel of three purebred boar lines and wild boar. Of these 163 in silico identified SNPs, 134 were shown to be polymorphic in our animal panel. Conclusion - This SNP identification method, which mines for SNPs in publicly available porcine shotgun sequences repositories, provides thousands of high quality SNPs. Benchmarking in an animal panel showed that more than 80% of the predicted SNPs represented true genetic variation.

KW - multiple-sclerosis

KW - snps

KW - identification

KW - association

KW - consortium

KW - diversity

KW - discovery

KW - variants

KW - linkage

KW - region

U2 - 10.1186/1471-2164-10-4

DO - 10.1186/1471-2164-10-4

M3 - Article

VL - 10

JO - BMC Genomics

JF - BMC Genomics

SN - 1471-2164

M1 - 4

ER -

Kerstens HHD, Kollers S, Kommandath A, del Rosario M, Dibbits BW, Kinders SM et al. Mining for Single Nucleotide Polymorphisms in Pig genome sequence data. BMC Genomics. 2009;10:4. https://doi.org/10.1186/1471-2164-10-4