A pipeline for high throughput detection and mapping of SNPs from EST databases

A.M. Anithakumari, Jifeng Tang, H.J. van Eck, R.G.F. Visser, J.A.M. Leunissen, B. Vosman, C.G. van der Linden

Research output: Contribution to journal › Article › Academic › peer-review

37 Citations (Scopus)

Abstract

Single nucleotide polymorphisms (SNPs) represent the most abundant type of genetic variation that can be used as molecular markers. The SNPs that are hidden in sequence databases can be unlocked using bioinformatic tools. For efficient application of these SNPs, the sequence set should be as error-free as possible, target single loci, and be suitable for the SNP scoring platform of choice. We have developed a pipeline to effectively mine SNPs from public EST databases with or without quality information using QualitySNP software, select reliable SNPs, and prepare the loci for analysis on the Illumina GoldenGate genotyping platform. The applicability of the pipeline was demonstrated using publicly available potato EST data, genotyping individuals from two diploid mapping populations and subsequently mapping the SNP markers (putative genes) in both populations. Over 7000 reliable SNPs were identified that met the criteria for genotyping on the GoldenGate platform. Of the 384 SNPs on the SNP array, approximately 12% dropped out. For the two potato mapping populations, 165 and 185 segregating SNP loci could be mapped on the respective genetic maps, illustrating the effectiveness of our pipeline for SNP selection and validation. Electronic supplementary material: The online version of this article (doi:10.1007/s11032-009-9377-5) contains supplementary material, which is available to authorized users.
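As a rough check on the figures quoted in the abstract, the short Python sketch below converts the approximate 12% dropout into assay counts and expresses the mapped loci as a fraction of the 384-SNP array. It is only an illustration: the dropout rate is approximate in the source, so the derived counts are estimates, and the labels `population_1` and `population_2` are placeholder names that do not come from the article.

```python
# Figures taken from the abstract; the dropout rate is stated only as
# "approximately 12%", so everything derived from it is an estimate.
ARRAY_SIZE = 384
DROPOUT_RATE = 0.12
# Placeholder population labels (assumption, not names used in the article).
MAPPED_LOCI = {"population_1": 165, "population_2": 185}


def summarize(array_size, dropout_rate, mapped_loci):
    """Estimate failed and working assays and the mapped fraction per population."""
    failed = round(array_size * dropout_rate)
    working = array_size - failed
    mapped_fraction = {
        name: count / array_size for name, count in mapped_loci.items()
    }
    return {"failed": failed, "working": working, "mapped_fraction": mapped_fraction}


summary = summarize(ARRAY_SIZE, DROPOUT_RATE, MAPPED_LOCI)
print(f"Estimated failed assays:  {summary['failed']}")
print(f"Estimated working assays: {summary['working']}")
for name, fraction in summary["mapped_fraction"].items():
    print(f"{name}: {fraction:.1%} of the array mapped")
```

Under these assumptions, roughly 46 assays would have dropped out, leaving about 338 working assays, and somewhat under half of the array could be placed on each genetic map.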
Original language: English
Pages (from-to): 65-75
Journal: Molecular Breeding
Volume: 26
Issue number: 1
DOI: 10.1007/s11032-009-9377-5
Publication status: Published - 2010

Fingerprint

Expressed Sequence Tags
single nucleotide polymorphism
Databases
genotyping
Solanum tuberosum
loci
potatoes
Population
genetic markers
Computational Biology
Diploidy
bioinformatics
electronics
Software

Keywords

  • single-nucleotide polymorphisms
  • map-based cloning
  • linkage maps
  • genome
  • markers
  • potato
  • discovery
  • construction
  • varieties
  • haplotype

Cite this

@article{0e56b88ba0544c31ad266f834eeb54cc,
title = "A pipeline for high throughput detection and mapping of SNPs from EST databases",
abstract = "Single nucleotide polymorphisms (SNPs) represent the most abundant type of genetic variation that can be used as molecular markers. The SNPs that are hidden in sequence databases can be unlocked using bioinformatic tools. For efficient application of these SNPs, the sequence set should be as error-free as possible, target single loci, and be suitable for the SNP scoring platform of choice. We have developed a pipeline to effectively mine SNPs from public EST databases with or without quality information using QualitySNP software, select reliable SNPs, and prepare the loci for analysis on the Illumina GoldenGate genotyping platform. The applicability of the pipeline was demonstrated using publicly available potato EST data, genotyping individuals from two diploid mapping populations and subsequently mapping the SNP markers (putative genes) in both populations. Over 7000 reliable SNPs were identified that met the criteria for genotyping on the GoldenGate platform. Of the 384 SNPs on the SNP array, approximately 12{\%} dropped out. For the two potato mapping populations, 165 and 185 segregating SNP loci could be mapped on the respective genetic maps, illustrating the effectiveness of our pipeline for SNP selection and validation. Electronic supplementary material: The online version of this article (doi:10.1007/s11032-009-9377-5) contains supplementary material, which is available to authorized users.",
keywords = "single-nucleotide polymorphisms, map-based cloning, linkage maps, genome, markers, potato, discovery, construction, varieties, haplotype",
author = "A.M. Anithakumari and Jifeng Tang and {van Eck}, H.J. and R.G.F. Visser and J.A.M. Leunissen and B. Vosman and {van der Linden}, C.G.",
note = "online DOI: 10.1007/s11032-009-9377-5",
year = "2010",
doi = "10.1007/s11032-009-9377-5",
language = "English",
volume = "26",
pages = "65--75",
journal = "Molecular Breeding",
issn = "1380-3743",
publisher = "Springer Verlag",
number = "1",

}

A pipeline for high throughput detection and mapping of SNPs from EST databases. / Anithakumari, A.M.; Tang, Jifeng; van Eck, H.J.; Visser, R.G.F.; Leunissen, J.A.M.; Vosman, B.; van der Linden, C.G.

In: Molecular Breeding, Vol. 26, No. 1, 2010, p. 65-75.

TY - JOUR

T1 - A pipeline for high throughput detection and mapping of SNPs from EST databases

AU - Anithakumari, A.M.

AU - Tang, Jifeng

AU - van Eck, H.J.

AU - Visser, R.G.F.

AU - Leunissen, J.A.M.

AU - Vosman, B.

AU - van der Linden, C.G.

N1 - online DOI: 10.1007/s11032-009-9377-5

PY - 2010

Y1 - 2010

AB - Single nucleotide polymorphisms (SNPs) represent the most abundant type of genetic variation that can be used as molecular markers. The SNPs that are hidden in sequence databases can be unlocked using bioinformatic tools. For efficient application of these SNPs, the sequence set should be as error-free as possible, target single loci, and be suitable for the SNP scoring platform of choice. We have developed a pipeline to effectively mine SNPs from public EST databases with or without quality information using QualitySNP software, select reliable SNPs, and prepare the loci for analysis on the Illumina GoldenGate genotyping platform. The applicability of the pipeline was demonstrated using publicly available potato EST data, genotyping individuals from two diploid mapping populations and subsequently mapping the SNP markers (putative genes) in both populations. Over 7000 reliable SNPs were identified that met the criteria for genotyping on the GoldenGate platform. Of the 384 SNPs on the SNP array, approximately 12% dropped out. For the two potato mapping populations, 165 and 185 segregating SNP loci could be mapped on the respective genetic maps, illustrating the effectiveness of our pipeline for SNP selection and validation. Electronic supplementary material: The online version of this article (doi:10.1007/s11032-009-9377-5) contains supplementary material, which is available to authorized users.

KW - single-nucleotide polymorphisms

KW - map-based cloning

KW - linkage maps

KW - genome

KW - markers

KW - potato

KW - discovery

KW - construction

KW - varieties

KW - haplotype

U2 - 10.1007/s11032-009-9377-5

DO - 10.1007/s11032-009-9377-5

M3 - Article

VL - 26

SP - 65

EP - 75

JO - Molecular Breeding

JF - Molecular Breeding

SN - 1380-3743

IS - 1

ER -