Data mining of public SNP databases for the selection of intragenic SNPs

J. Aerts, Y. Wetzels, N. Cohen, J. Aerssens

Research output: Contribution to journal › Article › Academic › peer-review

25 Citations (Scopus)


Different strategies to search public single nucleotide polymorphism (SNP) databases for intragenic SNPs were evaluated. First, we assembled a strategy to annotate SNPs onto candidate genes based on a BLAST search of public SNP databases (Intragenic SNP Annotation by BLAST, ISAB). Only BLAST hits that met stringent criteria for 1) percentage identity (minimum 98%), 2) BLAST hit length (the hit covers at least 98% of the length of the SNP entry in the database, or the hit is longer than 250 base pairs), and 3) location in non-repetitive DNA were considered valid SNPs. We assessed the intragenic context and redundancy of these SNPs, and demonstrated that the SNP contents of the dbSNP and HGBASE/HGVbase databases are highly complementary but also overlap significantly. Second, we assessed the validity of the intragenic SNP annotation available on the dbSNP and HGVbase websites by comparing it with the results of the ISAB strategy. Only a minority of all annotated SNPs was found in common between the respective public SNP database websites and the ISAB annotation strategy. A detailed analysis was performed to explain this discrepancy. In conclusion, we recommend applying an independent strategy (such as ISAB) to annotate intragenic SNPs, complementary to the annotation provided at the dbSNP and HGVbase websites. Such an approach might be useful in selecting intragenic SNPs for genotyping in genetic studies.
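The three hit-acceptance criteria named in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the `BlastHit` record, its field names, and the `is_valid_hit` function are hypothetical, and the sketch assumes the percentage identity, aligned hit length, SNP entry length, and a repeat flag have already been extracted from the BLAST output and a repeat-masking step.

```python
from dataclasses import dataclass

# Thresholds taken from the abstract.
MIN_IDENTITY_PCT = 98.0   # minimum percentage identity
MIN_COVERAGE_PCT = 98     # hit must cover >= 98% of the SNP entry length
LONG_HIT_BP = 250         # ...or be longer than 250 base pairs


@dataclass
class BlastHit:
    """Hypothetical record for one BLAST hit of a SNP entry on a gene."""
    snp_id: str
    percent_identity: float  # 0-100
    hit_length: int          # aligned length in base pairs
    snp_entry_length: int    # length of the SNP entry in the database
    in_repeat: bool          # True if the hit lies in repetitive DNA


def is_valid_hit(hit: BlastHit) -> bool:
    """Apply the three criteria: identity, hit length, non-repetitive DNA."""
    identity_ok = hit.percent_identity >= MIN_IDENTITY_PCT
    # Integer arithmetic avoids floating-point edge cases at exactly 98%.
    covers_entry = hit.hit_length * 100 >= MIN_COVERAGE_PCT * hit.snp_entry_length
    long_enough = hit.hit_length > LONG_HIT_BP
    return identity_ok and (covers_entry or long_enough) and not hit.in_repeat


# Made-up example hits, for illustration only.
hits = [
    BlastHit("snp_a", 99.0, 100, 101, False),  # passes via coverage
    BlastHit("snp_b", 97.5, 101, 101, False),  # fails identity
    BlastHit("snp_c", 99.0, 300, 600, False),  # passes via length > 250 bp
    BlastHit("snp_d", 100.0, 101, 101, True),  # fails: repetitive DNA
]
valid_ids = [h.snp_id for h in hits if is_valid_hit(h)]
```

Note that the length criterion is a disjunction: a hit shorter than 98% of the SNP entry is still accepted if it exceeds 250 base pairs.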
Original language: English
Pages (from-to): 162-173
Journal: Human Mutation
Issue number: 3
Publication status: Published - 2002


  • single-nucleotide polymorphisms
  • human genes
  • identification

