NG-Tax, a highly accurate and validated pipeline for analysis of 16S rRNA amplicons from complex biomes

Dataset

Description

Background: Massive high-throughput sequencing of short, hypervariable segments of the 16S ribosomal RNA (rRNA) gene is transforming the methodological landscape for describing microbial diversity within and across complex biomes. However, there is a strong need for standardization, as each new combination of experimental choices affects the results in different ways, restricting true meta-analyses.

Results: Here we present NG-Tax, a pipeline for 16S rRNA gene amplicon sequence analysis that was validated with different mock communities, specifically designed to challenge issues regarding optimization of routinely used filtering parameters. By sequencing two tandem variable 16S rRNA gene regions, V4 and V5-V6, in three separate sequencing runs on Illumina's HiSeq2000 platform, the microbial composition of 49 independently amplified mock samples was characterized. This setup allowed for the evaluation of important sources of technical bias in taxonomic classification: 1) run-to-run sequencing variation, 2) PCR error, and 3) region/primer-specific amplification bias. Despite the short read length (~140 nt) and all technical biases, the average specificity of the taxonomic assignment for the phylotypes included in the mock communities was 96%. On average, 99.94% of the reads could be assigned to at least family level, while assignment to 'spurious genera' represented on average only 0.02% of the reads per sample. Pearson correlations between obtained and expected compositions at genus level were as high as 0.94, and UniFrac distance-based PCoA plots confirmed that clustering was guided by biology rather than by the aforementioned technical factors.

Conclusions: NG-Tax demonstrated improved qualitative and quantitative representation of the true sample composition. The high robustness of the pipeline against technical biases associated with 16S rRNA gene amplicon sequencing studies will additionally improve comparability between studies and facilitate efforts towards standardization.
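The Pearson correlation between obtained and expected genus-level compositions mentioned in the Results can be illustrated with a minimal sketch. This is not NG-Tax code: the function names (`pearson`, `composition_correlation`) and the mock abundances are hypothetical, and the sketch assumes that a genus missing from one profile counts as zero relative abundance.

```python
import math


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length numeric sequences."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("need two sequences of equal length >= 2")
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for a constant sequence")
    return cov / math.sqrt(var_x * var_y)


def composition_correlation(expected, observed):
    """Correlate two genus -> relative-abundance profiles.

    Assumption: genera absent from one profile are treated as abundance 0.0,
    so spurious genera in the observed profile lower the correlation.
    """
    genera = sorted(set(expected) | set(observed))
    e = [expected.get(g, 0.0) for g in genera]
    o = [observed.get(g, 0.0) for g in genera]
    return pearson(e, o)


# Hypothetical mock community (illustrative values, not taken from the dataset).
expected = {
    "Bacteroides": 0.40,
    "Lactobacillus": 0.30,
    "Escherichia": 0.20,
    "Bifidobacterium": 0.10,
}
# Hypothetical observed profile, including one low-abundance spurious genus.
observed = {
    "Bacteroides": 0.42,
    "Lactobacillus": 0.27,
    "Escherichia": 0.22,
    "Bifidobacterium": 0.08,
    "Clostridium": 0.01,
}

r = composition_correlation(expected, observed)
print(f"Pearson r = {r:.3f}")
```

Taking the union of genera before correlating means that reads assigned to genera not present in the mock community are penalized instead of being silently dropped; other conventions (for example, restricting to expected genera only) would give different values.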
Date made available: 11 Nov 2015
Publisher: Wageningen University