TY - ADVS
T1 - Testing environmental effects on taxonomic composition with canonical correspondence analysis: alternative permutation tests are not equal
AU - ter Braak, C.J.F.
PY - 2022/9/6
Y1 - 2022/9/6
N2 - R-code of the paper "Testing environmental effects on taxonomic composition with canonical correspondence analysis: alternative permutation tests are not equal", with abstract: After applying canonical correspondence analysis to metagenomics data with hugely different library sizes (site totals), it became evident that Canoco and the R-packages ade4 and vegan can yield (at least up to 2022) very different P-values in statistical tests of the relationship between taxonomic composition (species composition) and predictors (environmental variables and/or treatments). The reason is that vegan and Canoco up to version 5.12 apply residualized response permutation (but ignore the model intercept), whereas ade4 applies predictor permutation. Predictor permutation, when extended to residualized predictor permutation, is applicable in partial constrained ordination. This paper shows by simulation that residualized response permutation can yield a very inflated Type I error rate if the abundance data are both overdispersed and highly variable in site total. In contrast, residualized predictor permutation controlled the Type I error rate and had good power, also when the predictors were skewed or binary. After square-root or log transformation of the abundance data, the differences between the permutation methods became small. Residualized predictor permutation is recommended, particularly in testing trait-environment relationships using double constrained correspondence analysis, because this method also critically depends on the species totals, which are generally highly variable.
AB - R-code of the paper "Testing environmental effects on taxonomic composition with canonical correspondence analysis: alternative permutation tests are not equal", with abstract: After applying canonical correspondence analysis to metagenomics data with hugely different library sizes (site totals), it became evident that Canoco and the R-packages ade4 and vegan can yield (at least up to 2022) very different P-values in statistical tests of the relationship between taxonomic composition (species composition) and predictors (environmental variables and/or treatments). The reason is that vegan and Canoco up to version 5.12 apply residualized response permutation (but ignore the model intercept), whereas ade4 applies predictor permutation. Predictor permutation, when extended to residualized predictor permutation, is applicable in partial constrained ordination. This paper shows by simulation that residualized response permutation can yield a very inflated Type I error rate if the abundance data are both overdispersed and highly variable in site total. In contrast, residualized predictor permutation controlled the Type I error rate and had good power, also when the predictors were skewed or binary. After square-root or log transformation of the abundance data, the differences between the permutation methods became small. Residualized predictor permutation is recommended, particularly in testing trait-environment relationships using double constrained correspondence analysis, because this method also critically depends on the species totals, which are generally highly variable.
KW - canonical correspondence analysis
KW - permutation testing
KW - residualized predictor permutation
KW - residualized response permutation
KW - constrained ordination
KW - ecology
KW - applied statistics
KW - biostatistics
KW - statistics
U2 - 10.6084/m9.figshare.15016008
DO - 10.6084/m9.figshare.15016008
M3 - Software
PB - Wageningen University & Research
ER -