Models to relate species to environment: a hierarchical statistical approach

T. Jamil

Research output: Thesis, internal PhD, WU

Abstract

In the last two decades, the interest of community ecologists in trait-based approaches has grown dramatically, and these approaches have been increasingly applied to explain and predict the response of species to environmental conditions. A variety of modelling techniques are available. The dominant technique is to cluster the species based on their functional traits and then summarize the response of the clusters to environmental change. In general, fitting explicit models to data is more informative and powerful than more informal approaches. The central theme of the thesis is how to quantify the relation of traits with the environment using three data tables: data on species occurrence and abundance in sites, data on traits of species, and data on the environmental characteristics of sites. In this thesis, we place the challenge of quantifying trait-environment relationships in the context of species distribution modelling, that is, in the context of species-environment relationships. We present a hierarchical statistical approach to species distribution modelling that efficiently utilizes the trait information and is able to automatically select the relevant traits and environmental characteristics. This model-based approach, coupled with recent statistical developments and increased computing power, opens up possibilities that were unimaginable before.

In the present study, a hierarchical statistical approach is introduced for modelling and explaining species response along environmental gradients by species traits. The model is an extension of the generalized linear model with random terms that express the between-species variation in response to the environment. This so-called generalized linear mixed model (GLMM) is derived by integrating a two-step procedure into one. As the basic GLMM we take the random intercept and random slope model. To introduce traits, the regression parameters (intercept and slope) are made linearly dependent on the species traits. As a consequence, the trait-environment relationship is represented as an interaction term in the model. The method is illustrated on the well-known Dune Meadow Data, with Ellenberg indicator values as species traits.
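The step from the two-step procedure to a single model can be written out as a short worked equation. This is a sketch under assumed notation that does not come from the thesis: $\mu_{ij}$ is the expected response of species $j$ at site $i$, $g$ is the link function, $x_i$ is one environmental variable, $z_j$ is one species trait, and the $\varepsilon$ terms are species-level random effects.

```latex
% Step 1: species-specific regression on the environment
g(\mu_{ij}) = \alpha_j + \beta_j x_i

% Step 2: intercept and slope depend linearly on the trait
\alpha_j = a_0 + a_1 z_j + \varepsilon_{\alpha j}, \qquad
\beta_j  = b_0 + b_1 z_j + \varepsilon_{\beta j}

% Substituting step 2 into step 1 gives one mixed model
g(\mu_{ij}) = a_0 + a_1 z_j + b_0 x_i + b_1 z_j x_i
              + \varepsilon_{\alpha j} + \varepsilon_{\beta j} x_i
```

In this form the coefficient $b_1$ of the product $z_j x_i$ is the trait-environment interaction, while $\varepsilon_{\alpha j}$ and $\varepsilon_{\beta j}$ are the random intercept and random slope.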

Niche theory proclaims that species response to environmental gradients is nonlinear. Each species has a preferred environmental condition in which it can survive and reproduce optimally. Thus each species tends to be most abundant around a specific environmental optimum, and the distribution of species along any environmental gradient is usually unimodal, with the maximum at some ecological optimum. For presence-absence data, the simplest unimodal (non-negative) species response curve is the Gaussian logistic response curve with three parameters that characterize the niche: optimum (niche centre), tolerance (niche width) and maximum (expected occurrence at the centre). Niches differ between species, and species are assumed to be evolutionarily adapted. It is difficult to fit the Gaussian logistic model with linear trait submodels for the parameters with the available (generalized) nonlinear mixed model software.
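As an illustration of the three niche parameters, the Gaussian logistic response curve can be sketched in a few lines of Python. The parameterization below (a quadratic on the logit scale whose peak equals the logit of the maximum) is a common textbook form and is assumed here; the function name and argument names are hypothetical and not taken from the thesis.

```python
import math


def gaussian_logistic(x, optimum, tolerance, maximum):
    """Occurrence probability at environmental value x.

    Assumed form: logit(p) = logit(maximum) - (x - optimum)^2 / (2 * tolerance^2),
    so p equals `maximum` at the optimum and declines symmetrically around it.
    """
    if tolerance <= 0.0:
        raise ValueError("tolerance must be positive")
    if not 0.0 < maximum < 1.0:
        raise ValueError("maximum must lie strictly between 0 and 1")

    peak_logit = math.log(maximum / (1.0 - maximum))
    eta = peak_logit - (x - optimum) ** 2 / (2.0 * tolerance ** 2)

    # Numerically stable inverse logit (avoids overflow for very negative eta).
    if eta >= 0.0:
        return 1.0 / (1.0 + math.exp(-eta))
    z = math.exp(eta)
    return z / (1.0 + z)


if __name__ == "__main__":
    # Hypothetical species with optimum 5, tolerance 1 and maximum 0.8.
    for x in (3.0, 4.0, 5.0, 6.0, 7.0):
        print(x, round(gaussian_logistic(x, 5.0, 1.0, 0.8), 3))
```

Evaluating the function on a grid of environmental values traces the bell-shaped curve on the probability scale; widening `tolerance` flattens it, and shifting `optimum` moves the peak along the gradient.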

We develop the trait-modulated Gaussian logistic model, in which the niche parameters are made linearly dependent on species traits. The model is fitted to data in the Bayesian framework using OpenBUGS (Bayesian inference Using Gibbs Sampling). A Bayesian variable selection method is used to identify which species traits and environmental variables best explain the species data through this model. We extend the approach to find the best linear combination of environmental variables.

We explain why and when (generalized) linear mixed models can effectively analyse unimodal data, and present a graphical tool and a statistical test for unimodality that require fitting only a generalized linear mixed model, without any squared or other polynomial term. A GLMM is, of course, a linear model. Despite this fact, it can be used to detect unimodality and to fit unimodal data, with the proviso that the differences in niche widths among species are not too large. As a graphical tool we suggest plotting the random site effects against the environmental variable. A quadratic relationship in this graph indicates unimodality. The efficacy of the GLMM for analysing unimodal data is illustrated by comparing the GLMM approach with an explicit unimodal model approach on simulated data and on real data that show unimodality.
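One way to see why a linear mixed model can absorb unimodal responses is to expand the Gaussian logistic curve under the simplifying assumption that all species share the same tolerance $t$. The notation is assumed for this sketch and is not taken from the thesis: $p_{ij}$ is the occurrence probability of species $j$ at site $i$, $x_i$ the environmental value, $u_j$ the optimum and $a_j$ the logit of the maximum.

```latex
\operatorname{logit}(p_{ij})
  = a_j - \frac{(x_i - u_j)^2}{2t^2}
  = \underbrace{\left(a_j - \frac{u_j^2}{2t^2}\right)}_{\text{species intercept}}
  + \underbrace{\frac{u_j}{t^2}}_{\text{species slope}} x_i
  \;-\; \underbrace{\frac{x_i^2}{2t^2}}_{\text{site effect}}
```

The first two terms are a species-specific intercept and slope, as in a random intercept and random slope model. The last term depends on the site only and is quadratic in $x_i$, which is why a plot of the estimated random site effects against the environmental variable shows a quadratic pattern when the data are unimodal. When tolerances differ strongly among species, the quadratic term is no longer a pure site effect and the approximation degrades.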

When a system is described by a statistical model, model complexity leads to very long computing times and poor estimation, especially if the number of predictors is large relative to the data size. As an alternative to and improvement over stepwise methods, shrinkage methods have been proposed. One of these is the relevance vector machine (RVM). The RVM assigns individual precisions to the weights of the predictors, and these precisions are then estimated by maximizing the marginal likelihood (Type-II ML or empirical Bayes). We investigated the selection properties of the RVM both analytically and by experiments. We found that the RVM is rather tolerant in allowing predictors to stay in the model, and concluded that it is not a real solution for high-dimensional data problems.
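For reference, the standard regression form of the RVM can be summarized as follows. This is the usual textbook formulation, given here as a sketch; the thesis may use a different variant or notation. Here $y$ is the response vector, $X$ the predictor matrix, $w$ the weights, $\alpha_k$ the precision of weight $w_k$ and $\sigma^2$ the noise variance.

```latex
y = Xw + \varepsilon, \qquad \varepsilon \sim \mathcal{N}(0, \sigma^2 I), \qquad
w_k \sim \mathcal{N}(0, \alpha_k^{-1})

% Marginal likelihood after integrating out the weights, with A = diag(alpha_1, ..., alpha_K)
p(y \mid \alpha, \sigma^2) = \mathcal{N}\!\left(y \mid 0,\; \sigma^2 I + X A^{-1} X^{\top}\right)

% Type-II maximum likelihood (empirical Bayes)
(\hat{\alpha}, \hat{\sigma}^2) = \arg\max_{\alpha,\, \sigma^2} \; p(y \mid \alpha, \sigma^2)
```

A predictor is effectively removed from the model when its estimated precision $\hat{\alpha}_k$ diverges, because the corresponding weight is then shrunk to zero.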

In further work, we developed a model selection method for multiple traits and multiple environmental variables that builds on our previous study in a linear mixed model context. The method is called tiered forward selection. In the first tier, the random factors are selected; in the second, the fixed effects are selected; and in the final tier, non-significant terms are removed based on a modified Akaike information criterion. The linear mixed model with tiered forward selection is compared with Type-II ML and with existing methods for detecting trait-environment relationships that are not based on mixed models, namely the fourth corner method and the linear trait-environment method (LTE).

Original language: English
Qualification: Doctor of Philosophy
Awarding Institution
  • Wageningen University
Supervisors/Advisors
  • ter Braak, Cajo, Promotor
Award date: 11 Jan 2012
Place of Publication: S.l.
Publisher
Print ISBNs: 9789461731395
Publication status: Published - 2012

Keywords

  • statistics
  • linear models
  • interactions
  • traits
  • bayesian theory
  • plant ecology
  • biostatistics
