Rarity in large data sets: Singletons, modal values and the location of the species abundance distribution

G. Straatsma, S. Egli

Research output: Contribution to journal, Article, Academic, peer-review

6 Citations (Scopus)


Species abundance data in 12 large data sets, holding 10 × 10³ to 125 × 10⁶ individuals in 350 to 10 × 10³ samples, were studied. Samples and subsets, for instance the summarized data of samples over years, and whole sets were analysed. Two methods of the binning of data, assigning abundance values to classes for histograms, have been applied in the past: bins of equal size and bins of exponentially increasing size (‘octaves’). A hump in a histogram with exponential bins does not represent a mode of primary, non-transformed abundance values, but of log transformed abundance values. A proper interpretation of the hump is given. Moreover, the extrapolation to the left of a histogram with exponential bins, below an abundance of unity, lifting a ‘veil’, hiding species present in the community but absent from the sample, is rejected. The literature is confusing at these points and, as a result, prevents a proper view on the species abundance distribution. Applying bins of equal size, modal values equalled or approached unity. The number of singletons increased with sample size in some data sets but decreased in others. However, singletons remain present in large samples, subsets or sets, in agreement with the results on modal values. The relatively high number of singletons in small samples is no artefact of undersampling. The mode at unity, that is at the left end of the species abundance distribution, independent of scale (sample, subset or set), is an important statistical property of the species abundance distribution. Our results may have implications for theory development in community ecology: the selection and/or development of an accurate species abundance model, and, connected to this, the formulation of improved assembly rules, and the selection and/or development of more precise species richness estimators.
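The contrast between the two binning methods can be illustrated with a minimal Python sketch. Everything in it is an assumption made for illustration and is not taken from the article: the abundance vector is invented, the function names are hypothetical, and the octave convention used (octave k covers abundances 2^k to 2^(k+1) - 1) is only one of several conventions found in the literature.

```python
from collections import Counter


def equal_bins(abundances):
    """Count species per abundance value (bins of equal size, width 1)."""
    return Counter(abundances)


def octave_bins(abundances):
    """Count species per octave k, where octave k covers 2**k .. 2**(k+1) - 1.

    n.bit_length() - 1 equals floor(log2(n)) for positive integers.
    This octave convention is an assumption of the sketch.
    """
    return Counter(n.bit_length() - 1 for n in abundances)


def mode(hist):
    """Return the bin with the most species (smallest bin wins ties)."""
    top = max(hist.values())
    return min(k for k, v in hist.items() if v == top)


# Invented abundances (individuals per species), for illustration only.
abundances = [1, 1, 1, 1, 1, 2, 2, 2, 2, 3, 3, 3, 4, 4, 4,
              5, 5, 6, 6, 7, 9, 12, 15, 20, 33, 60, 130]

linear = equal_bins(abundances)
octaves = octave_bins(abundances)

singletons = linear[1]        # species represented by one individual
linear_mode = mode(linear)    # mode of the untransformed abundances
octave_mode = mode(octaves)   # hump of the log-transformed abundances

print("singletons:", singletons)
print("mode with equal bins:", linear_mode)
print("hump octave:", octave_mode,
      "covering abundances", 2 ** octave_mode, "to", 2 ** (octave_mode + 1) - 1)
```

With these invented numbers the equal-size histogram peaks at an abundance of one, while the octave histogram shows a hump in an interior class. The hump arises because wider classes pool more abundance values, which is why it describes the log transformed values rather than the primary ones.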
Original language: English
Pages (from-to): 380-389
Journal: Basic and Applied Ecology
Issue number: 4
Publication status: Published - 2012


  • null hypothesis
  • long-term
  • richness
  • communities
  • dynamics
  • area
