### Abstract

Proportional data, in which response variables are expressed as percentages or fractions of a whole, are analysed in many subfields of ecology and evolution. The scale-independence of proportions makes them appropriate to analyse many biological phenomena, but statistical analyses are not straightforward, since proportions can only take values from zero to one and their variance is usually not constant across the range of the predictor. Transformations to overcome these problems are often applied, but can lead to biased estimates and difficulties in interpretation. In this paper, we provide an overview of the different types of proportional data and discuss the different analysis strategies available. In particular, we review and discuss the use of promising, but little used, techniques for analysing continuous (also called non-count-based or non-binomial) proportions (e.g. percent cover, fraction time spent on an activity): beta and Dirichlet regression, and some of their most important extensions. A major distinction can be made between proportions arising from counts and those arising from continuous measurements. For proportions consisting of two categories, count-based data are best analysed using well-developed techniques such as logistic regression, while continuous proportions can be analysed with beta regression models. In the case of >2 categories, multinomial logistic regression or Dirichlet regression can be applied. Both beta and Dirichlet regression techniques model proportions at their original scale, which makes statistical inference more straightforward and produce less biased estimates relative to transformation-based solutions. Extensions to beta regression, such as models for variable dispersion, zero-one augmented data and mixed effects designs have been developed and are reviewed and applied to case studies. Finally, we briefly discuss some issues regarding model fitting, inference, and reporting that are particularly relevant to beta and Dirichlet regression. Beta regression and Dirichlet regression overcome some problems inherent in applying classic statistical approaches to proportional data. To facilitate the adoption of these techniques by practitioners in ecology and evolution, we present detailed, annotated demonstration scripts covering all variations of beta and Dirichlet regression discussed in the article, implemented in the freely available language for statistical computing, r.

Original language | English |
---|---|

Pages (from-to) | 1412-1430 |

Number of pages | 19 |

Journal | Methods in Ecology and Evolution |

Volume | 10 |

Issue number | 9 |

Early online date | 6 Jun 2019 |

DOIs | |

Publication status | Published - Sep 2019 |

### Fingerprint

### Keywords

- beta regression
- Dirichlet regression
- fractions
- non-count proportions
- one augmented
- proportions
- transformations
- zero augmented

### Cite this

}

*Methods in Ecology and Evolution*, vol. 10, no. 9, pp. 1412-1430. https://doi.org/10.1111/2041-210X.13234

**Analysing continuous proportions in ecology and evolution: A practical introduction to beta and Dirichlet regression.** / Douma, Jacob C.; Weedon, James T.

Research output: Contribution to journal › Article › Academic › peer-review

TY - JOUR

T1 - Analysing continuous proportions in ecology and evolution: A practical introduction to beta and Dirichlet regression

AU - Douma, Jacob C.

AU - Weedon, James T.

PY - 2019/9

Y1 - 2019/9

N2 - Proportional data, in which response variables are expressed as percentages or fractions of a whole, are analysed in many subfields of ecology and evolution. The scale-independence of proportions makes them appropriate to analyse many biological phenomena, but statistical analyses are not straightforward, since proportions can only take values from zero to one and their variance is usually not constant across the range of the predictor. Transformations to overcome these problems are often applied, but can lead to biased estimates and difficulties in interpretation. In this paper, we provide an overview of the different types of proportional data and discuss the different analysis strategies available. In particular, we review and discuss the use of promising, but little used, techniques for analysing continuous (also called non-count-based or non-binomial) proportions (e.g. percent cover, fraction time spent on an activity): beta and Dirichlet regression, and some of their most important extensions. A major distinction can be made between proportions arising from counts and those arising from continuous measurements. For proportions consisting of two categories, count-based data are best analysed using well-developed techniques such as logistic regression, while continuous proportions can be analysed with beta regression models. In the case of >2 categories, multinomial logistic regression or Dirichlet regression can be applied. Both beta and Dirichlet regression techniques model proportions at their original scale, which makes statistical inference more straightforward and produce less biased estimates relative to transformation-based solutions. Extensions to beta regression, such as models for variable dispersion, zero-one augmented data and mixed effects designs have been developed and are reviewed and applied to case studies. Finally, we briefly discuss some issues regarding model fitting, inference, and reporting that are particularly relevant to beta and Dirichlet regression. Beta regression and Dirichlet regression overcome some problems inherent in applying classic statistical approaches to proportional data. To facilitate the adoption of these techniques by practitioners in ecology and evolution, we present detailed, annotated demonstration scripts covering all variations of beta and Dirichlet regression discussed in the article, implemented in the freely available language for statistical computing, r.

AB - Proportional data, in which response variables are expressed as percentages or fractions of a whole, are analysed in many subfields of ecology and evolution. The scale-independence of proportions makes them appropriate to analyse many biological phenomena, but statistical analyses are not straightforward, since proportions can only take values from zero to one and their variance is usually not constant across the range of the predictor. Transformations to overcome these problems are often applied, but can lead to biased estimates and difficulties in interpretation. In this paper, we provide an overview of the different types of proportional data and discuss the different analysis strategies available. In particular, we review and discuss the use of promising, but little used, techniques for analysing continuous (also called non-count-based or non-binomial) proportions (e.g. percent cover, fraction time spent on an activity): beta and Dirichlet regression, and some of their most important extensions. A major distinction can be made between proportions arising from counts and those arising from continuous measurements. For proportions consisting of two categories, count-based data are best analysed using well-developed techniques such as logistic regression, while continuous proportions can be analysed with beta regression models. In the case of >2 categories, multinomial logistic regression or Dirichlet regression can be applied. Both beta and Dirichlet regression techniques model proportions at their original scale, which makes statistical inference more straightforward and produce less biased estimates relative to transformation-based solutions. Extensions to beta regression, such as models for variable dispersion, zero-one augmented data and mixed effects designs have been developed and are reviewed and applied to case studies. Finally, we briefly discuss some issues regarding model fitting, inference, and reporting that are particularly relevant to beta and Dirichlet regression. Beta regression and Dirichlet regression overcome some problems inherent in applying classic statistical approaches to proportional data. To facilitate the adoption of these techniques by practitioners in ecology and evolution, we present detailed, annotated demonstration scripts covering all variations of beta and Dirichlet regression discussed in the article, implemented in the freely available language for statistical computing, r.

KW - beta regression

KW - Dirichlet regression

KW - fractions

KW - non-count proportions

KW - one augmented

KW - proportions

KW - transformations

KW - zero augmented

U2 - 10.1111/2041-210X.13234

DO - 10.1111/2041-210X.13234

M3 - Article

VL - 10

SP - 1412

EP - 1430

JO - Methods in Ecology and Evolution

JF - Methods in Ecology and Evolution

SN - 2041-210X

IS - 9

ER -