Multimodel ensembles improve predictions of crop–environment–management interactions

Daniel Wallach*, Pierre Martre, Bing Liu, Senthold Asseng, Frank Ewert, Peter J. Thorburn, Martin van Ittersum, Pramod K. Aggarwal, Mukhtar Ahmed, Bruno Basso, Christian Biernath, Davide Cammarano, Andrew J. Challinor, Giacomo De Sanctis, Benjamin Dumont, Ehsan Eyshi Rezaei, Elias Fereres, Glenn J. Fitzgerald, Y. Gao, Margarita Garcia-Vila, Sebastian Gayler, Christine Girousse, Gerrit Hoogenboom, Heidi Horan, Roberto C. Izaurralde, Curtis D. Jones, Belay T. Kassie, Christian C. Kersebaum, Christian Klein, Ann Kristin Koehler, Andrea Maiorano, Sara Minoli, Christoph Müller, Soora Naresh Kumar, Claas Nendel, Garry J. O'Leary, Taru Palosuo, Eckart Priesack, Dominique Ripoche, Reimund P. Rötter, Mikhail A. Semenov, Claudio Stöckle, Pierre Stratonovitch, Thilo Streck, Iwan Supit, Fulu Tao, Joost Wolf, Zhao Zhang

*Corresponding author for this work

Research output: Contribution to journal, Article, Academic, peer-review

111 Citations (Scopus)


A recent innovation in assessment of climate change impact on agricultural production has been to use crop multimodel ensembles (MMEs). These studies usually find large variability between individual models but that the ensemble mean (e-mean) and median (e-median) often seem to predict quite well. However, few studies have specifically been concerned with the predictive quality of those ensemble predictors. We ask what is the predictive quality of e-mean and e-median, and how does that depend on the ensemble characteristics. Our empirical results are based on five MME studies applied to wheat, using different data sets but the same 25 crop models. We show that the ensemble predictors have quite high skill and are better than most and sometimes all individual models for most groups of environments and most response variables. Mean squared error of e-mean decreases monotonically with the size of the ensemble if models are added at random, but has a minimum at usually 2–6 models if best-fit models are added first. Our theoretical results describe the ensemble using four parameters: average bias, model effect variance, environment effect variance, and interaction variance. We show analytically that mean squared error of prediction (MSEP) of e-mean will always be smaller than MSEP averaged over models and will be less than MSEP of the best model if squared bias is less than the interaction variance. If models are added to the ensemble at random, MSEP of e-mean will decrease as the inverse of ensemble size, with a minimum equal to squared bias plus interaction variance. This minimum value is not necessarily small, and so it is important to evaluate the predictive quality of e-mean for each target population of environments. These results provide new information on the advantages of ensemble predictors, but also show their limitations.
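The statement that the MSEP of e-mean cannot exceed the MSEP averaged over individual models can be checked numerically. The following is a minimal Python sketch on synthetic prediction errors, not the paper's data or code. The additive error structure (bias plus model, environment, and interaction terms), the parameter values, and all function names are assumptions made for illustration only.

```python
import random
import statistics


def simulate_errors(n_models, n_envs, bias, sd_model, sd_env, sd_inter, seed=0):
    """Synthetic prediction errors err[i][j] for model i in environment j.

    Assumed (hypothetical) structure: bias + model effect + environment
    effect + model-by-environment interaction, all Gaussian.
    """
    rng = random.Random(seed)
    model_eff = [rng.gauss(0.0, sd_model) for _ in range(n_models)]
    env_eff = [rng.gauss(0.0, sd_env) for _ in range(n_envs)]
    return [[bias + m + e + rng.gauss(0.0, sd_inter) for e in env_eff]
            for m in model_eff]


def mse(errors):
    """Mean squared error of a list of prediction errors."""
    return sum(e * e for e in errors) / len(errors)


def ensemble_mse(err, members, combine=statistics.fmean):
    """MSE of an ensemble predictor built from the models in `members`.

    The observation is the same for every model, so the error of e-mean
    (or e-median) is the mean (or median) of the individual errors.
    """
    n_envs = len(err[0])
    combined = [combine([err[i][j] for i in members]) for j in range(n_envs)]
    return mse(combined)


def mse_by_size(err, sizes, draws=200, seed=1):
    """Average e-mean MSE over random ensembles of each requested size."""
    rng = random.Random(seed)
    models = range(len(err))
    return {n: statistics.fmean([ensemble_mse(err, rng.sample(models, n))
                                 for _ in range(draws)])
            for n in sizes}


# 25 hypothetical models, 200 hypothetical environments.
err = simulate_errors(25, 200, bias=0.3, sd_model=0.5, sd_env=0.4, sd_inter=0.6)
all_models = list(range(len(err)))

mse_individual = [mse(row) for row in err]
avg_mse = statistics.fmean(mse_individual)
mse_emean = ensemble_mse(err, all_models)
mse_emedian = ensemble_mse(err, all_models, combine=statistics.median)
curve = mse_by_size(err, sizes=[1, 2, 5, 10, 25])

print("average individual MSE:", round(avg_mse, 3))
print("best individual MSE:   ", round(min(mse_individual), 3))
print("e-mean MSE:            ", round(mse_emean, 3))
print("e-median MSE:          ", round(mse_emedian, 3))
for n, value in curve.items():
    print(f"e-mean MSE, {n:2d} random models: {value:.3f}")
```

The inequality between e-mean and the model average holds for any set of errors, because the square of a mean never exceeds the mean of the squares. Whether e-mean also beats the best individual model depends on the simulated parameter values, so the sketch prints that comparison rather than assuming it.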

Original language: English
Pages (from-to): 5072-5083
Journal: Global Change Biology
Issue number: 11
Early online date: 28 Jul 2018
Publication status: Published - Nov 2018


  • climate change impact
  • crop models
  • ensemble mean
  • ensemble median
  • multimodel ensemble
  • prediction

