Mixing process-based and data-driven approaches in yield prediction

Bernardo Maestrini*, Gordan Mimić, Pepijn A.J. van Oort, Keiji Jindo, Sanja Brdar, Frits K. van Evert, Ioannis Athanasiadis

*Corresponding author for this work

Research output: Contribution to journal › Article › Academic › peer-review

13 Citations (Scopus)


Yield prediction models can be divided into data-driven models and process-based models (crop growth models). The first category contains many different types of models whose parameters are learned from the data themselves and in which domain knowledge is used only to select predictors and engineer features. In the second category, models are based on biophysical principles, and their structure and parameters are derived primarily from domain knowledge. Here we investigate whether integrating the two approaches can be beneficial, because it may overcome the limitations of each approach taken individually: the lack of sufficiently large, reliable and orthogonal datasets for data-driven approaches, and the need for many inputs for process-based models. We reviewed applications of the two categories of models, paying special attention to cases where the two approaches have been mixed. By analysing the literature we identified three major cases of integration between the two approaches: (1) using crop growth models to engineer features and expand the predictor space, (2) using data-driven approaches to estimate missing inputs for process-based models, and (3) using data-driven approaches to produce metamodels that reduce the computational burden. Finally, we propose a methodology based on metamodels and transfer learning to integrate data-driven and process-based approaches.
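Integration case (3) can be illustrated with a minimal sketch. It is not the methodology proposed in the article: `toy_crop_model` is a hypothetical, heavily simplified stand-in for a process-based crop growth model, every coefficient in it is an assumed value chosen for illustration, and the quadratic least-squares fit is just one simple choice of data-driven metamodel. The idea is to run the (normally expensive) simulator on sampled inputs and then fit a cheap surrogate to its outputs.

```python
import numpy as np

def toy_crop_model(rain_mm, temp_c, days=120):
    """Hypothetical stand-in for a process-based model (all constants assumed)."""
    biomass = 0.0            # t/ha
    soil_water = 50.0        # mm, assumed initial soil water store
    daily_rain = rain_mm / days
    # Assumed temperature response with an optimum at 22 degrees C.
    temp_factor = max(0.0, 1.0 - ((temp_c - 22.0) / 15.0) ** 2)
    for _ in range(days):
        soil_water = min(soil_water + daily_rain, 150.0)   # excess is lost
        water_factor = min(1.0, soil_water / 60.0)         # water stress, 0..1
        biomass += 0.2 * temp_factor * water_factor        # daily growth
        soil_water -= 4.0 * temp_factor * water_factor     # daily water uptake
    return 0.45 * biomass    # assumed harvest index

def quadratic_features(rain_mm, temp_c):
    """Rescaled inputs expanded to a full quadratic basis (6 columns)."""
    r = np.asarray(rain_mm, dtype=float) / 100.0
    t = np.asarray(temp_c, dtype=float) / 10.0
    return np.column_stack([np.ones_like(r), r, t, r * t, r**2, t**2])

# 1. Sample the input space and run the simulator on every sample.
rng = np.random.default_rng(0)
rain = rng.uniform(100.0, 600.0, size=500)   # seasonal rainfall, mm
temp = rng.uniform(12.0, 32.0, size=500)     # mean temperature, degrees C
yields = np.array([toy_crop_model(r, t) for r, t in zip(rain, temp)])

# 2. Fit the metamodel on 400 runs and hold out the remaining 100.
train, test = slice(0, 400), slice(400, 500)
coef, *_ = np.linalg.lstsq(
    quadratic_features(rain[train], temp[train]), yields[train], rcond=None
)

def surrogate(rain_mm, temp_c):
    """Cheap approximation of toy_crop_model: one matrix product, no daily loop."""
    return quadratic_features(rain_mm, temp_c) @ coef

# 3. Check how well the surrogate reproduces unseen simulator runs.
pred = surrogate(rain[test], temp[test])
ss_res = np.sum((yields[test] - pred) ** 2)
ss_tot = np.sum((yields[test] - yields[test].mean()) ** 2)
r2 = 1.0 - ss_res / ss_tot
print(f"hold-out R^2 of surrogate vs. simulator: {r2:.3f}")
```

In a transfer-learning setting, a surrogate trained on simulated data in this way could in principle be fine-tuned on a smaller set of observed yields; that step is not shown here.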

Original language: English
Article number: 126569
Journal: European Journal of Agronomy
Publication status: Published - Sept 2022


  • Artificial intelligence
  • Crop growth models
  • Crop models
  • Data-driven
  • Dynamic crop growth models
  • Metamodels
  • Neural networks
  • Process-based
  • Surrogate models
  • Yield prediction

