Multivariate random forest for digital soil mapping

Stephan van der Westhuizen*, Gerard B.M. Heuvelink, David P. Hofmeyr

*Corresponding author for this work

Research output: Contribution to journal › Article › Academic › peer-review



In digital soil mapping (DSM), soil maps are usually produced in a univariate manner: each soil property is mapped independently, so when multiple soil properties are mapped, the underlying dependence structure between them is ignored. This may lead to inconsistent predictions and simulations. For example, soil organic carbon (SOC) and total nitrogen (TN) maps produced independently may show unrealistic carbon–nitrogen (C:N) ratios. In the last decade, the production of soil maps with machine learning models has become increasingly popular, as these models are able to capture complex non-linear relationships between soil properties and environmental covariates. However, multivariate machine learning models have rarely been used to produce soil maps, and their use in DSM requires further investigation. In this paper we present the combined modelling of multiple soil properties with a multivariate random forest (MRF) model. We applied this model to mapping SOC and TN, and compared its results with those of two separate univariate random forest (RF) models. The comparison was based on stochastic simulations obtained by sampling from the conditional distributions of the soil properties, given the covariates, as estimated by quantile regression forest. The results show that the MRF model is superior in maintaining the dependence structure between SOC and TN and, consequently, produces more realistic C:N ratios. The models were also compared on prediction accuracy using common accuracy metrics such as the root mean square error (RMSE). We found that the accuracy of the MRF model (RMSE-SOC=40.04, RMSE-TN=2.26, RMSE-CN=3.58) is comparable to that of the univariate RF models (RMSE-SOC=39.76, RMSE-TN=2.26, RMSE-CN=3.65). We performed the same comparisons between a regression co-kriging model and two separate regression kriging models, and reached similar conclusions.
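The reasoning behind the comparison can be illustrated with a toy example. The following is a minimal sketch, not the authors' MRF or quantile regression forest: it uses purely synthetic (SOC, TN) pairs and plain resampling to contrast drawing the two properties together with drawing them separately. All names, the SOC range and the C:N ratio near 12 are assumptions made only for illustration.

```python
import math
import random
import statistics


def rmse(observed, predicted):
    """Root mean square error between two equal-length sequences."""
    if len(observed) != len(predicted):
        raise ValueError("sequences must have equal length")
    squared = [(o - p) ** 2 for o, p in zip(observed, predicted)]
    return math.sqrt(sum(squared) / len(squared))


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    mx, my = statistics.fmean(x), statistics.fmean(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def make_synthetic_pairs(n, rng):
    """Synthetic (SOC, TN) pairs; TN is SOC divided by a C:N ratio near 12.

    These values are invented for illustration, not taken from the paper.
    """
    pairs = []
    for _ in range(n):
        soc = rng.uniform(5.0, 60.0)
        cn = rng.gauss(12.0, 1.0)
        pairs.append((soc, soc / cn))
    return pairs


def simulate_joint(pairs, n_sim, rng):
    """Draw SOC and TN together, so each draw keeps its own C:N ratio."""
    return [rng.choice(pairs) for _ in range(n_sim)]


def simulate_independent(pairs, n_sim, rng):
    """Draw SOC and TN separately, ignoring the dependence between them."""
    soc_values = [p[0] for p in pairs]
    tn_values = [p[1] for p in pairs]
    return [(rng.choice(soc_values), rng.choice(tn_values)) for _ in range(n_sim)]


def summarize(sims):
    """SOC-TN correlation and spread of the C:N ratio for simulated pairs."""
    soc = [s for s, _ in sims]
    tn = [t for _, t in sims]
    ratios = [s / t for s, t in sims]
    return {"corr": pearson(soc, tn), "cn_sd": statistics.stdev(ratios)}


rng = random.Random(42)  # fixed seed so the sketch is reproducible
pairs = make_synthetic_pairs(500, rng)
joint = summarize(simulate_joint(pairs, 2000, rng))
indep = summarize(simulate_independent(pairs, 2000, rng))
print("joint:", joint)
print("independent:", indep)
```

In this toy setting the joint draws keep the strong SOC-TN correlation and a narrow range of C:N ratios, whereas the separate draws lose the correlation and produce a much wider, less plausible spread of ratios. The `rmse` helper shows the accuracy metric named in the abstract and is not used to reproduce any reported value.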

Original language: English
Article number: 116365
Number of pages: 11
Publication status: Published - Mar 2023


  • C:N ratio
  • Digital soil mapping
  • Random forest
  • Regression co-kriging
  • Soil organic carbon
  • Stochastic simulation

