In large heterogeneous areas, the relationship between soil organic carbon (SOC) and environmental covariates may vary across the area, which makes it difficult to model regional SOC variation accurately. The benefit of local, geographically weighted regression (GWR) coefficients was tested in a case study on soil organic carbon mapping across a 50,810 km² area in northwestern China. This area is composed of an alpine ecosystem in the upper reaches and oases in the middle reaches. The benefit was quantified by comparing the quality of the maps obtained by GWR and geographically weighted ridge regression (GWRR) on the one hand with that of maps obtained by multiple linear regression (MLR) on the other. In these methods, spatial dependence of model residuals is ignored. The root mean squared error (RMSE) of predictions of natural log-transformed SOC obtained with GWR was smaller than with MLR: 0.565 versus 0.618 g/kg. The use of a local ridge parameter in GWRR did not lead to an increase in accuracy. In addition, we compared the quality of maps obtained by geographically weighted regression followed by simple kriging of model residuals (GWRSK) and by kriging with an external drift (KED) with global regression coefficients. In these methods, the spatial dependence of model residuals is incorporated in the model. The RMSE with KED was smaller than with GWRSK: 0.515 versus 0.546 g/kg. We conclude that fitting regression coefficients locally, as in GWR, only paid off when no spatial random effect was included in the model. When a spatial random effect was included, the flexibility of local, geographically weighted regression coefficients was not needed and was even undesirable, as it led to less accurate predictions than KED with global regression coefficients. In comparing the accuracy of prediction methods by leave-one-out cross-validation (LOOCV) of a non-probability sample, it is important to account for possible autocorrelation of pairwise differences in the prediction errors.
The effective sample sizes were substantially smaller than the total number of sampling points, so that most pairwise differences in MSE were not significant at a significance level of 10% in a two-sided paired t-test.
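
The role of the effective sample size in such a paired comparison can be illustrated with a minimal sketch. It is not the procedure used in this study: it assumes, purely for illustration, that the pairwise differences in squared LOOCV errors follow an exponential correlation function with a known range (`corr_range`, a hypothetical parameter), and it uses the textbook result that the variance of a mean of correlated values equals the variance of a single value divided by n² / Σᵢⱼ ρ(hᵢⱼ). The function names are likewise hypothetical.

```python
import math


def effective_sample_size(coords, corr_range):
    """Effective sample size n_eff = n^2 / sum_ij rho(h_ij).

    Assumes (for illustration only) an exponential correlation function
    rho(h) = exp(-h / corr_range) for the pairwise differences.
    """
    n = len(coords)
    total = 0.0
    for xi, yi in coords:
        for xj, yj in coords:
            h = math.hypot(xi - xj, yi - yj)  # distance between two points
            total += math.exp(-h / corr_range)
    return n * n / total


def paired_t_statistic(sq_err_a, sq_err_b, n_eff):
    """Paired t-statistic for the difference in MSE between methods A and B.

    The standard error of the mean difference is computed with n_eff
    instead of n, so positive autocorrelation widens the standard error.
    """
    d = [a - b for a, b in zip(sq_err_a, sq_err_b)]
    n = len(d)
    mean_d = sum(d) / n
    var_d = sum((x - mean_d) ** 2 for x in d) / (n - 1)
    return mean_d / math.sqrt(var_d / n_eff)


# Toy example with made-up coordinates and squared errors.
coords = [(0.0, 0.0), (1.0, 0.0), (0.0, 1.0), (5.0, 5.0)]
n_eff = effective_sample_size(coords, corr_range=2.0)
t_stat = paired_t_statistic([0.4, 0.5, 0.3, 0.6], [0.3, 0.3, 0.35, 0.4], n_eff)
```

Under these assumptions the statistic would be referred to a t-distribution with roughly `n_eff - 1` degrees of freedom. Because `n_eff` is at most `n` when correlations are non-negative, the test becomes more conservative than the naive paired t-test, which is consistent with the loss of significance reported above.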
- Digital soil mapping
- Effective sample size
- Kriging with an external drift (KED)
- Map accuracy
- Restricted maximum likelihood (REML)