Sampling for validation of digital soil maps

Research output: Contribution to journal, Article, Academic, peer-review

245 Citations (Scopus)


The increase in digital soil mapping around the world means that appropriate and efficient sampling strategies are needed for validation. Data used for calibrating a digital soil mapping model typically are non-random samples. In such a case we recommend collection of additional independent data and validation of the soil map by a design-based sampling strategy involving probability sampling and design-based estimation of quality measures. An important advantage over validation by data-splitting or cross-validation is that model-free estimates of the quality measures and their standard errors can be obtained, and thus no assumptions on the spatial auto-correlation of prediction errors need to be made. The quality of quantitative soil maps can be quantified by the spatial cumulative distribution function (SCDF) of the prediction errors, whereas for categorical soil maps the overall purity and the map unit purities (user's accuracies) and soil class representation (producer's accuracies) are suitable quality measures. The suitability of five basic types of random sampling design for soil map validation was evaluated: simple, stratified simple, systematic, cluster and two-stage random sampling. Stratified simple random sampling is generally a good choice: it is simple to implement, estimation of the quality measures and their precision is straightforward, it gives relatively precise estimates, and no assumptions are needed in quantifying the standard error of the estimated quality measures. Validation by probability sampling is illustrated with two case studies. A categorical soil map on point support depicting soil classes in the province of Drenthe of the Netherlands (268 000 ha) was validated by stratified simple random sampling. Sub-areas with different expected purities were used as strata. The estimated overall purity was 58% with a standard error of 4%. This was 9% smaller than the theoretical purity computed with the model. 
Map unit purities and class representations were estimated by the ratio estimator. A quantitative soil map, depicting the average soil organic carbon (SOC) contents of pixels in an area of 81 600 ha in Senegal, was validated by random transect sampling. SOC predictions were seriously biased, and the random error was considerable. Both case studies underpin the importance of independent validation of soil maps by probability sampling, to avoid unfounded trust in visually attractive maps produced by advanced pedometric techniques.
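
As an illustration of the design-based estimation described in the abstract, the following minimal sketch estimates overall map purity and its standard error under stratified simple random sampling. It uses the standard textbook estimator (an area-weighted mean of the stratum purities, with the within-stratum variance of a proportion estimated as p(1 - p)/(n - 1)), which is assumed here and not taken from the paper itself. The function name `stratified_purity` and the example strata are hypothetical and are not data from the Drenthe case study.

```python
from math import sqrt


def stratified_purity(strata):
    """Estimate overall purity and its standard error.

    strata: list of (area, n_correct, n_sampled) tuples, one per stratum.
      area       stratum area (any unit; only relative sizes matter)
      n_correct  validation points where the mapped class was correct
      n_sampled  validation points selected by simple random sampling
    """
    total_area = sum(area for area, _, _ in strata)
    purity = 0.0
    variance = 0.0
    for area, n_correct, n_sampled in strata:
        if n_sampled < 2:
            raise ValueError("each stratum needs at least two sampling points")
        weight = area / total_area        # relative stratum area
        p_h = n_correct / n_sampled       # estimated purity within the stratum
        purity += weight * p_h
        # Variance of the estimated stratum purity under simple random sampling
        variance += weight ** 2 * p_h * (1.0 - p_h) / (n_sampled - 1)
    return purity, sqrt(variance)


# Hypothetical strata: (area in ha, correct points, sampled points)
example_strata = [(100_000, 30, 50), (168_000, 20, 40)]
estimate, std_error = stratified_purity(example_strata)
print(f"overall purity: {estimate:.3f}, standard error: {std_error:.3f}")
```

No model of the spatial autocorrelation of the prediction errors enters this calculation; the standard error follows from the sampling design alone, which is the advantage the abstract attributes to probability sampling.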
Original language: English
Pages (from-to): 394-407
Journal: European Journal of Soil Science
Issue number: 3
Publication status: Published - 2011


  • remotely-sensed data
  • accuracy assessment
  • classification accuracy
  • spatial interpolation
  • discriminant-analysis
  • design
  • model
  • quality
  • statistics
  • inference

