Sampling for digital soil mapping: A tutorial supported by R scripts

Research output: Contribution to journal › Article › Academic › peer-review

5 Citations (Scopus)

Abstract

In the past decade, substantial progress has been made in model-based optimization of sampling designs for mapping. This paper is an update of the overview of sampling designs for mapping presented by de Gruijter et al. (2006). For model-based estimation of values at unobserved points (mapping), probability sampling is not required, which opens up the possibility of optimized non-probability sampling. Non-probability sampling designs for mapping are regular grid sampling, spatial coverage sampling, k-means sampling, conditioned Latin hypercube sampling, response surface sampling, Kennard-Stone sampling and model-based sampling. In model-based sampling, a preliminary model of the spatial variation of the soil variable of interest is used for optimizing the sample size and/or the spatial coordinates of the sampling locations. Kriging requires knowledge of the variogram. Sampling designs for variogram estimation are nested sampling, independent random sampling of pairs of points, and model-based designs in which either the uncertainty about the variogram parameters or the uncertainty about the kriging variance is minimized. Various minimization criteria have been proposed for designing a single sample that is suitable both for estimating the variogram and for mapping. For map validation, additional probability sampling is recommended, so that unbiased estimates of map quality indices and their standard errors can be obtained. For all sampling designs, R scripts are available in the supplement. Further research is recommended on sampling designs for mapping with machine learning techniques, designs that are robust against deviations from modeling assumptions, designs tailored to mapping multiple soil variables of interest and soil classes or fuzzy memberships, and probability sampling designs that are efficient both for design-based estimation of population means and for model-based mapping.
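As an illustration of one of the designs named in the abstract, spatial coverage sampling can be sketched as k-means clustering of the coordinates of a fine grid that discretizes the study area, with the cluster centroids used as sampling locations. The sketch below is not taken from the paper's R supplement: the function names, the unit-square study area, the grid resolution and the sample size are all assumptions chosen for illustration, and plain Lloyd iterations stand in for whatever optimizer the supplement uses.

```python
import random


def mean_squared_shortest_distance(grid, sample):
    """Average, over all grid nodes, of the squared distance to the nearest sampling location."""
    total = 0.0
    for x, y in grid:
        total += min((x - sx) ** 2 + (y - sy) ** 2 for sx, sy in sample)
    return total / len(grid)


def spatial_coverage_sample(grid, n, iterations=100, seed=0):
    """Hypothetical helper: choose n locations by k-means on grid-node coordinates."""
    rng = random.Random(seed)
    centroids = rng.sample(grid, n)  # start from n distinct random grid nodes
    for _ in range(iterations):
        # Assignment step: attach each grid node to its nearest centroid.
        clusters = [[] for _ in range(n)]
        for x, y in grid:
            nearest = min(
                range(n),
                key=lambda k: (x - centroids[k][0]) ** 2 + (y - centroids[k][1]) ** 2,
            )
            clusters[nearest].append((x, y))
        # Update step: move each centroid to the mean of its cluster.
        updated = []
        for k, members in enumerate(clusters):
            if members:
                updated.append((
                    sum(p[0] for p in members) / len(members),
                    sum(p[1] for p in members) / len(members),
                ))
            else:
                updated.append(centroids[k])  # an empty cluster keeps its centroid
        if updated == centroids:
            break  # converged: no centroid moved
        centroids = updated
    return centroids


# Assumed study area: the unit square, discretized by a 20 x 20 grid of cell centres.
grid = [((i + 0.5) / 20, (j + 0.5) / 20) for i in range(20) for j in range(20)]
sample = spatial_coverage_sample(grid, n=4)
print(sample)
print(mean_squared_shortest_distance(grid, sample))
```

Each Lloyd iteration cannot increase the mean squared shortest distance, so the returned locations cover the area at least as evenly as the random starting nodes. Different seeds can end in different local optima, so in practice one would likely try several starts and keep the best.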

Language: English
Pages: 464-480
Journal: Geoderma
Volume: 338
Early online date: 19 Aug 2018
DOI: 10.1016/j.geoderma.2018.07.036
Publication status: Published - 2019

Fingerprint

soil surveys
sampling
soil
variogram
kriging
uncertainty

Keywords

  • K-means sampling
  • Kriging
  • Latin hypercube sampling
  • Model-based sampling
  • Spatial coverage sampling
  • Spatial simulated annealing
  • Variogram

Cite this

@article{41b15200afa942dda3c44a08f2b5d58c,
title = "Sampling for digital soil mapping: A tutorial supported by R scripts",
abstract = "In the past decade, substantial progress has been made in model-based optimization of sampling designs for mapping. This paper is an update of the overview of sampling designs for mapping presented by de Gruijter et al. (2006). For model-based estimation of values at unobserved points (mapping), probability sampling is not required, which opens up the possibility of optimized non-probability sampling. Non-probability sampling designs for mapping are regular grid sampling, spatial coverage sampling, k-means sampling, conditioned Latin hypercube sampling, response surface sampling, Kennard-Stone sampling and model-based sampling. In model-based sampling, a preliminary model of the spatial variation of the soil variable of interest is used for optimizing the sample size and/or the spatial coordinates of the sampling locations. Kriging requires knowledge of the variogram. Sampling designs for variogram estimation are nested sampling, independent random sampling of pairs of points, and model-based designs in which either the uncertainty about the variogram parameters or the uncertainty about the kriging variance is minimized. Various minimization criteria have been proposed for designing a single sample that is suitable both for estimating the variogram and for mapping. For map validation, additional probability sampling is recommended, so that unbiased estimates of map quality indices and their standard errors can be obtained. For all sampling designs, R scripts are available in the supplement. Further research is recommended on sampling designs for mapping with machine learning techniques, designs that are robust against deviations from modeling assumptions, designs tailored to mapping multiple soil variables of interest and soil classes or fuzzy memberships, and probability sampling designs that are efficient both for design-based estimation of population means and for model-based mapping.",
keywords = "K-means sampling, Kriging, Latin hypercube sampling, Model-based sampling, Spatial coverage sampling, Spatial simulated annealing, Variogram",
author = "D.J. Brus",
year = "2019",
doi = "10.1016/j.geoderma.2018.07.036",
language = "English",
volume = "338",
pages = "464--480",
journal = "Geoderma",
issn = "0016-7061",
publisher = "Elsevier",

}

Sampling for digital soil mapping: A tutorial supported by R scripts. / Brus, D.J.

In: Geoderma, Vol. 338, 2019, p. 464-480.

TY - JOUR

T1 - Sampling for digital soil mapping: A tutorial supported by R scripts

T2 - Geoderma

AU - Brus, D.J.

PY - 2019

Y1 - 2019

N2 - In the past decade, substantial progress has been made in model-based optimization of sampling designs for mapping. This paper is an update of the overview of sampling designs for mapping presented by de Gruijter et al. (2006). For model-based estimation of values at unobserved points (mapping), probability sampling is not required, which opens up the possibility of optimized non-probability sampling. Non-probability sampling designs for mapping are regular grid sampling, spatial coverage sampling, k-means sampling, conditioned Latin hypercube sampling, response surface sampling, Kennard-Stone sampling and model-based sampling. In model-based sampling, a preliminary model of the spatial variation of the soil variable of interest is used for optimizing the sample size and/or the spatial coordinates of the sampling locations. Kriging requires knowledge of the variogram. Sampling designs for variogram estimation are nested sampling, independent random sampling of pairs of points, and model-based designs in which either the uncertainty about the variogram parameters or the uncertainty about the kriging variance is minimized. Various minimization criteria have been proposed for designing a single sample that is suitable both for estimating the variogram and for mapping. For map validation, additional probability sampling is recommended, so that unbiased estimates of map quality indices and their standard errors can be obtained. For all sampling designs, R scripts are available in the supplement. Further research is recommended on sampling designs for mapping with machine learning techniques, designs that are robust against deviations from modeling assumptions, designs tailored to mapping multiple soil variables of interest and soil classes or fuzzy memberships, and probability sampling designs that are efficient both for design-based estimation of population means and for model-based mapping.

KW - K-means sampling

KW - Kriging

KW - Latin hypercube sampling

KW - Model-based sampling

KW - Spatial coverage sampling

KW - Spatial simulated annealing

KW - Variogram

U2 - 10.1016/j.geoderma.2018.07.036

DO - 10.1016/j.geoderma.2018.07.036

M3 - Article

VL - 338

SP - 464

EP - 480

JO - Geoderma

JF - Geoderma

SN - 0016-7061

ER -