Approximating the variance of estimated means for systematic random sampling, illustrated with data of the French Soil Monitoring Network

D.J. Brus, N.P.A. Saby

Research output: Contribution to journal › Article › Academic › peer-review

9 Citations (Scopus)

Abstract

In France, as in many other countries, the soil is monitored at the locations of a regular, square grid, thus forming a systematic sample (SY). This sampling design leads to good spatial coverage, enhancing the precision of design-based estimates of spatial means and totals. Design-based estimation of the mean or total from SY samples is straightforward. However, an unbiased estimator of the sampling variance of the estimated mean or total does not exist. This paper compares five variance approximations in a simulation study and a real-world case study: the simple random (SI), stratified simple random (STSI), Geary's spatial autocorrelation C index (Geary), Moran's I index (Moran), and model-based (MB) approximations. In the simulation study, the model distribution of the conditional bias (conditioned on a simulated reality) of the variance approximations is estimated for various variograms and two sample sizes. In the case study, the data of the first campaign of the French Soil Monitoring Network are used to estimate the spatial means of six soil variables (C, Tl, Cd, Ni, K, Mn) for aggregated soil map units of France, and to approximate their sampling variances. The bias in the approximated variances is explored with MODIS-NDVI data. For variograms with no or a small relative nugget, the STSI and MB approximations are the best choices. In situations with a large relative nugget, STSI is to be preferred over MB, as MB may then somewhat underestimate the variance. Moran and SI should be avoided as approximation methods, as they seriously underestimate (Moran) and overestimate (SI) the variance in many cases. The approximated standard error of the total soil organic carbon stock in France as obtained with MB was 0.0335 Pg, which was small compared to the estimated stock of 3.580 Pg.
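As a rough illustration of two of the approximations named in the abstract, the sketch below computes an SI-style and an STSI-style variance of a sample mean in plain Python. It is not the authors' implementation. The function names, the toy values, the grouping of grid points into strata, and the stratum weights n_h / n (which assume every grid point represents the same area) are all assumptions made for this example. The last helper expresses the reported standard error relative to the estimated stock.

```python
from collections import defaultdict
from statistics import variance


def si_variance_of_mean(z):
    """SI approximation: treat the sample as simple random, s^2 / n."""
    return variance(z) / len(z)


def stsi_variance_of_mean(z, strata):
    """STSI approximation: pool within-stratum variances.

    `strata` holds one label per observation. Grouping neighbouring grid
    points into strata is an assumption of this sketch, as are the
    weights w_h = n_h / n (equal area per grid point).
    """
    groups = defaultdict(list)
    for value, label in zip(z, strata):
        groups[label].append(value)

    n = len(z)
    total = 0.0
    for values in groups.values():
        n_h = len(values)
        if n_h < 2:
            raise ValueError("each stratum needs at least two points")
        w_h = n_h / n
        total += w_h ** 2 * variance(values) / n_h
    return total


def relative_standard_error(standard_error, estimate):
    """Standard error expressed as a fraction of the estimate."""
    return standard_error / estimate


# Hypothetical values along a transect with a trend, paired into two strata.
z = [1.0, 2.0, 3.0, 4.0]
strata = ["a", "a", "b", "b"]
v_si = si_variance_of_mean(z)
v_stsi = stsi_variance_of_mean(z, strata)

# Figures from the abstract: standard error 0.0335 Pg, stock 3.580 Pg.
rse = relative_standard_error(0.0335, 3.580)
```

In this toy case the SI value is larger than the STSI value because the values trend along the transect, which matches the abstract's remark that SI tends to overestimate the variance. The relative standard error of the carbon stock comes out a little under 1 %.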

Language: English
Pages: 77-86
Journal: Geoderma
Volume: 279
DOI: 10.1016/j.geoderma.2016.05.016
Publication status: Published - 2016

Fingerprint

monitoring
sampling
soil
variogram
France
case studies
NDVI
autocorrelation
MODIS
simulation
monitoring network
moderate resolution imaging spectroradiometer
organic carbon
carbon sinks
soil organic carbon
index
methodology

Keywords

  • Carbon stock
  • Design-based inference
  • Moran's I
  • Geary's spatial autocorrelation index
  • Variogram

Cite this

@article{3549e68923364e9b90a4598722fdadf9,
title = "Approximating the variance of estimated means for systematic random sampling, illustrated with data of the French Soil Monitoring Network",
abstract = "In France like in many other countries, the soil is monitored at the locations of a regular, square grid thus forming a systematic sample (SY). This sampling design leads to good spatial coverage, enhancing the precision of design-based estimates of spatial means and totals. Design-based estimation of the mean or total from SY samples is straightforward. However, an unbiased estimator of the sampling variance of the estimated mean or total does not exist. This paper compares five variance approximations, being the simple random (SI), stratified simple random (STSI), Geary's spatial autocorrelation C index (Geary), Moran's I index (Moran), and the model-based (MB) approximation in a simulation study and a real-world case study. In a simulation study the model distribution of the conditional bias (conditioned on a simulated reality) of the variance approximations is estimated for various variograms and two sample sizes. In the case study the data of the first campaign of the French Soil Monitoring Network are used to estimate the spatial means of six soil variables (C, Tl, Cd, Ni, K, Mn) for aggregated soil map units of France, and to approximate their sampling variances. The bias in the approximated variances is explored with MODIS-NDVI data. With variograms with no or a small relative nugget variance approximation STSI and MB are the best choices. In situations with large relative nugget STSI is to be preferred over MB as MB then may somewhat underestimate the variance. Moran and SI should be avoided as approximation methods, as they seriously underestimate (Moran) and overestimate (SI) the variance in many cases. The approximated standard error of total soil organic carbon stock in France as obtained with MB was 0.0335 Pg, which was small compared to the estimated stock of 3.580 Pg.",
keywords = "Carbon stock, Design-based inference, Moran's I, Geary's spatial autocorrelation index, Variogram",
author = "D.J. Brus and N.P.A. Saby",
year = "2016",
doi = "10.1016/j.geoderma.2016.05.016",
language = "English",
volume = "279",
pages = "77--86",
journal = "Geoderma",
issn = "0016-7061",
publisher = "Elsevier",

}

Approximating the variance of estimated means for systematic random sampling, illustrated with data of the French Soil Monitoring Network. / Brus, D.J.; Saby, N.P.A.

In: Geoderma, Vol. 279, 2016, p. 77-86.

TY - JOUR

T1 - Approximating the variance of estimated means for systematic random sampling, illustrated with data of the French Soil Monitoring Network

AU - Brus, D.J.

AU - Saby, N.P.A.

PY - 2016

Y1 - 2016

N2 - In France like in many other countries, the soil is monitored at the locations of a regular, square grid thus forming a systematic sample (SY). This sampling design leads to good spatial coverage, enhancing the precision of design-based estimates of spatial means and totals. Design-based estimation of the mean or total from SY samples is straightforward. However, an unbiased estimator of the sampling variance of the estimated mean or total does not exist. This paper compares five variance approximations, being the simple random (SI), stratified simple random (STSI), Geary's spatial autocorrelation C index (Geary), Moran's I index (Moran), and the model-based (MB) approximation in a simulation study and a real-world case study. In a simulation study the model distribution of the conditional bias (conditioned on a simulated reality) of the variance approximations is estimated for various variograms and two sample sizes. In the case study the data of the first campaign of the French Soil Monitoring Network are used to estimate the spatial means of six soil variables (C, Tl, Cd, Ni, K, Mn) for aggregated soil map units of France, and to approximate their sampling variances. The bias in the approximated variances is explored with MODIS-NDVI data. With variograms with no or a small relative nugget variance approximation STSI and MB are the best choices. In situations with large relative nugget STSI is to be preferred over MB as MB then may somewhat underestimate the variance. Moran and SI should be avoided as approximation methods, as they seriously underestimate (Moran) and overestimate (SI) the variance in many cases. The approximated standard error of total soil organic carbon stock in France as obtained with MB was 0.0335 Pg, which was small compared to the estimated stock of 3.580 Pg.

KW - Carbon stock

KW - Design-based inference

KW - Moran's I

KW - Geary's spatial autocorrelation index

KW - Variogram

U2 - 10.1016/j.geoderma.2016.05.016

DO - 10.1016/j.geoderma.2016.05.016

M3 - Article

VL - 279

SP - 77

EP - 86

JO - Geoderma

T2 - Geoderma

JF - Geoderma

SN - 0016-7061

ER -