Accounting for differences in costs among sampling locations in optimal stratification

D.J. Brus*, L. Yang, A.X. Zhu

*Corresponding author for this work

Research output: Contribution to journal › Article › Academic › peer-review

3 Citations (Scopus)


In areas with marked differences in accessibility, the cost efficiency of design-based sampling strategies for estimating the population mean or total can be increased by accounting for these differences in the selection of the sampling locations. This can be achieved by stratified random sampling. The question then is how to construct the strata. Existing optimal stratification methods such as cum √f stratification assume a constant cost among the sampling units, and can therefore be suboptimal when this assumption is violated. A simulated annealing algorithm is proposed for simultaneous optimization of the stratum breaks and the sample size under optimal allocation of the sample size, given a chosen maximum for the expected total costs. The proposed stratification method is tested in a study area of 5900 km² in Anhui province, China. Optimal stratum breaks were computed for estimating the population mean of the soil organic matter content (SOM). Predictions of SOM from a multiple linear regression model were used as a stratification variable. The optimal stratum breaks differed markedly from the cum √f breaks. The variance of the estimated mean of SOM using the optimal stratification was about 8 to 29% smaller than with the cum √f stratification, depending on the number of strata. This large gain in precision can be explained by the moderately strong correlation of the point-wise costs and the stratification variable. Smaller gains are expected when this correlation is weaker or the variation in costs among the units is smaller. The proposed algorithm can also be used when no ancillary variable related to the variable of interest is available, accounting for differences in costs among the sampling units only. An R script with functions is provided as supporting information.
Highlights:
A method is proposed to compute optimal strata that accounts for differences in costs among sampling locations.
Simulated annealing is used to optimize stratum breaks and total sample size under a total costs constraint.
The variance of the estimated mean of SOM with the proposed method was 8 to 29% smaller than with the cum √f method.
The proposed algorithm can also be used when no stratification variable is available (optimal costs stratification).
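
The idea summarized above can be illustrated with a minimal Python sketch. This is not the R script supplied with the article, and it rests on several assumptions: the stratum standard deviation of the stratification variable stands in for that of the variable of interest, the cost of a stratum is taken as the mean of its point-wise costs, sample sizes are real-valued, and the finite population correction is ignored. Under those assumptions the textbook cost-optimal allocation is n_h = C · (W_h S_h / √c_h) / Σ W_k S_k √c_k, which gives a variance of the estimated mean of (Σ W_h S_h √c_h)² / C for a budget C. The function names, the synthetic data, and the annealing settings (`n_iter`, `t0`, `cooling`) are all hypothetical choices made for illustration.

```python
import math
import random


def stratum_stats(breaks, x, cost):
    """Return (weight, sd of x, mean cost) per stratum, or None if a stratum has < 2 units."""
    edges = [-math.inf] + sorted(breaks) + [math.inf]
    stats = []
    for lo, hi in zip(edges[:-1], edges[1:]):
        members = [i for i, v in enumerate(x) if lo < v <= hi]
        if len(members) < 2:
            return None
        mean_x = sum(x[i] for i in members) / len(members)
        sd = math.sqrt(sum((x[i] - mean_x) ** 2 for i in members) / len(members))
        mean_cost = sum(cost[i] for i in members) / len(members)
        stats.append((len(members) / len(x), sd, mean_cost))
    return stats


def variance_of_mean(breaks, x, cost, budget):
    """Variance of the estimated mean under cost-optimal allocation (no fpc)."""
    stats = stratum_stats(breaks, x, cost)
    if stats is None:
        return math.inf  # infeasible stratification is never accepted
    return sum(w * s * math.sqrt(c) for w, s, c in stats) ** 2 / budget


def allocate(breaks, x, cost, budget):
    """Real-valued stratum sample sizes that spend exactly the expected budget."""
    stats = stratum_stats(breaks, x, cost)
    denom = sum(w * s * math.sqrt(c) for w, s, c in stats)
    return [budget * w * s / (math.sqrt(c) * denom) for w, s, c in stats]


def quantile_breaks(x, n_strata):
    """Starting point: breaks that give strata of roughly equal size."""
    xs = sorted(x)
    return [xs[len(xs) * k // n_strata] for k in range(1, n_strata)]


def anneal(x, cost, budget, n_strata, n_iter=1000, t0=1.0, cooling=0.995, seed=0):
    """Simulated annealing over the stratum breaks; returns (best breaks, best variance)."""
    rng = random.Random(seed)
    breaks = quantile_breaks(x, n_strata)
    current = variance_of_mean(breaks, x, cost, budget)
    best, best_val = list(breaks), current
    temp = t0 * current                    # temperature on the scale of the objective
    step = 0.05 * (max(x) - min(x))        # sd of the random shift of one break
    for _ in range(n_iter):
        cand = list(breaks)
        cand[rng.randrange(len(cand))] += rng.gauss(0.0, step)
        val = variance_of_mean(cand, x, cost, budget)
        # Always accept improvements; accept deteriorations with Metropolis probability.
        if val < current or rng.random() < math.exp((current - val) / temp):
            breaks, current = cand, val
            if current < best_val:
                best, best_val = sorted(breaks), current
        temp *= cooling
    return best, best_val


# Synthetic illustration only: costs increase with the stratification variable.
rng = random.Random(1)
x = [rng.gauss(20.0, 5.0) for _ in range(400)]
cost = [1.0 + 0.2 * max(v - 10.0, 0.0) + rng.random() for v in x]
budget = 100.0

start_val = variance_of_mean(quantile_breaks(x, 4), x, cost, budget)
best, best_val = anneal(x, cost, budget, n_strata=4)
print("equal-size strata:", start_val)
print("annealed strata:  ", best_val, "breaks:", best)
print("allocation:       ", allocate(best, x, cost, budget))
```

Because the total sample size follows in closed form from the budget under these assumptions, the sketch searches over the breaks only. Setting `x` to a constant-free proxy is still required here, so the case without any stratification variable described in the abstract is not covered by this sketch.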

Original language: English
Article number: 12731
Pages (from-to): 200-212
Journal: European Journal of Soil Science
Issue number: 1
Early online date: 9 Sep 2018
Publication status: Published - Jan 2019

