TY - JOUR
T1 - Developing the Swiss mid-infrared soil spectral library for local estimation and monitoring
AU - Baumann, Philipp
AU - Helfenstein, Anatol
AU - Gubler, Andreas
AU - Keller, Armin
AU - Meuli, Reto Giulio
AU - Wächter, Daniel
AU - Lee, Juhwan
AU - Viscarra Rossel, Raphael
AU - Six, Johan
PY - 2021/8/18
Y1 - 2021/8/18
N2 - Information on soils' composition and physical, chemical and biological properties is paramount to elucidate agroecosystem functioning in space and over time. For this purpose, we developed a national Swiss soil spectral library (SSL; n = 4374) in the mid-infrared (mid-IR), calibrating 16 properties from legacy measurements on soils from the Swiss Biodiversity Monitoring program (BDM; n = 3778; 1094 sites) and the Swiss long-term Soil Monitoring Network (NABO; n = 596; 71 sites). General models were trained with the interpretable rule-based learner CUBIST, testing combinations of {5, 10, 20, 50, and 100} ensembles of rules (committees) and {2, 5, 7, and 9} nearest neighbors used for local averaging with repeated 10-fold cross-validation grouped by location. To evaluate the information in spectra to facilitate long-term soil monitoring at a plot level, we conducted 71 model transfers for the NABO sites to induce locally relevant information from the SSL, using the data-driven sample selection method RS-LOCAL. In total, 10 soil properties were estimated with discrimination capacity suitable for screening (R² ≥ 0.72; ratio of performance to interquartile distance (RPIQ) ≥ 2.0), out of which total carbon (C), organic C (OC), total nitrogen (N), pH and clay showed accuracy eligible for accurate diagnostics (R² > 0.8; RPIQ ≥ 3.0). CUBIST and the spectra estimated total C accurately with the root mean square error (RMSE) = 8.4 g kg⁻¹ and the RPIQ = 4.3, while the measured range was 1–583 g kg⁻¹ and OC with RMSE = 9.3 g kg⁻¹ and RPIQ = 3.4 (measured range 0–583 g kg⁻¹). Compared to the general statistical learning approach, the local transfer approach - using two respective training samples - on average reduced the RMSE of total C per site fourfold.
We found that the selected SSL subsets were highly dissimilar compared to validation samples, in terms of both their spectral input space and the measured values. This suggests that data-driven selection with RS-LOCAL leverages chemical diversity in composition rather than similarity. Our results suggest that mid-IR soil estimates were sufficiently accurate to support many soil applications that require a large volume of input data, such as precision agriculture, soil C accounting and monitoring and digital soil mapping. This SSL can be updated continuously, for example, with samples from deeper profiles and organic soils, so that the measurement of key soil properties becomes even more accurate and efficient in the near future.
AB - Information on soils' composition and physical, chemical and biological properties is paramount to elucidate agroecosystem functioning in space and over time. For this purpose, we developed a national Swiss soil spectral library (SSL; n = 4374) in the mid-infrared (mid-IR), calibrating 16 properties from legacy measurements on soils from the Swiss Biodiversity Monitoring program (BDM; n = 3778; 1094 sites) and the Swiss long-term Soil Monitoring Network (NABO; n = 596; 71 sites). General models were trained with the interpretable rule-based learner CUBIST, testing combinations of {5, 10, 20, 50, and 100} ensembles of rules (committees) and {2, 5, 7, and 9} nearest neighbors used for local averaging with repeated 10-fold cross-validation grouped by location. To evaluate the information in spectra to facilitate long-term soil monitoring at a plot level, we conducted 71 model transfers for the NABO sites to induce locally relevant information from the SSL, using the data-driven sample selection method RS-LOCAL. In total, 10 soil properties were estimated with discrimination capacity suitable for screening (R² ≥ 0.72; ratio of performance to interquartile distance (RPIQ) ≥ 2.0), out of which total carbon (C), organic C (OC), total nitrogen (N), pH and clay showed accuracy eligible for accurate diagnostics (R² > 0.8; RPIQ ≥ 3.0). CUBIST and the spectra estimated total C accurately with the root mean square error (RMSE) = 8.4 g kg⁻¹ and the RPIQ = 4.3, while the measured range was 1–583 g kg⁻¹ and OC with RMSE = 9.3 g kg⁻¹ and RPIQ = 3.4 (measured range 0–583 g kg⁻¹). Compared to the general statistical learning approach, the local transfer approach - using two respective training samples - on average reduced the RMSE of total C per site fourfold.
We found that the selected SSL subsets were highly dissimilar compared to validation samples, in terms of both their spectral input space and the measured values. This suggests that data-driven selection with RS-LOCAL leverages chemical diversity in composition rather than similarity. Our results suggest that mid-IR soil estimates were sufficiently accurate to support many soil applications that require a large volume of input data, such as precision agriculture, soil C accounting and monitoring and digital soil mapping. This SSL can be updated continuously, for example, with samples from deeper profiles and organic soils, so that the measurement of key soil properties becomes even more accurate and efficient in the near future.
U2 - 10.5194/soil-7-525-2021
DO - 10.5194/soil-7-525-2021
M3 - Article
AN - SCOPUS:85113783885
SN - 2199-3971
VL - 7
SP - 525
EP - 546
JO - SOIL
JF - SOIL
IS - 2
ER -