Sample data for "A weakly supervised framework for high resolution crop yield forecasts"

Dataset

Description

This dataset includes sample data for the United States to run the weakly supervised framework as described in the paper titled <em>A weakly supervised framework for high resolution crop yield forecasts</em>, accessible at https://doi.org/10.48550/arXiv.2205.09016

The updated paper (including results from the US) is under review in Environmental Research Letters.

The software implementation of the machine learning baseline is available at: https://github.com/BigDataWUR/MLforCropYieldForecasting/tree/weaksup.

Data
1. County data (county-data.zip) for county-level strongly supervised models:

* CROP_AREA_COUNTY_US.csv: County crop production area statistics (acres). Source: NASS (USDA-NASS, 2022).

* CSSF_COUNTY_US.csv: Crop productivity indicators including total above-ground production (kg ha<sup>-1</sup>), total weight of storage organs (kg ha<sup>-1</sup>), development stage (0-2). Source: de Wit et al. (2022).

* METEO_COUNTY_US.csv: Meteo data including maximum, minimum, and average daily air temperature (℃); sum of daily precipitation (PREC) (mm); sum of daily evapotranspiration of short vegetation (ET0) (Penman-Monteith; Allen et al., 1998) (mm); climate water balance = (PREC - ET0) (mm). Source: Boogaard et al. (2022).

* REMOTE_SENSING_COUNTY_US.csv: Fraction of Absorbed Photosynthetically Active Radiation (Smoothed) (FAPAR). Source: Copernicus GLS (2020).

* SOIL_COUNTY_US.csv: Soil water holding capacity. Source: WISE Soil Property Database (Batjes, 2016).

* YIELD_COUNTY_US.csv: County yield statistics (bushels/acre). Source: NASS (USDA-NASS, 2022).

2. 10-km grid data (grid-data.zip) for grid-level strongly supervised models:

* COUNTY_GRIDS_US.csv: Mapping between counties and grids.

* CSSF_GRIDS_US.csv: Crop productivity indicators at 10km grid level (similar to county data above).

* METEO_GRIDs_US.csv: Meteo data at 10km grid level (similar to county data above).

* REMOTE_SENSING_GRIDS_US.csv: FAPAR at 10km grid level (similar to county data above).

* SOIL_GRIDS_US.csv: Soil water holding capacity at 10km grid level (similar to county data above).

* YIELD_GRIDS_US.csv: Grid-level modeled yields (t ha<sup>-1</sup>). Source: Deines et al. (2021).

3. County labels and 10-km grid inputs (dscale-US.zip) for weak supervision:

* COUNTY_GRIDS_US.csv: Mapping between counties and grids.

* CSSF_GRIDS_US.csv: Crop productivity indicators at 10km grid level.

* METEO_GRIDs_US.csv: Meteo indicators at 10km grid level.

* REMOTE_SENSING_GRIDS_US.csv: FAPAR at 10km grid level.

* SOIL_GRIDS_US.csv: Soil water holding capacity at 10km grid level.

* YIELD_GRIDS_US.csv: Grid-level modeled yields (t ha<sup>-1</sup>). Source: Deines et al. (2021).

* YIELD_COUNTY_US.csv: County yield statistics (bushels/acre). Source: NASS (USDA-NASS, 2022).

* CROP_AREA_COUNTY_US.csv: County crop production area statistics (acres). Source: NASS (USDA-NASS, 2022).
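
County yields are reported in bushels/acre while grid-level modeled yields are in t ha<sup>-1</sup>, so comparing the two requires a unit conversion and an aggregation of grids to counties. The following is a minimal sketch of one way to do this, not the method used in the paper or the linked software. It assumes maize at 56 lb per bushel (the crop is not stated above), uses a plain unweighted mean per county, and the identifiers and values are hypothetical rather than taken from the CSV files.

```python
# Illustrative sketch only. Assumptions: maize test weight of 56 lb/bushel,
# unweighted county means, and hypothetical grid/county identifiers.
LB_TO_KG = 0.45359237        # kilograms per pound
ACRE_TO_HA = 0.40468564224   # hectares per acre


def bushels_per_acre_to_t_per_ha(yield_bu_ac, lb_per_bushel=56.0):
    """Convert a yield from bushels/acre to tonnes/hectare."""
    kg_per_acre = yield_bu_ac * lb_per_bushel * LB_TO_KG
    return kg_per_acre / ACRE_TO_HA / 1000.0


def mean_grid_yield_by_county(county_of_grid, grid_yield):
    """Average grid yields (t/ha) per county using a grid -> county mapping."""
    totals, counts = {}, {}
    for grid_id, value in grid_yield.items():
        county = county_of_grid.get(grid_id)
        if county is None:
            continue  # skip grid cells that are not mapped to a county
        totals[county] = totals.get(county, 0.0) + value
        counts[county] = counts.get(county, 0) + 1
    return {county: totals[county] / counts[county] for county in totals}


# Made-up values for illustration; real inputs would come from
# COUNTY_GRIDS_US.csv, YIELD_GRIDS_US.csv and YIELD_COUNTY_US.csv.
county_of_grid = {"g1": "c1", "g2": "c1", "g3": "c2", "g4": None}
grid_yield = {"g1": 10.0, "g2": 12.0, "g3": 9.0, "g4": 8.0}

county_means = mean_grid_yield_by_county(county_of_grid, grid_yield)
county_stat_t_ha = bushels_per_acre_to_t_per_ha(175.0)
```

A crop-area-weighted mean may be more appropriate than the unweighted mean shown here when grid cells differ in cropped area, and other crops need a different bushel weight.
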

Date made available: 18 May 2022
Publisher: Wageningen University & Research

Keywords

  • crop yield; deep learning; weak supervision; disaggregation; spatial variability
