Occupations on the map: Using a super learner algorithm to downscale labor statistics, data



This repository contains all the input and output data (including maps) related to Van Dijk et al. (2022), Occupations on the map: Using a super learner algorithm to downscale labor statistics. It does not contain several large (> 4 GB) intermediate files, which summarize the results of the many machine learning models that were trained and tuned as part of the super learner algorithm. These files can be created by running the scripts in the supplementary GitHub repository: https://github.com/michielvandijk/occupations_on_the_map. All input and output maps produced as part of this study can also be accessed through an interactive web application: https://shiny.wur.nl/occupation-map-vnm.

In the paper, we demonstrate an approach for creating fine-scale gridded occupation maps by downscaling district-level labor statistics, informed by remote sensing and other spatial information. We applied a super learner algorithm that combines the results of different machine learning models to predict the shares of six major occupation categories and the labor force participation rate in Vietnam at a resolution of 30 arc seconds (~1x1 km). The results were subsequently combined with gridded information on the working-age population to produce maps of the number of workers per occupation. The proposed approach can also be applied to produce maps of other (labor) statistics that are only available at aggregated levels.
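The final combination step described above can be illustrated for a single grid cell. This is a minimal sketch, not the authors' code (their scripts are in the GitHub repository linked above). It assumes that the number of workers per occupation is the product of the predicted occupation share, the predicted labor force participation rate, and the working-age population of the cell. The function name, the occupation labels `occ_1` to `occ_6`, and all numbers are hypothetical.

```python
def workers_per_occupation(shares, participation_rate, working_age_pop):
    """Estimate workers per occupation in one grid cell.

    Assumption (not confirmed by the source): workers are computed as
    occupation share * participation rate * working-age population.
    """
    # Number of people in the cell who are in the labor force.
    labor_force = participation_rate * working_age_pop
    # Distribute the labor force over occupations according to the shares.
    return {occ: share * labor_force for occ, share in shares.items()}


# Hypothetical predicted shares for six occupation categories (sum to 1).
cell_shares = {
    "occ_1": 0.50,
    "occ_2": 0.20,
    "occ_3": 0.10,
    "occ_4": 0.10,
    "occ_5": 0.05,
    "occ_6": 0.05,
}

# Hypothetical cell: participation rate of 0.8 and 1000 working-age people.
cell_workers = workers_per_occupation(cell_shares, 0.8, 1000.0)

for occupation, count in cell_workers.items():
    print(f"{occupation}: {count:.1f}")
```

Under this assumption, the per-occupation counts in a cell add up to the cell's labor force, so the gridded maps stay consistent with the participation rate and population inputs.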
Date made available: 6 Apr 2022
Publisher: Wageningen University & Research
