From field to airborne spectroscopy – advancing spectral data analytics for accurate retrieval of perennial ryegrass biomass and feed quality

Gustavo Togeiro de Alckmin

Research output: Thesis, internal PhD, WU


Pastures are the cornerstone of grazing-based livestock production systems, allowing for sustainable use of marginal agricultural lands while transforming a non-digestible resource into nutritious products fit for human consumption. Despite the extensive literature on the constituent factors of pasture production and management, the majority of pasture-based grazing systems are not optimally managed because current methods for measurement and monitoring are costly and time-consuming. Although challenging, outdoor spectral observations have the potential to provide real-time information about pasture quantity and quality, serving as an ideal sensing technique for autonomous platforms and consequently alleviating the monitoring bottleneck and supporting intensive pasture production.

The overarching aim of this thesis is to determine to what extent spectral data, in outdoor environments, can accurately estimate key biophysical and biochemical components of perennial ryegrass (Lolium perenne). In addition, this study aims to validate the transferability of spectral models from handheld spectral measurements to remotely piloted aerial systems (RPAS), while critically assessing optimal modelling approaches and minimal sensor requirements. In doing so, this thesis validates the superiority of spectral data over commonly practiced canopy-height models (CHM) for biomass estimation and suggests that suitable low-cost imaging sensing systems are within commercial reach.

After the Introduction, chapter two critically addresses the use of vegetation indices and canopy height for biomass estimation, while quantifying accuracy improvements based on different regression algorithms. As a proxy for status-quo techniques, a comparison between CHM and the normalized difference vegetation index (NDVI) is performed. In addition, to further explore the potential of vegetation indices, a brute-force procedure was employed to generate 11,026 normalized ratio indices (NRI) and to select the best NRI band combination. In parallel, a pool of 97 literature-based vegetation indices was filtered and underwent a feature selection procedure to determine an optimal small subset of indices. Results suggest that: (i) an optimized vegetation index (i.e., best NRI) and CHM are equivalent; (ii) a small number of vegetation indices is sufficient to reach the achievable accuracy when employing top-of-canopy reflectance alone; and (iii) accuracy and precision can be improved solely through more elaborate modelling techniques, such as non-parametric methods and model stacking.
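
As an illustration of the brute-force NRI search described above, the following is a minimal sketch and not the thesis code. It assumes the usual definition NRI(i, j) = (R_i - R_j) / (R_i + R_j) and ranks band pairs by the squared Pearson correlation with biomass; the ranking criterion, the function names and the three-band synthetic data are all assumptions made for the example. As a side note, 11,026 is the number of unordered pairs that can be drawn from 149 bands, which is consistent with the figure quoted above but is not stated in the abstract.

```python
from itertools import combinations
from math import sqrt


def nri(r_i, r_j):
    """Normalized ratio index of two reflectance values."""
    return (r_i - r_j) / (r_i + r_j)


def pearson_r(x, y):
    """Pearson correlation coefficient; returns 0.0 if either input is constant."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        return 0.0
    return sxy / sqrt(sxx * syy)


def n_pairs(n_bands):
    """Number of unordered band pairs, i.e. candidate NRIs."""
    return n_bands * (n_bands - 1) // 2


def best_nri(spectra, biomass):
    """Try every band pair and return (r_squared, i, j) for the best one.

    spectra: list of samples, each a list of band reflectances.
    biomass: list of reference biomass values, one per sample.
    """
    best = (-1.0, None, None)
    for i, j in combinations(range(len(spectra[0])), 2):
        values = [nri(s[i], s[j]) for s in spectra]
        r2 = pearson_r(values, biomass) ** 2
        if r2 > best[0]:
            best = (r2, i, j)
    return best


# Synthetic example (hypothetical values): columns are green, red, NIR.
# Red and NIR were constructed so that NRI(red, NIR) is linear in biomass.
spectra = [
    [0.08, 0.21875, 0.28125],
    [0.05, 0.18750, 0.31250],
    [0.09, 0.15625, 0.34375],
    [0.06, 0.12500, 0.37500],
    [0.07, 0.09375, 0.40625],
]
biomass = [500, 1000, 1500, 2000, 2500]  # kg DM/ha, illustrative only

best_r2, best_i, best_j = best_nri(spectra, biomass)
print(best_i, best_j, round(best_r2, 3))
```

With the red and NIR bands as the pair, the NRI is simply NDVI with the sign flipped, so the squared correlation is unaffected by band order and only unordered pairs need to be searched.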

In chapter three, a genetic algorithm was employed in a two-objective search procedure: to minimize the number of spectral bands while simultaneously maximizing model accuracy for crude protein estimation. This protocol was employed over different spectral ranges, namely VIS-NIR, SWIR and the Full-Spectrum range, while comparing achievable accuracies of two different metrics: crude protein as a dry matter fraction (% CP) or on a weight-per-area basis (kg CP/ha). Results suggest that, in outdoor environments, the best approach to estimate crude protein relies on its expression on a weight-per-area basis and that the VIS-NIR range alone can provide the best accuracies in both known and unseen locations.
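
The abstract does not say which genetic algorithm was used, so the sketch below shows only the idea that any two-objective search of this kind relies on: keeping the candidate band subsets that are not dominated on both objectives (fewer bands, lower error). The candidate tuples are made-up numbers for illustration and are not results from the thesis.

```python
def dominates(a, b):
    """True if candidate a is at least as good as b on both objectives
    and strictly better on at least one.

    Each candidate is (n_bands, error); both objectives are minimized.
    """
    no_worse = a[0] <= b[0] and a[1] <= b[1]
    strictly_better = a[0] < b[0] or a[1] < b[1]
    return no_worse and strictly_better


def pareto_front(candidates):
    """Return the candidates that no other candidate dominates."""
    return [
        c for c in candidates
        if not any(dominates(other, c) for other in candidates)
    ]


# Hypothetical (number of bands, prediction error) pairs.
candidates = [(2, 3.1), (3, 2.4), (3, 2.9), (5, 2.4), (6, 2.0), (8, 2.2)]
front = pareto_front(candidates)
print(front)
```

The surviving candidates describe the trade-off curve between sensor simplicity and accuracy; a genetic algorithm would generate and mutate the candidates, while a filter like this one decides which of them to keep.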

Chapter four presents a new approach for retrieval of a continuous spectral signature (550-790 nm) from discrete multispectral measurements (i.e., four bands, as per a commercially available multispectral camera) based on a piecewise function described by two parametric sub-functions. The retrieval of spectral signatures allowed for the generation of continuum-removed features and associated vegetation indices for prediction of biomass, previously reported in the literature as optimal indices for biomass estimation. These synthetic vegetation indices were compared against vegetation indices derived from the original band values. No significant improvement in performance was found, suggesting that underlying biological broadband absorption features (e.g. pigments, leaf-area and cell structure) were well described by both reflectance-based and continuum-removed features. Consequently, it is suggested that achievable accuracy is largely driven by an appropriate model fit between any spectral metrics of these broadband absorption features.
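
To make the notion of a continuum-removed feature concrete, here is a minimal sketch under simplifying assumptions. It starts from an already continuous signature (the piecewise retrieval function itself is not specified in the abstract and is not reproduced here) and uses a straight line between the two ends of the 550-790 nm range as the continuum, whereas a full continuum removal would normally use the upper convex hull of the spectrum. The reflectance values are hypothetical.

```python
def continuum_removed(wavelengths, reflectance):
    """Divide each reflectance by a straight-line continuum drawn
    between the first and last points of the spectral range."""
    w0, w1 = wavelengths[0], wavelengths[-1]
    r0, r1 = reflectance[0], reflectance[-1]
    result = []
    for w, r in zip(wavelengths, reflectance):
        continuum = r0 + (r1 - r0) * (w - w0) / (w1 - w0)
        result.append(r / continuum)
    return result


def band_depth(cr_values):
    """Band depth is one minus the continuum-removed reflectance."""
    return [1.0 - v for v in cr_values]


# Hypothetical canopy signature sampled at five wavelengths (nm).
wavelengths = [550, 610, 670, 730, 790]
reflectance = [0.10, 0.07, 0.04, 0.30, 0.50]

cr = continuum_removed(wavelengths, reflectance)
depth = band_depth(cr)
deepest = wavelengths[depth.index(max(depth))]
print(deepest, round(max(depth), 3))
```

In this toy signature the deepest point of the feature falls at 670 nm, in the red absorption region, and indices such as maximum band depth can then be derived from the `depth` values.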

Chapter five employed RPAS multispectral imagery, handheld spectral data and five distinct decision-rule regression techniques to validate the approach of biomass assessment employing a small subset of indices, while critically addressing the challenges in radiometric calibration, model interpretability, model deployment in an operational scenario, and model performance through different validation strategies. The five regression algorithms build upon the concept of regression trees, using techniques of bootstrap aggregation (i.e., bagging) and boosting, consequently increasing model complexity (e.g., number of trees, depth of trees) while decreasing overall interpretability.
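
The bagging idea mentioned above can be sketched in a few lines. This is a toy illustration only: it uses depth-one regression trees (stumps) on a single predictor, which is far simpler than the algorithms evaluated in the thesis, and the index and biomass values are hypothetical.

```python
import random


def fit_stump(xs, ys):
    """Fit a depth-one regression tree on one feature.

    Returns (threshold, left_mean, right_mean); threshold is None when
    all x values are identical and no split is possible.
    """
    pairs = sorted(zip(xs, ys))
    overall = sum(ys) / len(ys)
    best = (float("inf"), None, overall, overall)
    for k in range(1, len(pairs)):
        if pairs[k][0] == pairs[k - 1][0]:
            continue  # only split between distinct x values
        left = [y for _, y in pairs[:k]]
        right = [y for _, y in pairs[k:]]
        lm, rm = sum(left) / len(left), sum(right) / len(right)
        sse = sum((y - lm) ** 2 for y in left) + sum((y - rm) ** 2 for y in right)
        if sse < best[0]:
            threshold = (pairs[k - 1][0] + pairs[k][0]) / 2
            best = (sse, threshold, lm, rm)
    return best[1:]


def predict_stump(stump, x):
    threshold, left_mean, right_mean = stump
    if threshold is None:
        return left_mean
    return left_mean if x <= threshold else right_mean


def fit_bagged(xs, ys, n_trees=25, seed=0):
    """Fit one stump per bootstrap resample (sampling with replacement)."""
    rng = random.Random(seed)
    n = len(xs)
    stumps = []
    for _ in range(n_trees):
        idx = [rng.randrange(n) for _ in range(n)]
        stumps.append(fit_stump([xs[i] for i in idx], [ys[i] for i in idx]))
    return stumps


def predict_bagged(stumps, x):
    """Average the predictions of all stumps."""
    return sum(predict_stump(s, x) for s in stumps) / len(stumps)


# Hypothetical vegetation-index values and biomass (kg DM/ha).
index_values = [0.30, 0.35, 0.40, 0.45, 0.50, 0.60, 0.65, 0.70, 0.75, 0.80]
biomass_obs = [600, 700, 750, 900, 1000, 1800, 1900, 2100, 2200, 2400]

stumps = fit_bagged(index_values, biomass_obs)
low = predict_bagged(stumps, 0.30)
high = predict_bagged(stumps, 0.80)
```

A single stump can be read at a glance, whereas the averaged ensemble cannot, which is the interpretability trade-off the paragraph above refers to.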

The RPAS multispectral imagery was compared against handheld top-of-canopy spectral measurements, and significant inconsistencies were found between the reflectance values of the two sensors. Consequently, this chapter indicates the absence of well-defined and robust protocols for spectral data collection with commercial multispectral cameras. When calibrated through a thorough pipeline, multispectral data provided better results than handheld data, although the average margin was small (i.e., 60 kg).

When employing repeated k-fold cross-validation for models associated with the multispectral imagery, two algorithms (i.e., Cubist and Random Forest) performed equivalently, with an error of nearly 400 kg DM/ha, whereas the three remaining algorithms (i.e., bagged trees, boosted trees and CART) had an average error of 450 kg DM/ha. Both CART and Cubist are considered more interpretable, as they operate under a single regression-tree structure, which gives faster prediction, a smaller model size and the ability to scrutinize models. However, a temporal validation strategy showed low reliability of spectral models when validated outside their boundary conditions, with associated errors above 800 kg DM/ha, rendering predictions not useful and showcasing the shortcomings of performance claims from short-duration studies.
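
The difference between the two validation strategies can be sketched as follows. This is an assumption-laden illustration, not the thesis protocol: the fold count, the number of repeats, the cutoff date and the sampling dates are all hypothetical. Repeated k-fold reshuffles the same pool of observations, so training and test data share acquisition conditions, whereas a temporal split tests only on observations collected after a cutoff.

```python
import random
from math import sqrt


def repeated_kfold(n_samples, k=5, repeats=3, seed=0):
    """Yield (train_indices, test_indices) for k folds, repeated with
    a fresh shuffle each time."""
    rng = random.Random(seed)
    for _ in range(repeats):
        idx = list(range(n_samples))
        rng.shuffle(idx)
        folds = [idx[f::k] for f in range(k)]
        for f in range(k):
            train = [i for g in range(k) if g != f for i in folds[g]]
            yield train, folds[f]


def temporal_split(dates, cutoff):
    """Train on observations before the cutoff, test on the rest.

    Dates are ISO strings (YYYY-MM-DD), which sort chronologically.
    """
    train = [i for i, d in enumerate(dates) if d < cutoff]
    test = [i for i, d in enumerate(dates) if d >= cutoff]
    return train, test


def rmse(observed, predicted):
    """Root mean squared error, in the units of the observations."""
    n = len(observed)
    return sqrt(sum((o - p) ** 2 for o, p in zip(observed, predicted)) / n)


splits = list(repeated_kfold(10, k=5, repeats=3))

# Hypothetical sampling dates for four field campaigns.
dates = ["2019-03-01", "2019-04-15", "2019-09-10", "2019-10-20"]
train_idx, test_idx = temporal_split(dates, "2019-06-01")

print(len(splits), train_idx, test_idx)
print(rmse([1000, 2000], [1400, 1600]))
```

Fitting a model on each training set and reporting `rmse` on the matching test set gives the kind of error figures quoted above; the temporal split is the stricter test because the held-out campaigns may lie outside the conditions seen during training.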

In its Outlook, special emphasis is directed to the maximum achievable accuracies through the use of spectral data in outdoor environments, under the presence of confounding and masking effects derived from canopy geometry and illumination conditions, while stressing the need for rigorous protocols and quality assurance mechanisms for spectral data collection. Finally, this thesis identifies the main bottlenecks for the advancement of spectral imaging techniques in a farm-operational scenario, indicating possible advancement through the use of minimal sensing equipment, automated data collection, and faster, interpretable modelling techniques.

Original language: English
Qualification: Doctor of Philosophy
Awarding Institution
  • Wageningen University
  • Kooistra, Lammert, Promotor
  • Lucieer, A., Co-promotor, External person
  • Rawnsley, R., Co-promotor, External person
Award date: 19 May 2021
Place of Publication: Wageningen
Print ISBNs: 9789463957663
Publication status: Published - 2021

