Description
A new method was developed to classify the sensitivity of the sepals of recently harvested tomatoes to fungal infection, using spectral imaging and partial least squares discriminant analysis (PLSDA). In previous work by Brdar, S et al. (2021) and De Villiers, HAC et al. (2023), the influence of the variables was determined in the final model. In the present work, however, an iterative process selects a sparse subset of important variables before they are used by the final model. In this way, only a small subset of wavelengths needs to be measured in unseen samples.
Thirty-two ‘Cappricia’ tomatoes without any visible indications of fungal infection were imaged in two separate, equally sized groups. Hyperspectral images were recorded on day one using a Specim FX17 NIR linescan camera. Subsequently, the tomatoes were stored under controlled conditions that encourage fungal growth (20 °C, in a closed box reaching 100% relative humidity (RH), in a room at 60% RH, with lights on from 7:00 to 19:00 h at 15 μmol·s⁻¹·m⁻²).
Ground truth observations were made by experts on days three and four and consisted of severity scores from zero (no fungus) to three (severe infection). The ratings from the two days were averaged. First, outlier pixels were removed within each tomato using PCA. The remaining pixels belonging to the same sepal were then averaged, yielding 167 rows, one per sepal.
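The criterion used to flag outlier pixels is not stated here. As a rough sketch of one possible approach, the NumPy code below projects the pixel spectra of a sepal onto their first principal components, drops pixels whose standardized score distance exceeds a cut-off, and averages the rest. The function name, `n_components`, and `z_cut` are hypothetical choices for illustration, not values from the study.

```python
import numpy as np

def mean_spectrum_without_outliers(pixels, n_components=2, z_cut=3.0):
    """Average pixel spectra after removing PCA outliers.

    pixels: array of shape (n_pixels, n_wavelengths).
    n_components and z_cut are illustrative assumptions.
    Returns the mean spectrum and a boolean mask of retained pixels.
    """
    X = np.asarray(pixels, dtype=float)
    if X.shape[0] < 3:
        # Too few pixels to judge outliers; keep everything.
        return X.mean(axis=0), np.ones(X.shape[0], dtype=bool)
    centered = X - X.mean(axis=0)
    # PCA via SVD: the rows of Vt are the principal-component loadings.
    _, _, Vt = np.linalg.svd(centered, full_matrices=False)
    k = min(n_components, Vt.shape[0])
    scores = centered @ Vt[:k].T
    # Standardize the scores per component, then measure distance from the centre.
    std = scores.std(axis=0, ddof=1)
    std[std == 0] = 1.0
    dist = np.sqrt(((scores / std) ** 2).sum(axis=1))
    keep = dist <= z_cut
    return X[keep].mean(axis=0), keep
```

Applying such a function once per sepal and stacking the returned mean spectra would give one row per sepal, as in the 167-row table described above.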
Samples were assigned to two classes according to the visual scores: class 1 (negative) comprised ratings of 0.5 or less, and class 2 (positive) comprised ratings of 1 or greater. The data set was then randomly divided, by tomato, into calibration (70%) and validation (30%) sets. In addition to the raw data, several preprocessing methods were applied (Figure 1). Models were built on the calibration set using 11 to 40 variables selected by CovSel. The number of PLSDA latent variables was also optimized, by cross-validation grouped by tomato. Figure 1 shows the results of the different models and the pretreatments used. In all of them, the number of selected variables was also optimized.
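To illustrate the iterative variable selection, the following is a minimal NumPy sketch of covariance selection as it is commonly described: at each step the column of X with the largest squared covariance with the response is chosen, and X and Y are then deflated by projecting out that column. It is not the implementation used in this work; the function name `covsel` and its arguments are assumptions, and for a two-class problem Y would typically be a dummy-coded class vector or matrix.

```python
import numpy as np

def covsel(X, Y, n_select):
    """Return the indices of n_select columns of X picked by covariance selection.

    X: (n_samples, n_variables) spectra; Y: (n_samples,) or (n_samples, n_responses).
    """
    Xd = np.asarray(X, dtype=float)
    Yd = np.asarray(Y, dtype=float)
    if Yd.ndim == 1:
        Yd = Yd[:, None]
    # Centre both blocks so that cross-products are proportional to covariances.
    Xd = Xd - Xd.mean(axis=0)
    Yd = Yd - Yd.mean(axis=0)
    selected = []
    for _ in range(n_select):
        # Squared covariance of each (deflated) variable with the response(s).
        crit = ((Xd.T @ Yd) ** 2).sum(axis=1)
        if selected:
            crit[selected] = -np.inf  # never pick the same column twice
        j = int(np.argmax(crit))
        x = Xd[:, j]
        norm = x @ x
        if norm == 0:
            break  # nothing left to explain
        selected.append(j)
        # Deflate: remove from X and Y everything explained by the chosen column.
        Xd = Xd - np.outer(x, x @ Xd) / norm
        Yd = Yd - np.outer(x, x @ Yd) / norm
    return selected
```

In a workflow like the one described, `n_select` would be varied (here between 11 and 40) and the value giving the best cross-validated PLSDA performance retained.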
The important variables found in this work are (nm): 937, 944, 951, 971, 1089, 1152, 1306, 1356, 1391, 1440, 1540, 1675, 1704, 1711, 1718. The best results were obtained using raw data, the wavelengths listed above, and 3 latent variables in PLSDA. The model achieved a validation accuracy of 0.80. Sensitivity and specificity for class 1 were 0.62 and 0.91, respectively. The model thus shows potential as a fast alternative method for classifying recently harvested tomatoes before fungal infection becomes visible.
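For readers less familiar with these figures of merit, the short Python sketch below shows how accuracy, sensitivity, and specificity are conventionally computed for one target class from true and predicted labels. The function name and the labels in the usage lines are made up for illustration and are not data from this study.

```python
def binary_metrics(y_true, y_pred, target):
    """Accuracy, sensitivity and specificity, treating `target` as the class of interest."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == target and p == target)  # target found
    fn = sum(1 for t, p in pairs if t == target and p != target)  # target missed
    tn = sum(1 for t, p in pairs if t != target and p != target)  # other class kept out
    fp = sum(1 for t, p in pairs if t != target and p == target)  # other class let in
    return {
        "accuracy": (tp + tn) / len(pairs),
        "sensitivity": tp / (tp + fn),
        "specificity": tn / (tn + fp),
    }

# Toy labels (invented): three samples of class 1 and two of class 2.
toy = binary_metrics([1, 1, 1, 2, 2], [1, 1, 2, 2, 1], target=1)
```

Sensitivity is the fraction of target-class samples that are recognized as such, and specificity is the fraction of the other class that is correctly excluded, so the two values swap when the other class is taken as the target.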
Period | 20 Aug 2023 → 24 Aug 2023 |
---|---|
Event title | International Conference of Near Infrared Spectroscopy |
Event type | Conference/symposium |
Degree of Recognition | International |