Constructing models can be complicated when the available fitting data are highly correlated and of high dimension. However, the nature of these complications depends on whether the goal is prediction or estimation. We focus on predicting tree mortality (measured as the number of dead trees) from change metrics derived from Moderate Resolution Imaging Spectroradiometer (MODIS) satellite images. The high dimensionality and multicollinearity inherent in such data are of particular concern. Standard regression techniques perform poorly for such data, so we examine shrinkage regression techniques such as ridge regression, the LASSO, and partial least squares, which yield more robust predictions. We also suggest efficient strategies for selecting optimal models, such as the 0.632+ bootstrap and generalized cross-validation. The techniques are compared using simulations and then applied to predict the severity of insect-induced tree mortality in a Pinus radiata D. Don plantation in southern New South Wales, Australia, where their prediction performances are also compared. We find that shrinkage regression techniques outperform the standard methods, with ridge regression and the LASSO performing particularly well.
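To make the idea of shrinkage with penalty selection concrete, the following is a minimal sketch of ridge regression tuned by generalized cross-validation. It is not the authors' code: the use of Python/NumPy, the function name `ridge_gcv`, the simulated latent-factor data, and the penalty grid are all assumptions made for illustration. The predictors and response are centered so that the intercept is not penalized, and the GCV score is taken as (RSS/n) / (1 - df/n)^2, where df is the trace of the ridge hat matrix.

```python
import numpy as np

def ridge_gcv(X, y, lambdas):
    """Fit ridge regression over a grid of positive penalties; keep the GCV minimizer."""
    x_mean, y_mean = X.mean(axis=0), y.mean()
    Xc, yc = X - x_mean, y - y_mean          # centering leaves the intercept unpenalized
    U, s, Vt = np.linalg.svd(Xc, full_matrices=False)
    Uty = U.T @ yc
    n = len(y)
    best = None
    for lam in lambdas:
        shrink = s**2 / (s**2 + lam)         # shrinkage factor per singular direction
        fitted = U @ (shrink * Uty)          # fitted values of the centered response
        rss = np.sum((yc - fitted) ** 2)
        df = shrink.sum()                    # effective degrees of freedom, tr(H)
        gcv = (rss / n) / (1.0 - df / n) ** 2
        if best is None or gcv < best[0]:
            coef = Vt.T @ (s / (s**2 + lam) * Uty)
            best = (gcv, lam, coef)
    gcv, lam, coef = best
    intercept = y_mean - x_mean @ coef
    return lam, coef, intercept, gcv

# Hypothetical simulated data: 30 highly correlated predictors driven by 3 latent factors.
rng = np.random.default_rng(0)
n, p = 60, 30
latent = rng.normal(size=(n, 3))
X = latent @ rng.normal(size=(3, p)) + 0.1 * rng.normal(size=(n, p))
y = latent @ np.array([2.0, -1.0, 0.5]) + rng.normal(scale=0.5, size=n)

lambdas = np.logspace(-2, 3, 30)             # assumed penalty grid
lam, coef, intercept, gcv = ridge_gcv(X, y, lambdas)
print(f"selected lambda = {lam:.3g}, GCV = {gcv:.3g}")
```

The singular value decomposition is computed once and reused for every penalty, which is what makes GCV cheap relative to refitting the model on resampled data.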
- nonorthogonal problems
- hyperspectral data
- ridge regression
Lazaridis, D. C., Verbesselt, J., & Robinson, A. P. (2011). Penalized regression techniques for prediction: a case study for predicting tree mortality using remotely sensed vegetation indices. Canadian Journal of Forest Research, 41(1), 24-34. https://doi.org/10.1139/X10-180