Better interpretable models after correcting for natural variation: Residual approaches examined

Mike Koeman, Jasper Engel, Jeroen Jansen*, Lutgarde Buydens

*Corresponding author for this work

Research output: Contribution to journal › Article › Academic › peer-review


The interpretation of estimates of model parameters in terms of biological information is often just as important as the predictions of the model itself. In this study we consider the identification of metabolites in a possibly biologically heterogeneous case group that show abnormal patterns with respect to a set of (healthy) control observations. For this purpose, we filter normal (baseline) natural variation from the data by projecting the data onto a model of the control samples: the residual approach. This step should make the abnormal metabolites easier to highlight. Interpretation is, however, hindered by a problem we term the ‘residual bias’ effect, which may lead to the identification of the wrong metabolites as ‘abnormal’. This effect is related to the smearing effect. We propose to alleviate residual bias by considering a weighted average of the filtered and raw data. In this way, a compromise is found between excluding irrelevant natural variation from the data and the amount of residual bias that occurs. We show for simulated and real-world examples that this compromise may outperform inspection of the raw or filtered data. The method holds promise in numerous applications such as disease diagnosis, personalized healthcare, and industrial process control.
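
As a rough illustration of the idea described in the abstract, the residual approach and the weighted compromise might be sketched as follows. This is a minimal sketch under stated assumptions, not the authors' implementation: the abstract does not specify the control model or the form of the weighting, so the PCA control model (suggested only by the keywords), the centering on the control mean, the function names `fit_control_pca`, `residual_filter`, and `blend`, the weight `w`, and the synthetic data are all hypothetical.

```python
import numpy as np

def fit_control_pca(X_control, n_components):
    """Fit a PCA model on control samples (assumed form of the control model)."""
    mean = X_control.mean(axis=0)
    # SVD of the mean-centered controls; rows of Vt are the principal directions.
    _, _, Vt = np.linalg.svd(X_control - mean, full_matrices=False)
    P = Vt[:n_components].T  # loadings, shape (n_variables, n_components)
    return mean, P

def residual_filter(X, mean, P):
    """Remove the part of X explained by the control model (the 'filtered' data)."""
    Xc = X - mean
    return Xc - Xc @ P @ P.T

def blend(X, mean, P, w):
    """Weighted average of filtered and raw (control-centered) data.

    w = 1 gives the purely filtered data, w = 0 the raw data.
    The weight w is a hypothetical tuning parameter.
    """
    if not 0.0 <= w <= 1.0:
        raise ValueError("w must lie in [0, 1]")
    return w * residual_filter(X, mean, P) + (1.0 - w) * (X - mean)

# Synthetic data for illustration only: 6 variables driven by 2 latent factors.
rng = np.random.default_rng(0)
loadings = rng.normal(size=(2, 6))
X_control = rng.normal(size=(50, 2)) @ loadings + 0.1 * rng.normal(size=(50, 6))
X_case = rng.normal(size=(5, 2)) @ loadings + 0.1 * rng.normal(size=(5, 6))
X_case[:, 3] += 2.0  # artificial abnormality in one variable

mean, P = fit_control_pca(X_control, n_components=2)
X_blend = blend(X_case, mean, P, w=0.5)
print(X_blend.shape)
```

In this sketch, moving `w` from 0 to 1 trades retained natural variation against the residual bias that projection can introduce; how the weight would be chosen in practice is not stated in the abstract.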
Original language: English
Pages (from-to): 142-148
Journal: Chemometrics and Intelligent Laboratory Systems
Publication status: Published - 15 Mar 2018


Keywords:

  • Disease diagnosis
  • Interpretation
  • Metabolomics
  • PCA
  • Residuals
  • Smearing
