Penalized regression techniques for modeling relationships between metabolites and tomato taste attributes

P. Menendez, P. Eilers, Y.M. Tikunov, A.G. Bovy, F. van Eeuwijk

Research output: Contribution to journal, article, academic, peer-reviewed

3 Citations (Scopus)

Abstract

The search for models that link tomato taste attributes to metabolic profiles is a main challenge within breeding programs that aim to enhance tomato flavor. In this paper, we compared such models calculated by the traditional statistical approach, stepwise regression, with models obtained by the new generation of regression techniques known as penalized regression or regularization methods. In addition, for penalized regression, different scenarios and various model selection criteria were discussed, leading to the conclusion that classical cross-validation selects models with many superfluous variables, whereas model selection criteria such as the Bayesian information criterion seem more suitable when the goal is to find parsimonious models that explain tomato taste attributes based on metabolic information. An exhaustive comparison of the discussed methodology was done for six sensory traits, showing that the most important covariates were identified by stepwise regression as well as by some of the penalized regression methods, despite general disagreement between them on the size of the regression coefficients. In particular, for stepwise regression the coefficients are inflated due to their high variance, which is not the case with penalized regression, showing that this new methodology can be an alternative for obtaining more accurate models.
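
The idea of choosing a parsimonious penalized model with the Bayesian information criterion can be sketched as follows. This is a minimal illustration, not the authors' implementation: it assumes synthetic data in place of the metabolite measurements, a plain coordinate-descent lasso without an intercept, and the common approximation BIC = n log(RSS/n) + log(n) * df, where df is the number of nonzero coefficients. The function names, the penalty grid and the simulated coefficients are all hypothetical.

```python
import math
import random


def soft_threshold(z, t):
    """Shrink z toward zero by t; the core of the lasso coordinate update."""
    if z > t:
        return z - t
    if z < -t:
        return z + t
    return 0.0


def lasso_cd(X, y, lam, n_sweeps=100):
    """Coordinate descent for (1/(2n)) * RSS + lam * sum(|beta_j|), no intercept."""
    n, p = len(X), len(X[0])
    beta = [0.0] * p
    resid = list(y)  # residual y - X @ beta; equals y while beta is all zero
    col_ms = [sum(row[j] ** 2 for row in X) / n for j in range(p)]
    for _ in range(n_sweeps):
        for j in range(p):
            if col_ms[j] == 0.0:
                continue
            # Covariance of column j with the residual that excludes beta_j.
            rho = sum(X[i][j] * (resid[i] + X[i][j] * beta[j]) for i in range(n)) / n
            new_bj = soft_threshold(rho, lam) / col_ms[j]
            delta = new_bj - beta[j]
            if delta != 0.0:
                for i in range(n):
                    resid[i] -= X[i][j] * delta
                beta[j] = new_bj
    return beta, resid


def bic(resid, beta):
    """BIC approximation: n*log(RSS/n) + log(n)*df, with df = nonzero count."""
    n = len(resid)
    rss = sum(r * r for r in resid)
    df = sum(1 for b in beta if b != 0.0)
    return n * math.log(rss / n) + math.log(n) * df


# Synthetic stand-in data (hypothetical): 50 samples, 10 predictors, 3 active.
rng = random.Random(0)
n, p = 50, 10
true_beta = [3.0, -2.0, 1.5] + [0.0] * (p - 3)
X = [[rng.gauss(0.0, 1.0) for _ in range(p)] for _ in range(n)]
y = [sum(x * b for x, b in zip(row, true_beta)) + rng.gauss(0.0, 1.0) for row in X]

# Smallest penalty at which every coefficient is exactly zero.
lam_max = max(abs(sum(X[i][j] * y[i] for i in range(n))) / n for j in range(p))
lambdas = [lam_max * 0.7 ** k for k in range(15)]

# Fit along the penalty grid and keep the fit with the lowest BIC.
fits = [(lam,) + lasso_cd(X, y, lam) for lam in lambdas]
scores = [bic(resid, beta) for _, beta, resid in fits]
best = scores.index(min(scores))
best_lam, best_beta, _ = fits[best]
selected = [j for j, b in enumerate(best_beta) if b != 0.0]

print("lambda chosen by BIC:", round(best_lam, 4))
print("selected predictors:", selected)
```

On real data the predictors would normally be centered and scaled before fitting, and the same penalty grid could also be scored by cross-validation to compare how many predictors each criterion retains.
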
Original language: English
Pages (from-to): 379-387
Journal: Euphytica
Volume: 183
Issue number: 3
DOI: 10.1007/s10681-011-0374-5
Publication status: Published - 2012

Keywords

  • lycopersicon-esculentum
  • nonvolatile components
  • organoleptic quality
  • selection
  • flavor
  • volatiles
  • lasso
  • identification
  • cultivars
  • traits

Cite this

@article{53647b9a07d54ba987907b5f8a26e1fc,
title = "Penalized regression techniques for modeling relationships between metabolites and tomato taste attributes",
keywords = "lycopersicon-esculentum, nonvolatile components, organoleptic quality, selection, flavor, volatiles, lasso, identification, cultivars, traits",
author = "P. Menendez and P. Eilers and Y.M. Tikunov and A.G. Bovy and {van Eeuwijk}, F.",
year = "2012",
doi = "10.1007/s10681-011-0374-5",
language = "English",
volume = "183",
pages = "379--387",
journal = "Euphytica",
issn = "0014-2336",
publisher = "Springer Verlag",
number = "3",
}
