Statistical modeling is an inherent part of any metabolomics study. Statistical models are mainly used to assess the association between metabolites and the trait(s) of interest. Two factors complicate this: not all metabolites are related to the trait of interest, and modeling is hampered by the large number of metabolites, especially relative to the number of samples. To remedy this situation, several variable selection strategies that operate at different levels are discussed. Low-level variable selection focuses on removing noninformative or redundant metabolites. Medium-level variable selection involves methods that explicitly select a subset of the most predictive metabolites. Lastly, high-level variable selection entails statistical techniques that either select metabolites as part of their inner workings or indicate metabolite importance through an auxiliary criterion. Selecting metabolites at these different levels reduces the complexity of the problem, which helps both the statistical modeling and the subsequent interpretation of the results, and lets researchers focus on the most important metabolites that have a clear association with the trait under investigation.
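As a rough illustration of the three levels, the sketch below applies one stand-in technique per level to simulated data. The abstract does not name specific methods, so the choices here are assumptions made purely for illustration: a variance and correlation filter for the low level, a top-k ranking by absolute correlation with the trait for the medium level, and a lasso fitted by coordinate descent for the high level. All function names, thresholds, and the simulated data are hypothetical.

```python
import numpy as np

def low_level_filter(X, min_var=1e-8, max_abs_corr=0.95):
    """Low level (assumed technique): drop near-constant metabolites,
    then drop any metabolite nearly collinear with one already kept."""
    candidates = [j for j in range(X.shape[1]) if X[:, j].var() > min_var]
    kept = []
    for j in candidates:
        if all(abs(np.corrcoef(X[:, j], X[:, k])[0, 1]) < max_abs_corr for k in kept):
            kept.append(j)
    return kept

def medium_level_select(X, y, k):
    """Medium level (assumed technique): explicitly pick the k metabolites
    with the largest absolute correlation with the trait."""
    scores = np.array([abs(np.corrcoef(X[:, j], y)[0, 1]) for j in range(X.shape[1])])
    return sorted(np.argsort(scores)[::-1][:k].tolist())

def high_level_lasso(X, y, alpha, n_iter=200):
    """High level (assumed technique): lasso via coordinate descent on
    standardized, non-constant columns. Selection happens inside the model:
    metabolites with a nonzero coefficient are the selected ones."""
    n, p = X.shape
    Xs = (X - X.mean(axis=0)) / X.std(axis=0)
    yc = y - y.mean()
    beta = np.zeros(p)
    for _ in range(n_iter):
        for j in range(p):
            # Residual with metabolite j's current contribution added back.
            r = yc - Xs @ beta + Xs[:, j] * beta[j]
            rho = Xs[:, j] @ r / n
            # Soft-thresholding update (each standardized column has mean square 1).
            beta[j] = np.sign(rho) * max(abs(rho) - alpha, 0.0)
    return beta

# Simulated data (hypothetical): more metabolites than samples.
rng = np.random.default_rng(0)
n, p = 80, 120
X = rng.normal(size=(n, p))
X[:, 5] = 1.0                                   # noninformative: constant
X[:, 6] = X[:, 0] + 0.01 * rng.normal(size=n)   # redundant: near-copy of metabolite 0
y = 3.0 * X[:, 0] - 3.0 * X[:, 1] + 0.1 * rng.normal(size=n)

kept = low_level_filter(X)
X_kept = X[:, kept]

# Map positions in the filtered matrix back to original metabolite indices.
top_metabolites = [kept[j] for j in medium_level_select(X_kept, y, k=5)]
beta = high_level_lasso(X_kept, y, alpha=0.5)
lasso_selected = [kept[j] for j in np.flatnonzero(beta)]

print("kept after low-level filter:", len(kept), "of", p)
print("medium-level top 5:", top_metabolites)
print("high-level (lasso) selection:", lasso_selected)
```

In this sketch the levels are chained only for convenience: the low-level filter shrinks the candidate set, and the medium- and high-level steps are then run separately on the filtered matrix. Other methods could stand in at each level.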
|Title of host publication||Metabolomics Perspectives|
|Subtitle of host publication||From Theory to Practical Application|
|Publication status||Published - 18 Mar 2022|