Approaches to sample size determination for multivariate data: Applications to PCA and PLS-DA of omics data

Edoardo Saccenti*, Marieke E. Timmerman

*Corresponding author for this work

Research output: Contribution to journal › Article › Academic › peer-review

20 Citations (Scopus)

Abstract

Sample size determination is a fundamental step in the design of experiments. Methods for sample size determination are abundant for univariate analysis methods, but scarce in the multivariate case. Omics data are multivariate in nature and are commonly investigated using multivariate statistical methods, such as principal component analysis (PCA) and partial least-squares discriminant analysis (PLS-DA). No simple approaches to sample size determination exist for PCA and PLS-DA. In this paper we introduce important concepts and offer strategies for estimating the (minimally) required sample size when planning experiments to be analyzed using PCA and/or PLS-DA.

Original language: English
Pages (from-to): 2379-2393
Number of pages: 15
Journal: Journal of Proteome Research
Volume: 15
Issue number: 8
Publication status: Published - 2016

Keywords

  • covariance estimation
  • dimensionality
  • eigenvalue distribution
  • hypothesis testing
  • loading estimation
  • multivariate analysis
  • power analysis
  • random matrix theory
