An essential element of any strategy for non-targeted metabolomics analysis of complex biological extracts is the capacity to compare large numbers of samples. Because the most widely used technologies are all based on mass spectrometry (e.g. GC–MS, LC–MS), we must be able to compare large series of chromatographic mass spectra reliably and (semi-)automatically, so that compositional differences can be extracted in a statistically justifiable manner. In this paper we describe a novel approach for extracting relevant information from multiple full-scan metabolic profiles derived from LC–MS analyses. Specifically designed software has made it possible to combine all mass peaks on the basis of retention time and m/z values only, without prior identification, to produce a data matrix that can then be used for multivariate statistical analysis. To demonstrate the capacity of this approach, aqueous methanol extracts from potato tuber tissues of eight contrasting genotypes, harvested at two developmental stages, were used. Our results show that it is possible to reproducibly discover discriminatory mass peaks related both to the genetic origin of the material and to the developmental stage at which it was harvested. In addition, the limitations of the approach are explored through a careful evaluation of the alignment quality.
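The idea of combining unidentified mass peaks by retention time and m/z into a data matrix can be illustrated with a minimal sketch. This is not the software described in this paper: the greedy matching rule, the names `Peak`, `align_peaks` and `to_matrix`, the tolerance values, the example peaks, and the choice to fill missing peaks with zero are all assumptions made purely for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Peak:
    sample: str       # sample identifier (hypothetical)
    rt: float         # retention time, assumed to be in minutes
    mz: float         # mass-to-charge ratio
    intensity: float  # peak height or area


def align_peaks(peaks, rt_tol=0.1, mz_tol=0.01):
    """Greedy alignment (illustrative assumption, not the paper's algorithm).

    Each peak joins the first existing feature whose reference rt and m/z
    lie within the tolerances and which has no peak from that sample yet;
    otherwise it starts a new feature.
    """
    features = []
    for p in sorted(peaks, key=lambda q: (q.mz, q.rt)):
        for f in features:
            if (abs(f["mz"] - p.mz) <= mz_tol
                    and abs(f["rt"] - p.rt) <= rt_tol
                    and p.sample not in f["values"]):
                f["values"][p.sample] = p.intensity
                break
        else:
            # No matching feature found: this peak defines a new one.
            features.append({"rt": p.rt, "mz": p.mz,
                             "values": {p.sample: p.intensity}})
    return features


def to_matrix(features, samples, fill=0.0):
    """Rows are samples, columns are aligned features; absent peaks get `fill`."""
    return [[f["values"].get(s, fill) for f in features] for s in samples]


# Made-up example data for two samples.
peaks = [
    Peak("A", 5.02, 180.063, 1200.0),
    Peak("B", 5.05, 180.065, 900.0),
    Peak("A", 12.40, 341.108, 300.0),
    Peak("B", 5.04, 250.120, 50.0),
]
features = align_peaks(peaks)
matrix = to_matrix(features, ["A", "B"])
```

Each row of `matrix` is one sample and each column one aligned (rt, m/z) feature, which is the shape that multivariate methods typically expect. A greedy rule like this is sensitive to the chosen tolerances and to retention-time drift, which is one reason alignment quality needs to be evaluated.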