TY - JOUR
T1 - Bayesian approach to peak deconvolution and library search for high resolution gas chromatography - Mass spectrometry
AU - Barcaru, A.
AU - Mol, H.G.J.
AU - Tienstra, M.
AU - Vivó-Truyols, G.
PY - 2017
Y1 - 2017
N2 - A novel probabilistic Bayesian strategy is proposed to resolve highly coeluting peaks in high-resolution GC-MS (Orbitrap) data. As opposed to a deterministic approach, we propose to solve the problem probabilistically, using a complete pipeline. First, the retention time(s) for a (probabilistic) number of compounds for each mass channel are estimated. The statistical dependency between m/z channels was implied by including penalties in the model objective function. Second, the Bayesian Information Criterion (BIC) is used as Occam's razor for the probabilistic assessment of the number of components. Third, a probabilistic set of resolved spectra and their associated retention times are estimated. Finally, a probabilistic library search is proposed, computing the spectral match with a high-resolution library. More specifically, a correlative measure was used that included the uncertainties in the least-squares fitting, as well as the probability for different proposals for the number of compounds in the mixture. The method was tested on simulated high-resolution data, as well as on a set of pesticides injected in a GC-Orbitrap with high coelution. The proposed pipeline was able to accurately detect the retention times and the spectra of the peaks. In our case, with extremely high coelution, 5 of the 7 compounds present in the selected region of interest were correctly assessed. Finally, the comparison with the classical methods of deconvolution (i.e., MCR and AMDIS) indicates a better performance of the proposed algorithm in terms of the number of correctly resolved compounds.
AB - A novel probabilistic Bayesian strategy is proposed to resolve highly coeluting peaks in high-resolution GC-MS (Orbitrap) data. As opposed to a deterministic approach, we propose to solve the problem probabilistically, using a complete pipeline. First, the retention time(s) for a (probabilistic) number of compounds for each mass channel are estimated. The statistical dependency between m/z channels was implied by including penalties in the model objective function. Second, the Bayesian Information Criterion (BIC) is used as Occam's razor for the probabilistic assessment of the number of components. Third, a probabilistic set of resolved spectra and their associated retention times are estimated. Finally, a probabilistic library search is proposed, computing the spectral match with a high-resolution library. More specifically, a correlative measure was used that included the uncertainties in the least-squares fitting, as well as the probability for different proposals for the number of compounds in the mixture. The method was tested on simulated high-resolution data, as well as on a set of pesticides injected in a GC-Orbitrap with high coelution. The proposed pipeline was able to accurately detect the retention times and the spectra of the peaks. In our case, with extremely high coelution, 5 of the 7 compounds present in the selected region of interest were correctly assessed. Finally, the comparison with the classical methods of deconvolution (i.e., MCR and AMDIS) indicates a better performance of the proposed algorithm in terms of the number of correctly resolved compounds.
KW - Bayesian statistics
KW - Compound identification
KW - Deconvolution
KW - GC-Orbitrap data
KW - High resolution mass spectrometry
U2 - 10.1016/j.aca.2017.06.044
DO - 10.1016/j.aca.2017.06.044
M3 - Article
AN - SCOPUS:85021784428
SN - 0003-2670
VL - 983
SP - 76
EP - 90
JO - Analytica Chimica Acta
JF - Analytica Chimica Acta
ER -