The zip file contains the raw results of the 20-fold cross-validation, the raw data downloaded from GNPS on 01-11-2022, and all test data splits.
The raw results are stored in JSON format and cover MS2Query and the benchmarking methods (cosine, modified cosine and MS2DeepScore). Three values are stored for each test spectrum: the predicted score, the Tanimoto score between the correct annotation and the prediction, and a boolean indicating whether the prediction was an exact 2D structure match.
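As a rough illustration of how the three values per test spectrum could be used, the sketch below parses a JSON array of triples and summarizes the predictions above a score threshold. It assumes each result is a flat list of [predicted score, Tanimoto score, exact match] entries. The actual files in the zip may nest these by method or by data split, so the loading step may need adjusting. The function names and the example values are hypothetical and are not part of MS2Query.

```python
import json
from typing import Dict, List, Optional, Tuple

# (predicted score, Tanimoto score, exact 2D structure match)
Triple = Tuple[float, float, bool]


def load_triples(json_text: str) -> List[Triple]:
    """Parse a JSON array of three-value entries.

    Assumption: the top level is a list of [score, tanimoto, exact_match].
    """
    return [(float(s), float(t), bool(e)) for s, t, e in json.loads(json_text)]


def summarize(triples: List[Triple], threshold: float) -> Dict[str, Optional[float]]:
    """Summarize the predictions whose predicted score is at least `threshold`."""
    selected = [t for t in triples if t[0] >= threshold]
    # Fraction of test spectra that still get a prediction at this threshold.
    retained = len(selected) / len(triples) if triples else 0.0
    if not selected:
        return {"fraction_retained": retained,
                "exact_match_fraction": None,
                "mean_tanimoto": None}
    exact = sum(1 for t in selected if t[2]) / len(selected)
    mean_tanimoto = sum(t[1] for t in selected) / len(selected)
    return {"fraction_retained": retained,
            "exact_match_fraction": exact,
            "mean_tanimoto": mean_tanimoto}


# Made-up example values, for illustration only.
example = "[[0.9, 1.0, true], [0.8, 0.5, false], [0.3, 0.2, false], [0.6, 1.0, true]]"
summary = summarize(load_triples(example), threshold=0.5)
print(summary)
```

For a real file, the text passed to load_triples would come from reading one of the JSON files in the zip. Sweeping the threshold over a range of values gives the kind of trade-off between retained spectra and annotation quality that the figures in the paper show.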
The figures in the MS2Query paper can be reproduced using the functions in https://github.com/iomega/ms2query/blob/main/ms2query/benchmarking/create_accuracy_vs_recall_plot.py