In GC-MS-based metabolomics, comprehensive annotation of metabolite signals is required to describe a metabolic event occurring in a target organ. Many signals in raw metabolome data are identified based on unusually high similarity between their mass spectra and those of standards. Because noise in the observed spectra is inevitable, a list of identified metabolites includes some false positives. Evaluating the false discovery rate (FDR) of such lists is therefore essential to minimize misinterpretation of metabolome data. In this study, a novel method for assessing the statistical significance of mass spectral similarities was developed by employing modified BLAST (Karlin-Altschul) statistics. A similarity score between two mass spectra is calculated by a general scoring scheme, and the probability of obtaining that score by chance (P-value) is then calculated from it using the modified Karlin-Altschul statistics.
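As a rough illustration of the score-to-P-value step, the sketch below uses the standard, unmodified Karlin-Altschul form familiar from BLAST, in which the expected number of chance hits is E = K·m·n·exp(-λS) and P = 1 - exp(-E). The modification introduced in this study is not specified here, so this is not the authors' method. The function name, the parameter values, and the reading of m and n as the sizes of the query and library search spaces are assumptions made for illustration only.

```python
import math


def karlin_altschul_p_value(score, m, n, lam, k):
    """P-value of a similarity score under standard Karlin-Altschul statistics.

    score : similarity score S between two spectra
    m, n  : effective sizes of the two search spaces (hypothetical mapping
            to mass spectra; in BLAST these are query and database lengths)
    lam, k: statistical parameters of the scoring scheme (assumed to have
            been estimated beforehand)
    """
    # Expected number of chance matches scoring at least `score` (E-value).
    expected = k * m * n * math.exp(-lam * score)
    # P = 1 - exp(-E); expm1 keeps precision when E is very small.
    return -math.expm1(-expected)


# Illustrative parameter values only, not taken from the study.
p = karlin_altschul_p_value(score=50.0, m=100, n=1000, lam=0.3, k=0.1)
print(f"P-value: {p:.3e}")
```

Under this form the P-value falls exponentially as the score rises, which is what allows a score threshold to be translated into a significance level.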