Journal: Bioinformatics

Towards de novo identification of metabolites by analyzing tandem mass spectra.



Abstract

MOTIVATION: Mass spectrometry is among the most widely used technologies in proteomics and metabolomics. Being a high-throughput method, it produces large amounts of data that necessitate automated analysis of the spectra. Database search methods for protein analysis can easily be adapted to analyze metabolite mass spectra. But for metabolites, de novo interpretation of spectra is even more important than for protein data, because metabolite spectra databases cover only a small fraction of naturally occurring metabolites: even the model plant Arabidopsis thaliana has a large number of enzymes whose substrates and products remain unknown. The field of bioprospecting searches biologically diverse areas for metabolites that might serve as pharmaceuticals. De novo identification of metabolite mass spectra requires new concepts and methods since, unlike proteins, metabolites possess a non-linear molecular structure.

RESULTS: In this work, we introduce a method for fully automated de novo identification of metabolites from tandem mass spectra. Mass spectrometry data is usually assumed to be insufficient for identification of molecular structures, so we instead aim to determine the molecular formula of the unknown metabolite, a crucial step toward its identification. The method first calculates all molecular formulas that explain the parent peak mass. Then a graph is built in which vertices correspond to molecular formulas of all peaks in the fragmentation mass spectra and edges correspond to hypothetical fragmentation steps. The algorithm then computes the maximum-scoring subtree of this graph: each peak in the spectra must be scored at most once, so the subtree may contain at most one explanation per peak. Unfortunately, finding this subtree is NP-hard. We suggest three exact algorithms (including one fixed-parameter tractable, or FPT, algorithm) as well as two heuristics to solve the problem. Tests on real mass spectra show that the FPT algorithm and the heuristics solve the problem suitably fast and provide excellent results: for all 32 test compounds the correct solution was among the top five suggestions, and for 26 compounds the first suggestion of the exact algorithm was correct.

AVAILABILITY: http://www.bio.inf.uni-jena.de/tandemms
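
The first step named in the abstract, calculating all molecular formulas that explain the parent peak mass, can be illustrated with a small sketch. This is not the authors' implementation: the restriction to the elements C, H, N and O, the mass tolerance, and the function names `decompose` and `is_subformula` are assumptions made for illustration. The sketch also assumes the peak mass has already been converted to a neutral monoisotopic mass, and it applies no chemical plausibility filter. The sub-formula test is one plausible way to decide whether an edge (a hypothetical fragmentation step) may connect two formulas; the abstract does not specify the rule.

```python
# Monoisotopic masses in Da (standard values, rounded to six decimals).
MONO = {"C": 12.0, "H": 1.007825, "N": 14.003074, "O": 15.994915}


def decompose(mass, tol=0.005, elements=("C", "H", "N", "O")):
    """Enumerate every formula over `elements` whose monoisotopic mass
    lies within `tol` Da of `mass`. Formulas are dicts element -> count.
    Hypothetical helper: brute-force enumeration, no chemical filtering."""
    results = []

    def rec(i, remaining, counts):
        m = MONO[elements[i]]
        if i == len(elements) - 1:
            # Last element: the count is determined by the remaining mass.
            n = round(remaining / m)
            if n >= 0 and abs(remaining - n * m) <= tol:
                results.append(dict(zip(elements, counts + [n])))
            return
        n = 0
        # Try every count of this element that does not overshoot the mass.
        while n * m <= remaining + tol:
            rec(i + 1, remaining - n * m, counts + [n])
            n += 1

    rec(0, mass, [])
    return results


def is_subformula(child, parent):
    """Assumed edge rule: a fragment cannot contain more atoms of any
    element than the formula it is derived from."""
    return all(n <= parent.get(e, 0) for e, n in child.items())


if __name__ == "__main__":
    # Glucose, C6H12O6, has a monoisotopic mass of about 180.06339 Da.
    for formula in decompose(180.06339, tol=0.001):
        print(formula)
```

Even at a tight tolerance, several formulas may match a single mass, which is why the method scores candidate explanations of the fragment peaks rather than relying on the parent mass alone.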
