Conference: International Workshop on Algorithms in Bioinformatics

A Better Scoring Model for De Novo Peptide Sequencing: The Symmetric Difference Between Explained and Measured Masses

Abstract

Given a peptide as a string of amino acids, the masses of all its prefixes and suffixes can be found by a trivial linear scan through the amino acid masses. The inverse problem is the ideal de novo peptide sequencing problem: Given all prefix and suffix masses, determine the string of amino acids. In biological reality, the given masses are measured in a lab experiment, and measurements by necessity are noisy. The (real, noisy) de novo peptide sequencing problem therefore has a noisy input: a few of the prefix and suffix masses of the peptide are missing and a few others are given in addition. For this setting we ask for an amino acid string that explains the given masses as accurately as possible. Past approaches interpreted accuracy by searching for a string that explains as many masses as possible. We feel, however, that it is not only bad to not explain a mass that appears, but also to explain a mass that does not appear. That is, we propose to minimize the symmetric difference between the set of given masses and the set of masses that the string explains. For this new optimization problem, we propose an efficient algorithm that computes both the best and the k best solutions. Experiments on measurements of 342 synthesized peptides show that our approach leads to better results compared to finding a string that explains as many given masses as possible.
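The two scoring ideas contrasted in the abstract can be illustrated with a minimal sketch. It is not the authors' algorithm: it only scores a given candidate string and does not search for the best or k best strings. The mass table uses integer-rounded residue masses for a handful of amino acids, ion-type offsets and measurement tolerance are ignored, only proper prefixes and suffixes are counted, and all function names are hypothetical.

```python
from itertools import accumulate

# Integer-rounded residue masses (Da) for a few amino acids.
# Assumption for illustration only; real data needs exact masses and a tolerance.
RESIDUE_MASS = {"G": 57, "A": 71, "S": 87, "P": 97, "V": 99}


def explained_masses(peptide):
    """Set of proper prefix and suffix masses, found by a linear scan."""
    masses = [RESIDUE_MASS[aa] for aa in peptide]
    prefixes = list(accumulate(masses))[:-1]            # drop the full-peptide mass
    suffixes = list(accumulate(reversed(masses)))[:-1]  # same scan from the other end
    return set(prefixes) | set(suffixes)


def explained_count(peptide, measured):
    """Earlier criterion: number of measured masses the string explains (maximize)."""
    return len(explained_masses(peptide) & set(measured))


def symmetric_difference_score(peptide, measured):
    """Proposed criterion: unexplained measured masses plus explained masses
    that were not measured (minimize)."""
    return len(explained_masses(peptide) ^ set(measured))


if __name__ == "__main__":
    measured = {57, 87, 128, 300}  # hypothetical spectrum: 158 missing, 300 is noise
    for candidate in ("GAS", "AGS"):
        print(candidate,
              explained_count(candidate, measured),
              symmetric_difference_score(candidate, measured))
```

For the hypothetical spectrum above, the candidate `GAS` leaves one measured mass unexplained (300) and explains one mass that was not measured (158), so its symmetric difference is 2.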
