Journal: Current Genomics

Review of Common Sequence Alignment Methods: Clues to Enhance Reliability


Abstract

Today, in many areas of molecular biology, sequence alignment has become an essential tool for studying the structure-function relationships of proteins. With the impressive increase in the number of available sequences, alignments provide a substantial amount of information by way of various computational methods. These approaches have become a crucial tool for putting forward working hypotheses ahead of time-consuming bench work such as protein engineering and site-directed mutagenesis. However, alignment methods remain far from perfect. All methods are severely limited in the twilight zone, which lies around 25% identity between pairs of sequences. More worrying is the very high rate of false positive results generated by most algorithms, which depend on empirical parameters and are hard to validate by statistical criteria.

After reviewing the main methods, this paper draws the user's attention to the fact that evaluations of algorithm performance are limited entirely to alignment power (sensitivity). With reference to a given truth defined from the alignment of known structures, the power is defined as the proportion of the truth that is restored in the solution. The power may be overestimated because of a lack of independent sets of poorly related sequences, and its value depends entirely on the criterion used to define the truth. Confidence (selectivity), on the other hand, represents the proportion of the solution that is true. Depending on the method and the parameters used, confidence may be much lower than power, and it is rarely evaluated. For non-trivial alignments, when the power is high the confidence is low, which means that correctly aligned positions are embedded in large regions that are wrongly aligned.

One possible solution to these problems is to use a consensus of several multiple alignment methods, which will increase the confidence of the results. Adding external information, such as predicted secondary structure and/or predicted solvent accessibility, is another way that should improve the performance of existing multiple alignment methods.
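The definitions of power and confidence above can be illustrated with a minimal Python sketch. It assumes that aligned residue pairs are the unit of comparison, which the abstract does not specify, and that the reference "truth" aligns only the positions it is sure about. The function names (`aligned_pairs`, `power_and_confidence`, `consensus_pairs`) and the toy sequences are hypothetical and are not taken from the paper.

```python
from collections import Counter
from itertools import combinations


def aligned_pairs(alignment):
    """Return the set of residue pairs that share a column.

    `alignment` is a list of equal-length gapped strings ('-' is a gap).
    A residue is identified as (sequence index, position in the ungapped
    sequence), so pairs are comparable across different alignments.
    """
    counters = [0] * len(alignment)
    pairs = set()
    for column in zip(*alignment):
        residues = []
        for seq_index, symbol in enumerate(column):
            if symbol != "-":
                residues.append((seq_index, counters[seq_index]))
                counters[seq_index] += 1
        pairs.update(combinations(residues, 2))
    return pairs


def power_and_confidence(solution, truth):
    """Power: share of the truth found in the solution.
    Confidence: share of the solution that is in the truth."""
    sol, ref = aligned_pairs(solution), aligned_pairs(truth)
    correct = len(sol & ref)
    power = correct / len(ref) if ref else 0.0
    confidence = correct / len(sol) if sol else 0.0
    return power, confidence


def consensus_pairs(alignments, min_votes):
    """Keep only the pairs proposed by at least `min_votes` alignments
    (one simple way to build a consensus; an assumption of this sketch)."""
    votes = Counter()
    for alignment in alignments:
        votes.update(aligned_pairs(alignment))
    return {pair for pair, count in votes.items() if count >= min_votes}


# Toy data: the reference aligns only the first residue of each sequence.
truth = ["ACGT--", "A---GT"]
method_a = ["ACGT", "A-GT"]
method_b = ["ACGT", "AG-T"]

power, confidence = power_and_confidence(method_a, truth)
print(f"power={power:.2f} confidence={confidence:.2f}")
print(sorted(consensus_pairs([method_a, method_b, truth], min_votes=3)))
```

In this toy case the solution recovers every reference pair, so its power is maximal, yet most of the pairs it proposes are not in the reference, so its confidence is low. Requiring agreement among all three alignments discards the unsupported pairs, which is the intuition behind the consensus approach.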
