Conference: IFIP World Computer Congress

Statistical Method of Context Evaluation for Biological Sequence Similarity

Abstract

In this paper we propose and test a new strategy for detecting and measuring similarity between protein sequences. Our approach has its roots in computational linguistics and the related techniques for quantifying and comparing content in strings of characters. The pairwise comparison of proteins relies on the content regularities expected to uniquely characterize each sequence. These regularities are captured by n-gram based modelling techniques and exploited by cross-entropy related measures. In this new attempt to bring theoretical ideas from computational linguistics into bioinformatics, we experimented with two implementations, with the ultimate goal of developing practical, computationally efficient algorithms for expressing protein similarity. The experimental analysis reported here provides evidence for the usefulness of the proposed approach and motivates the further development of linguistics-related tools as a means of analysing biological sequences.
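
The abstract does not specify the n-gram model or the exact cross-entropy measure, so the following is only a minimal sketch of the general idea, not the authors' implementation. It assumes a conditional n-gram model with add-one smoothing over the 20 standard amino acids; the function names `train_ngram_model` and `cross_entropy`, the default `n=2`, and the toy sequences are all hypothetical choices made for illustration.

```python
import math
from collections import Counter


def train_ngram_model(reference, n):
    """Count n-grams and their (n-1)-character contexts in a reference sequence."""
    grams = Counter(reference[i:i + n] for i in range(len(reference) - n + 1))
    contexts = Counter()
    for gram, count in grams.items():
        contexts[gram[:-1]] += count
    return grams, contexts


def cross_entropy(reference, query, n=2, alphabet_size=20):
    """Average bits per n-gram of `query` under a model trained on `reference`.

    Assumption: add-one (Laplace) smoothing over an alphabet of 20 amino
    acids. Lower values suggest the query shares more local regularities
    with the reference.
    """
    grams, contexts = train_ngram_model(reference, n)
    num_grams = len(query) - n + 1
    if num_grams <= 0:
        raise ValueError("query is shorter than n")
    total_bits = 0.0
    for i in range(num_grams):
        gram = query[i:i + n]
        # Smoothed estimate of P(last residue | previous n-1 residues).
        prob = (grams[gram] + 1) / (contexts[gram[:-1]] + alphabet_size)
        total_bits -= math.log2(prob)
    return total_bits / num_grams


# Toy sequences, made up for illustration only.
seq_a = "MKVLAAGIVGL" * 2
seq_b = "WWWWHHHHPPPPWWWWHHHH"

self_score = cross_entropy(seq_a, seq_a)
other_score = cross_entropy(seq_a, seq_b)
print(f"a vs a: {self_score:.3f} bits, a vs b: {other_score:.3f} bits")
```

In this sketch a sequence scored against its own model gets a lower cross-entropy than an unrelated sequence, which is the sense in which such a measure can express similarity. The measure is asymmetric in its two arguments, so a practical comparison might need to combine both directions.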
