Source: Computational and Structural Biotechnology Journal

Extended many-item similarity indices for sets of nucleotide and protein sequences

Abstract

Quantification of similarities between protein sequences or DNA/RNA strands is a (sub-)task that is ubiquitously present in bioinformatics workflows, and is usually accomplished by pairwise comparisons of sequences, utilizing simple (e.g. percent identity) or more intricate concepts (e.g. substitution scoring matrices). Complex tasks (such as clustering) rely on a large number of pairwise comparisons under the hood, instead of a direct quantification of set similarities. Based on our recently introduced framework that enables multiple comparisons of binary molecular fingerprints (i.e., direct calculation of the similarity of fingerprint sets), here we introduce novel symmetric similarity indices for analogous calculations on sets of character sequences with more than two (t) possible items (e.g. DNA/RNA sequences with t = 4, or protein sequences with t = 20). The features of these new indices are studied in detail with analysis of variance (ANOVA), and demonstrated with three case studies of protein/DNA sequences with varying degrees of similarity (or evolutionary proximity). The Python code for the extended many-item similarity indices is publicly available at: https://github.com/ramirandaq/tn_Comparisons.
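
To make the pairwise baseline mentioned in the abstract concrete, the following is a minimal Python sketch of percent identity between two pre-aligned sequences, and of averaging it over every pair in a set. It is not the extended many-item index from the paper or from the tn_Comparisons repository; the function names `percent_identity` and `mean_pairwise_identity` are hypothetical, and the sketch assumes the sequences are already aligned to equal length.

```python
from itertools import combinations


def percent_identity(a: str, b: str) -> float:
    """Percentage of aligned positions at which two sequences carry the same character.

    Assumption: both sequences are already aligned and have equal, non-zero length.
    """
    if len(a) != len(b):
        raise ValueError("sequences must be pre-aligned to equal length")
    if not a:
        raise ValueError("sequences must be non-empty")
    matches = sum(x == y for x, y in zip(a, b))
    return 100.0 * matches / len(a)


def mean_pairwise_identity(seqs: list[str]) -> float:
    """Average percent identity over all unordered pairs in a set of sequences.

    For n sequences this performs n * (n - 1) / 2 comparisons, which is the
    quadratic cost of describing a whole set through pairwise comparisons.
    """
    pairs = list(combinations(seqs, 2))
    if not pairs:
        raise ValueError("need at least two sequences")
    return sum(percent_identity(a, b) for a, b in pairs) / len(pairs)


# Toy DNA example (t = 4 possible characters per position).
toy_set = ["ACGT", "ACGA", "ACCA"]
toy_mean = mean_pairwise_identity(toy_set)
```

With three sequences this already needs three comparisons, and the count grows quadratically with the size of the set; that cost is what motivates quantifying the similarity of a set directly.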