Conference: 2015 IEEE 2nd International Conference on Recent Trends in Information Systems

A species clustering method based on variation of molecular data with the aid of variance proportion



Abstract

To infer evolutionary relationships and reconstruct phylogenetic trees, evolutionists generally employ one of two approaches: character-based or distance-based. Because character-based methods can be computationally very expensive, researchers have to rely on estimation methods with practical run times. In this context, distance-based methods are considerably faster because they operate on distance matrices. In computational biology, sequence comparison, which seeks to find similar sequences, is of fundamental importance. Many techniques have been developed to calculate a suitable distance measure among DNA sequences; however, they are used almost exclusively to build distance matrices, and they usually do not use models of evolution. In this paper, a novel technique based on the mathematical calculation of variance is proposed to show how similar the gene sequences in a group are to one another. In this strategy, we use the mathematical formula of variance to obtain the average of the differences among all sequences of a specific set (called a cluster). Eventually, all sequences with variation lower than a predefined variance are clustered into groups, each of which contains a phylogenetic tree. We believe that our method, despite its simple design, could serve as a logical criterion for clustering DNA sequences, and that it could also prove useful as a simple technique for building distance-based phylogenetic networks, especially when the number of input sequences is large.
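
The abstract does not give the exact distance measure, variance formula, or grouping procedure, so the following is only a rough sketch of the general idea under stated assumptions, not the authors' method. It assumes pre-aligned, equal-length sequences, uses the p-distance (fraction of differing positions) as the difference between two sequences, and measures the spread of a cluster as the sum of squared pairwise distances divided by n^2, which reduces to the ordinary population variance when the items are plain numbers. The greedy assignment loop and the names p_distance, cluster_variance, greedy_cluster, and max_variance are all hypothetical.

```python
from itertools import combinations


def p_distance(a: str, b: str) -> float:
    """Fraction of aligned positions at which two sequences differ (assumed measure)."""
    if len(a) != len(b):
        raise ValueError("sequences must be aligned to the same length")
    return sum(x != y for x, y in zip(a, b)) / len(a)


def cluster_variance(seqs: list[str]) -> float:
    """Variance-like spread of a cluster (an assumption, not the paper's formula).

    Computed as the sum of squared pairwise distances divided by n**2.
    For plain numbers with d(x, y) = |x - y| this is the population variance.
    """
    n = len(seqs)
    if n < 2:
        return 0.0
    total = sum(p_distance(a, b) ** 2 for a, b in combinations(seqs, 2))
    return total / (n * n)


def greedy_cluster(seqs: list[str], max_variance: float) -> list[list[str]]:
    """Put each sequence in the first cluster whose spread stays within the threshold."""
    clusters: list[list[str]] = []
    for s in seqs:
        for c in clusters:
            if cluster_variance(c + [s]) <= max_variance:
                c.append(s)
                break
        else:
            # No existing cluster can absorb s without exceeding the threshold.
            clusters.append([s])
    return clusters


if __name__ == "__main__":
    example = ["ACGTACGT", "ACGTACGA", "TTTTCCCC", "TTTTCCCG"]
    for group in greedy_cluster(example, max_variance=0.02):
        print(group)
```

The result of a greedy pass like this depends on the input order, and the sketch stops at grouping: building a phylogenetic tree inside each group, as the abstract describes, would be a separate step that is not shown here.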
