Current Science: A Fortnightly Journal of Research

Descriptors based on information theory for numerical characterization of DNA sequences


Abstract

Descriptors based on information content (IC) are introduced to characterize nucleotide sequences. The descriptors extend Shannon IC and are denoted ICr, where the order r = 1, 2, ..., n refers to the probability distribution of DNA strings of length r. Sequence IC (SICr) and complementary IC (CSICr) are also introduced. IC saturates, reaching a maximum after a few orders, and the order (string length) at which IC is maximal for a given sequence depends on the length of that sequence. The effectiveness of the new descriptors in comparing the similarity of DNA sequences was evaluated by phylogenetic analyses of the first exons of 14 beta-globin genes and the complete coding sequences of 20 beta-globin genes. Dendrograms obtained using the descriptors were comparable to the classification of organisms according to the evolutionary tree. ICr, SICr and CSICr can be calculated with little computation time, even for very long DNA sequences.
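The abstract does not give formulas, so the following is only a minimal sketch of what an order-r information content might look like, not the authors' implementation. It assumes ICr is the Shannon entropy (in bits) of the empirical distribution of overlapping length-r substrings; the function name `ic_r` and the choice of overlapping windows are assumptions made for illustration. SICr and CSICr are not defined in the abstract and are deliberately left out.

```python
from collections import Counter
from math import log2


def ic_r(seq: str, r: int) -> float:
    """Shannon entropy, in bits, of the overlapping length-r substrings of seq.

    Assumption: this stands in for the ICr descriptor; the paper's exact
    definition (windowing, normalization) is not stated in the abstract.
    """
    seq = seq.upper()
    if r < 1 or r > len(seq):
        raise ValueError("r must satisfy 1 <= r <= len(seq)")
    # Count every overlapping substring of length r.
    counts = Counter(seq[i:i + r] for i in range(len(seq) - r + 1))
    total = sum(counts.values())
    # H = -sum(p * log2(p)) over the observed substrings.
    return -sum((c / total) * log2(c / total) for c in counts.values())


if __name__ == "__main__":
    example = "ATGGTGCACCTGACTCCTGAGGAGAAGTCT"  # arbitrary illustrative string
    for order in range(1, 7):
        print(order, round(ic_r(example, order), 3))
```

Under these assumptions, a sequence of length L has only L - r + 1 windows, so the entropy can never exceed log2(min(4^r, L - r + 1)). That bound rises with r at first and is then capped by the sequence length, which is consistent with the saturation behaviour the abstract describes.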
