Journal: Bioinformatics

Nucleotide composition string selection in HIV-1 subtyping using whole genomes

Abstract

Motivation: The availability of the whole genomic sequences of HIV-1 viruses provides an excellent resource for studying the HIV-1 phylogenies using all the genetic materials. However, such huge volumes of data create computational challenges in both memory consumption and CPU usage.

Results: We propose the complete composition vector representation for an HIV-1 strain, and a string scoring method to extract the nucleotide composition strings that contain the richest evolutionary information for phylogenetic analysis. In this way, a large-scale whole genome phylogenetic analysis for thousands of strains can be done both efficiently and effectively. By using 42 carefully curated strains as references, we apply our method to subtype 1156 HIV-1 strains (10.5 million nucleotides in total), which include 825 pure subtype strains and 331 recombinants. Our results show that our nucleotide composition string selection scheme is computationally efficient, and is able to define both pure subtypes and recombinant forms for HIV-1 strains using the 5000 top ranked nucleotide strings.
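The general idea of representing each strain by the frequencies of its nucleotide strings and then keeping only the most informative strings can be sketched as follows. This is a minimal illustration, not the paper's method: the abstract does not specify the scoring function, so the sketch assumes a simple stand-in (variance of a string's frequency across strains). The names `composition_vector`, `rank_strings`, and the parameters `max_k` and `top_n` are hypothetical, and the sequences are toy data.

```python
from collections import Counter
from statistics import pvariance


def composition_vector(seq, max_k):
    """Relative frequency of every nucleotide string of length 1..max_k in seq."""
    vec = {}
    for k in range(1, max_k + 1):
        n_windows = len(seq) - k + 1
        if n_windows <= 0:
            break  # sequence is shorter than k; no longer strings exist
        counts = Counter(seq[i:i + k] for i in range(n_windows))
        for s, c in counts.items():
            vec[s] = c / n_windows
    return vec


def rank_strings(vectors, top_n):
    """Return the top_n strings by score.

    ASSUMPTION: the score is the population variance of a string's
    frequency across strains, a stand-in for the paper's scoring method,
    which the abstract does not describe.
    """
    all_strings = set().union(*vectors)
    scores = {
        s: pvariance([v.get(s, 0.0) for v in vectors]) for s in all_strings
    }
    # Sort by descending score, breaking ties alphabetically for determinism.
    return sorted(scores, key=lambda s: (-scores[s], s))[:top_n]


if __name__ == "__main__":
    # Toy sequences, not real HIV-1 genomes.
    strains = ["ACGTACGTAC", "ACGTTCGTAC", "GGGTACGGAC"]
    vectors = [composition_vector(s, max_k=3) for s in strains]
    print(rank_strings(vectors, top_n=5))
```

Restricting downstream distance calculations to the selected strings is what keeps memory and CPU costs manageable when the number of strains grows; in the paper's setting the selected set would be the 5000 top ranked strings.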
