Conference: International Conference on Research in Computational Molecular Biology

CompostBin: A DNA Composition-Based Algorithm for Binning Environmental Shotgun Reads


Abstract

A major hindrance to studies of microbial diversity has been that the vast majority of microbes cannot be cultured in the laboratory and thus are not amenable to traditional methods of characterization. Environmental shotgun sequencing (ESS) overcomes this hurdle by sequencing the DNA from the organisms present in a microbial community. The interpretation of this metagenomic data can be greatly facilitated by associating every sequence read with its source organism. We report the development of CompostBin, a DNA composition-based algorithm for analyzing metagenomic sequence reads and distributing them into taxon-specific bins. Unlike previous methods that seek to bin assembled contigs and often require training on known reference genomes, CompostBin has the ability to accurately bin raw sequence reads without need for assembly or training. CompostBin uses a novel weighted PCA algorithm to project the high dimensional DNA composition data into an informative lower-dimensional space, and then uses the normalized cut clustering algorithm on this filtered data set to classify sequences into taxon-specific bins. We demonstrate the algorithm's accuracy on a variety of low to medium complexity data sets.
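The pipeline described in the abstract (composition features, projection to a lower-dimensional space, normalized cut clustering) can be illustrated with a minimal sketch. This is not the authors' implementation, and several parts are assumptions made for illustration: it uses trinucleotide frequencies as the composition feature, ordinary unweighted PCA in place of the paper's weighted PCA, and a single two-way split computed from the spectral relaxation of the normalized cut. The function names and the synthetic reads generated by `simulate_reads` are hypothetical.

```python
import itertools
import numpy as np

def kmer_frequencies(read, k=3):
    """Normalized k-mer frequency vector of one read (k=3 is an assumption)."""
    kmers = ["".join(p) for p in itertools.product("ACGT", repeat=k)]
    index = {kmer: i for i, kmer in enumerate(kmers)}
    counts = np.zeros(len(kmers))
    for i in range(len(read) - k + 1):
        j = index.get(read[i:i + k])  # k-mers with ambiguous bases are skipped
        if j is not None:
            counts[j] += 1
    total = counts.sum()
    return counts / total if total > 0 else counts

def pca_project(X, n_components=3):
    """Plain (unweighted) PCA via SVD; stands in for the paper's weighted PCA."""
    Xc = X - X.mean(axis=0)
    _, _, Vt = np.linalg.svd(Xc, full_matrices=False)
    return Xc @ Vt[:n_components].T

def normalized_cut_bipartition(Y, sigma=None):
    """Two-way split from the spectral relaxation of the normalized cut."""
    # Pairwise squared Euclidean distances between projected reads.
    d2 = ((Y[:, None, :] - Y[None, :, :]) ** 2).sum(axis=-1)
    if sigma is None:
        sigma = np.sqrt(np.median(d2[d2 > 0]))  # heuristic scale, an assumption
    W = np.exp(-d2 / (2.0 * sigma ** 2))        # Gaussian similarity graph
    np.fill_diagonal(W, 0.0)
    d_inv_sqrt = 1.0 / np.sqrt(W.sum(axis=1))
    # Symmetric normalized Laplacian: I - D^{-1/2} W D^{-1/2}.
    L = np.eye(len(Y)) - d_inv_sqrt[:, None] * W * d_inv_sqrt[None, :]
    _, vecs = np.linalg.eigh(L)                 # eigenvalues in ascending order
    # Second-smallest eigenvector, mapped back to the generalized problem.
    fiedler = d_inv_sqrt * vecs[:, 1]
    return (fiedler > 0).astype(int)

def simulate_reads(gc, n, length, rng):
    """Synthetic reads with a given GC content (toy stand-in for real data)."""
    p = [(1 - gc) / 2, gc / 2, gc / 2, (1 - gc) / 2]  # A, C, G, T
    return ["".join(rng.choice(list("ACGT"), size=length, p=p)) for _ in range(n)]

if __name__ == "__main__":
    rng = np.random.default_rng(0)
    reads = simulate_reads(0.3, 20, 500, rng) + simulate_reads(0.7, 20, 500, rng)
    X = np.array([kmer_frequencies(r) for r in reads])
    labels = normalized_cut_bipartition(pca_project(X))
    print(labels)
```

In this toy setting the two simulated sources differ strongly in GC content, so the split is easy. Real metagenomic reads are shorter and noisier, and binning more than two taxa would require applying the cut repeatedly or using several eigenvectors.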
