Journal: 《软件学报》 (Journal of Software)

A Neighbor-Seed Heuristic Clustering Method for 454 Sequences

Abstract

Advances in second-generation sequencing have produced massive volumes of 16S rRNA gene sequence data. How to mine the genomic information hidden in these data effectively is both a focus and a difficulty of current research. Sequence clustering studies how to group together sequences that originate from the same species, and it forms the basis for studying the species, structural, and functional diversity of microbial communities. Targeting the characteristic sources of 454 sequencing error, this work proposes a heuristic sequence clustering algorithm based on neighbor seed sequences, named NbHClust. Experimental results show that the algorithm is robust. Compared with traditional heuristic sequence clustering algorithms, it reduces the overestimation of operational taxonomic units (OTUs), improves clustering accuracy, and computes OTUs effectively.
