Conference: International Conference on Research in Computational Molecular Biology
De Novo Clustering of Long-Read Transcriptome Data Using a Greedy, Quality-Value Based Algorithm

Abstract

Long-read sequencing of transcripts with PacBio Iso-Seq and Oxford Nanopore Technologies has proven to be central to the study of complex isoform landscapes in many organisms. However, current de novo transcript reconstruction algorithms from long-read data are limited, leaving the potential of these technologies unfulfilled. A common bottleneck is the dearth of scalable and accurate algorithms for clustering long reads according to their gene family of origin. To address this challenge, we develop isONclust, a clustering algorithm that is greedy (in order to scale) and makes use of quality values (in order to handle variable error rates). We test isONclust on three simulated and five biological datasets, across a breadth of organisms, technologies, and read depths. Our results demonstrate that isONclust is a substantial improvement over previous approaches in terms of overall accuracy, scalability to large datasets, or both. Our tool is available at https://github.com/ksahlin/isONclust.
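The abstract names two ideas: a greedy pass over the reads (for scalability) and the use of quality values (to cope with variable error rates). The Python sketch below is only a toy illustration of how these two ideas can combine; it is not the isONclust implementation. Everything in it is an assumption made for illustration: the function names, the use of plain k-mers, the most-reliable-first read ordering, and the parameters `k`, `min_q`, and `min_shared`.

```python
from typing import List, Set, Tuple

Read = Tuple[str, str, List[int]]  # (name, sequence, Phred quality values)


def phred_to_error(q: int) -> float:
    """Convert a Phred quality value into an error probability."""
    return 10 ** (-q / 10)


def expected_error_rate(quals: List[int]) -> float:
    """Mean per-base error probability implied by the quality values."""
    return sum(phred_to_error(q) for q in quals) / max(1, len(quals))


def confident_kmers(seq: str, quals: List[int], k: int, min_q: int) -> Set[str]:
    """K-mers in which every base has quality >= min_q (hypothetical filter)."""
    return {
        seq[i:i + k]
        for i in range(len(seq) - k + 1)
        if min(quals[i:i + k]) >= min_q
    }


def greedy_cluster(reads: List[Read], k: int = 5, min_q: int = 10,
                   min_shared: float = 0.5) -> List[List[str]]:
    """Toy greedy clustering: one pass, each read joins the best existing
    cluster representative or starts a new cluster."""
    # Assumption: visit the most reliable reads first so they become representatives.
    order = sorted(range(len(reads)), key=lambda i: expected_error_rate(reads[i][2]))
    representatives: List[Set[str]] = []  # k-mer set of each cluster's first read
    clusters: List[List[str]] = []        # read names per cluster
    for i in order:
        name, seq, quals = reads[i]
        kmers = confident_kmers(seq, quals, k, min_q)
        best, best_frac = None, 0.0
        for c, rep in enumerate(representatives):
            # Fraction of this read's confident k-mers found in the representative.
            frac = len(kmers & rep) / len(kmers) if kmers else 0.0
            if frac > best_frac:
                best, best_frac = c, frac
        if best is not None and best_frac >= min_shared:
            clusters[best].append(name)
        else:
            representatives.append(kmers)
            clusters.append([name])
    return clusters


if __name__ == "__main__":
    demo = [
        ("r1", "ACGTACGTACGTACGT", [30] * 16),
        # Same transcript as r1 with one low-quality miscalled base at index 11.
        ("r2", "ACGTACGTACGAACGT", [30] * 11 + [5] + [30] * 4),
        ("r3", "TTTTGGGGCCCCAAAA", [30] * 16),
    ]
    print(greedy_cluster(demo))           # quality-aware comparison
    print(greedy_cluster(demo, min_q=0))  # quality values ignored
```

In the demo, the k-mers covering the low-quality base of `r2` are discarded when `min_q=10`, so the read is compared only on its reliable positions; with `min_q=0` the erroneous k-mers count against it. Each read is compared once against the current representatives, which is what keeps a greedy scheme cheap relative to all-versus-all comparison.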

Bibliographic details

  • Conference location: Washington (US)
  • Author affiliations

    Department of Computer Science and Engineering, Pennsylvania State University, State College, USA;

    Department of Computer Science and Engineering, Pennsylvania State University, State College, USA; Department of Biochemistry and Molecular Biology, Pennsylvania State University, State College, USA; Center for Computational Biology and Bioinformatics, Pennsylvania State University, State College, USA;

  • Original format: PDF
