Journal of Computational Biology: A Journal of Computational Molecular Cell Biology

De Novo Clustering of Long-Read Transcriptome Data Using a Greedy, Quality Value-Based Algorithm

Abstract

Long-read sequencing of transcripts with Pacific Biosciences (PacBio) Iso-Seq and Oxford Nanopore Technologies has proven to be central to the study of complex isoform landscapes in many organisms. However, current de novo transcript reconstruction algorithms from long-read data are limited, leaving the potential of these technologies unfulfilled. A common bottleneck is the dearth of scalable and accurate algorithms for clustering long reads according to their gene family of origin. To address this challenge, we develop isONclust, a clustering algorithm that is greedy (to scale) and makes use of quality values (to handle variable error rates). We test isONclust on three simulated and five biological data sets, across a breadth of organisms, technologies, and read depths. Our results demonstrate that isONclust is a substantial improvement over previous approaches, both in terms of overall accuracy and/or scalability to large data sets.
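The abstract names two ideas: a greedy pass (for scalability) and the use of quality values (to cope with variable error rates). The sketch below illustrates how these two ideas can fit together in general. It is not isONclust's actual implementation. The function names, the k-mer sharing criterion, the parameters `k` and `min_shared`, and the choice to order reads by mean error probability are all assumptions made for illustration. The only standard fact used is the Phred relation, where a quality value Q corresponds to an error probability of 10^(-Q/10), with the common ASCII offset of 33.

```python
from typing import List, Set, Tuple

Read = Tuple[str, str, str]  # (name, sequence, quality string); hypothetical layout


def phred_to_error(qual: str, offset: int = 33) -> List[float]:
    """Convert a Phred+33 quality string to per-base error probabilities."""
    return [10 ** (-(ord(c) - offset) / 10) for c in qual]


def kmers(seq: str, k: int) -> Set[str]:
    """Return the set of all k-mers in seq (empty if seq is shorter than k)."""
    return {seq[i:i + k] for i in range(len(seq) - k + 1)}


def greedy_cluster(reads: List[Read], k: int = 5,
                   min_shared: float = 0.5) -> List[List[str]]:
    """Single-pass greedy clustering (illustrative assumption, not isONclust).

    Reads are visited from lowest to highest mean error probability, so the
    first read of each cluster (its representative) tends to be high quality.
    A read joins the first cluster whose representative shares at least
    `min_shared` of the read's k-mers; otherwise it starts a new cluster.
    """
    def mean_error(read: Read) -> float:
        errs = phred_to_error(read[2])
        return sum(errs) / len(errs)

    clusters: List[Tuple[Set[str], List[str]]] = []
    for name, seq, _qual in sorted(reads, key=mean_error):
        read_kmers = kmers(seq, k)
        for rep_kmers, members in clusters:
            if read_kmers and len(read_kmers & rep_kmers) / len(read_kmers) >= min_shared:
                members.append(name)
                break
        else:
            # No existing representative was similar enough.
            clusters.append((read_kmers, [name]))
    return [members for _, members in clusters]


if __name__ == "__main__":
    example = [
        ("r2", "ACGTACGTTAGG", "5" * 12),    # Q20 copy of r1 with one mismatch
        ("r3", "TTTTGGGGCCCCAA", "I" * 14),  # Q40, unrelated sequence
        ("r1", "ACGTACGTTAGC", "I" * 12),    # Q40
    ]
    for cluster in greedy_cluster(example):
        print(cluster)
```

Each read is compared only against cluster representatives, never against every other read, which is what makes a greedy scheme of this kind cheap on large data sets. The price is that the result depends on the visiting order, which is why a quality-based ordering is a natural choice here.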
