Journal of Biosciences

Identifying wrong assemblies in de novo short read primary sequence assembly contigs

Abstract

With the advent of short-read-based genome sequencing approaches, a large number of organisms are being sequenced all over the world. Most of these assemblies are produced with de novo short-read assemblers and related approaches. However, the contigs produced this way are prone to mis-assembly. So far, there is a conspicuous dearth of reliable tools to identify mis-assembled contigs. Mis-assemblies could result from incorrectly deleted or wrongly arranged genomic sequences. In the present work, various factors related to sequence, sequencing and assembling have been assessed for their role in causing mis-assembly, using different genome sequencing data. Finally, some mis-assembly detection tools have been evaluated for their ability to detect wrongly assembled primary contigs; the results suggest considerable scope for improvement in this area. The present work also proposes a simple, novel, unsupervised-learning-based approach to identify mis-assemblies in contigs, which performed reasonably well compared with existing tools for reporting mis-assembled contigs. It was observed that the proposed methodology may work as a complementary system to the existing tools to enhance their accuracy.
