Published in: GigaScience

Common workflow language (CWL)-based software pipeline for de novo genome assembly from long- and short-read data


Abstract

Background: We created an automated pipeline for the de novo assembly of genomes from Pacific Biosciences long-read and Illumina short-read data using the Common Workflow Language (CWL). To evaluate the performance of this pipeline, we assembled the nuclear genomes of the eukaryotes Caenorhabditis elegans (~100 Mb), Drosophila melanogaster (~138 Mb), and Plasmodium falciparum (~23 Mb) directly from publicly accessible nucleotide sequence datasets and assessed the quality of the assemblies against curated reference genomes.

Findings: We showed that the accuracy of assembly depends on sequencing technology and GC content, and we repeatedly achieved assemblies that meet the high standards set by the National Human Genome Research Institute and are therefore applicable to gene prediction and subsequent genomic analyses.

Conclusions: This CWL pipeline overcomes current challenges in achieving repeatability and reproducibility of assembly results and offers a platform for the re-use of the workflow and the integration of diverse datasets. The workflow is publicly available via GitHub (https://github.com/vetscience/Assemblosis) and is currently applicable to the assembly of haploid and diploid genomes of eukaryotes.
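The abstract names GC content as one factor that affects assembly accuracy. As a rough illustration of that quantity only, here is a minimal Python sketch that computes the GC fraction of a sequence, overall and in fixed-size windows. It is not part of the Assemblosis pipeline. The function names `gc_content` and `windowed_gc` are hypothetical, and ignoring ambiguous bases such as N is an assumption made here for simplicity.

```python
def gc_content(sequence: str) -> float:
    """Return the fraction of G and C among unambiguous A/C/G/T bases.

    Ambiguous symbols (for example N) are ignored; this is an assumption
    of this sketch, not a convention taken from the paper.
    """
    seq = sequence.upper()
    gc = seq.count("G") + seq.count("C")
    at = seq.count("A") + seq.count("T")
    total = gc + at
    if total == 0:
        return 0.0  # no unambiguous bases to measure
    return gc / total


def windowed_gc(sequence: str, window: int) -> list[float]:
    """GC fraction of consecutive non-overlapping windows.

    The final window may be shorter than `window`.
    """
    if window <= 0:
        raise ValueError("window must be positive")
    return [
        gc_content(sequence[start:start + window])
        for start in range(0, len(sequence), window)
    ]
```

Comparing windowed values across a genome is one simple way to see how strongly base composition is skewed in different regions, which is relevant when interpreting assembly quality for genomes of different composition.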
