Effective machine-learning assembly for next-generation amplicon sequencing with very low coverage

Louis Ranjard; Thomas K. F. Wong; Allen G. Rodrigo

Abstract

BACKGROUND: In short-read DNA sequencing experiments, the read coverage is a key parameter to successfully assemble the reads and reconstruct the sequence of the input DNA. When coverage is very low, the original sequence reconstruction from the reads can be difficult because of the occurrence of uncovered gaps. Reference guided assembly can then improve these assemblies. However, when the available reference is phylogenetically distant from the sequencing reads, the mapping rate of the reads can be extremely low. Some recent improvements in read mapping approaches aim at modifying the reference according to the reads dynamically. Such approaches can significantly improve the alignment rate of the reads onto distant references but the processing of insertions and deletions remains challenging.

RESULTS: Here, we introduce a new algorithm to update the reference sequence according to previously aligned reads. Substitutions, insertions and deletions are performed in the reference sequence dynamically. We evaluate this approach to assemble a western-grey kangaroo mitochondrial amplicon. Our results show that more reads can be aligned and that this method produces assemblies of length comparable to the truth while limiting error rate when classic approaches fail to recover the correct length. Finally, we discuss how the core algorithm of this method could be improved and combined with other approaches to analyse larger genomic sequences.

CONCLUSIONS: We introduced an algorithm to perform dynamic alignment of reads on a distant reference. We showed that such approach can improve the reconstruction of an amplicon compared to classically used bioinformatic pipelines. Although not portable to genomic scale in the current form, we suggested several improvements to be investigated to make this method more flexible and allow dynamic alignment to be used for large genome assemblies.
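The idea of updating a reference from previously aligned reads can be illustrated with a toy Python sketch. This is not the authors' algorithm: it assumes a plain semi-global dynamic-programming alignment with arbitrary scores, and the function names `align` and `update_reference` are hypothetical. Each read is aligned to the current reference, and the substitutions, insertions and deletions it implies are applied before the next read is processed.

```python
from collections import Counter


def align(read, ref, match=1, mismatch=-1, gap=-2):
    """Semi-global alignment: the whole read against the best window of ref.

    Returns (start, end, ops), where ref[start:end] is the aligned window and
    ops is a list of (kind, read_base) with kind in match/sub/ins/del.
    Scores are illustrative assumptions, not values from the paper.
    """
    n, m = len(read), len(ref)
    # score[i][j]: best score for read[:i] aligned so that it ends at ref[:j].
    # Row 0 stays at 0, so reference bases before the read are free.
    score = [[0] * (m + 1) for _ in range(n + 1)]
    for i in range(1, n + 1):
        score[i][0] = i * gap
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            s = match if read[i - 1] == ref[j - 1] else mismatch
            score[i][j] = max(score[i - 1][j - 1] + s,
                              score[i - 1][j] + gap,
                              score[i][j - 1] + gap)
    # Reference bases after the read are free too: pick the best end column.
    end = max(range(m + 1), key=lambda j: score[n][j])
    ops, i, j = [], n, end
    while i > 0:  # trace back until the whole read is consumed
        same = j > 0 and read[i - 1] == ref[j - 1]
        s = match if same else mismatch
        if j > 0 and score[i][j] == score[i - 1][j - 1] + s:
            ops.append(("match" if same else "sub", read[i - 1]))
            i, j = i - 1, j - 1
        elif score[i][j] == score[i - 1][j] + gap:
            ops.append(("ins", read[i - 1]))  # base present only in the read
            i -= 1
        else:
            ops.append(("del", None))  # reference base absent from the read
            j -= 1
    ops.reverse()
    return j, end, ops


def update_reference(ref, read):
    """Apply every edit implied by the read to the aligned reference window."""
    start, end, ops = align(read, ref)
    # Matches, substitutions and insertions contribute the read base;
    # deletions drop the reference base.
    window = "".join(base for kind, base in ops if kind != "del")
    counts = Counter(kind for kind, _ in ops)
    return ref[:start] + window + ref[end:], counts


# Usage: the reference drifts towards the reads as they are processed in turn.
reference = "ACGTTGCAAGCT"
for read in ["TTGGAAG", "GGAAGGCT"]:
    reference, edits = update_reference(reference, read)
    print(read, dict(edits), reference)
```

Applying every edit from a single read amounts to splicing that read into its aligned window, so a sequencing error in one read is copied straight into the reference. A practical method would presumably weigh evidence across several reads before committing an edit; the sketch leaves that out to keep the three edit types visible.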


Bibliographic details

Source
BMC Bioinformatics | 2019, issue 1 | 12 pages
Authors
Louis Ranjard; Thomas K. F. Wong; Allen G. Rodrigo
Original format: PDF
Keywords
Assembly; Amplicon; Machine learning; Western-grey kangaroo; Mitochondrion

Similar literature

1. Next-generation sequencing of custom amplicons to improve coverage of HaloPlex multigene panels [J]. Coonrod Emily M., Durtschi Jacob D., Webb Chad VanSant. BioTechniques. 2014, issue 4
2. Fast and cost-effective single nucleotide polymorphism (SNP) detection in the absence of a reference genome using semideep next-generation Random Amplicon Sequencing (RAMseq) [J]. Bayerl Helmut, Kraus Robert H. S., Nowak Carsten. Molecular Ecology Resources. 2018, issue 1
3. Amplicon-based next-generation sequencing: an effective approach for the molecular diagnosis of epidermolysis bullosa [J]. Tenedini E., Artuso L., Bernardis I. British Journal of Dermatology. 2015, issue 3
4. A Streamlined Library Preparation Workflow for Single Molecule Real Time Sequencing (SMRT) of HIV-1 Amplicons Ranging from 1kb to 8.9kb [C]. Ashley Hayes, Duylinh Nguyen, Monica Herrera. International Molecular Medicine Tri-Conference. 2016
5. Leveraging Third Generation Sequencing and Novel Sequence Analysis Algorithms for Rapid and Efficient Amplicon-Based Detection of Foodborne Pathogens [D]. Futral, Alexandra N. 2020
6. Correction to: Effective machine-learning assembly for next-generation amplicon sequencing with very low coverage [O]. Louis Ranjard, Thomas K. F. Wong, Allen G. Rodrigo. 2020
7. Effective Machine-Learning Assembly For Next-Generation Sequencing With Very Low Coverage [O]. Louis Ranjard, Thomas K. F. Wong, Allen G. Rodrigo. 2018
