Pre- and post-sequencing recommendations for functional annotation of human fecal metagenomes

Abstract

Background: Shotgun metagenomes are often assembled prior to annotation of genes, which biases the functional capacity of a community towards its most abundant members. For an unbiased assessment of community function, short reads need to be mapped directly to a gene or protein database. The ability to detect genes in short read sequences depends on pre- and post-sequencing decisions. The objective of the current study was to determine how library size selection, read length and format, protein database, e-value threshold, and sequencing depth impact gene-centric analysis of human fecal microbiomes when using DIAMOND, an alignment tool that is up to 20,000 times faster than BLASTX.

Results: Using metagenomes simulated from a database of experimentally verified protein sequences, we find that read length, e-value threshold, and the choice of protein database dramatically impact detection of a known target, with the best performance achieved with longer reads, stricter e-value thresholds, and a custom database. Using publicly available metagenomes, we evaluated library size selection, paired end read strategy, and sequencing depth. Longer read lengths were achievable by merging paired ends when the sequencing library was size-selected to enable overlaps. When paired ends could not be merged, a congruent strategy in which both ends are independently mapped was acceptable. Sequencing depths of 5 million merged reads minimized the error of abundance estimates of specific target genes, including an antimicrobial resistance gene.

Conclusions: Shotgun metagenomes of DNA extracted from human fecal samples and sequenced on the Illumina platform should be size-selected to enable merging of paired end reads, and should be sequenced in the PE150 format with a minimum sequencing depth of 5 million mergeable reads to enable detection of specific target genes. Given that the merged reads are expected to be 180-250 bp in length, the appropriate e-value threshold for DIAMOND would then need to be stricter than the default. Accurate and interpretable results for specific hypotheses will be best obtained using small databases customized for the research question.
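
To make the read-mapping step concrete, the following is a minimal Python sketch of how merged reads might be searched against a custom protein database with a stricter e-value cutoff, and how the resulting hits might be tallied per gene. It is an illustration under stated assumptions, not the authors' pipeline: the file names, the helper functions, and the cutoff of 1e-10 are hypothetical placeholders (the abstract says only that the threshold should be stricter than the default). The sketch assumes DIAMOND's BLAST-style tabular output (`--outfmt 6`) with its standard twelve columns.

```python
import csv
import io

# Standard BLAST-style tabular columns, as written by `diamond blastx --outfmt 6`.
COLUMNS = ["qseqid", "sseqid", "pident", "length", "mismatch", "gapopen",
           "qstart", "qend", "sstart", "send", "evalue", "bitscore"]


def diamond_blastx_command(db, reads, out, evalue):
    """Build (but do not run) an argument list for a DIAMOND blastx search.

    All paths are hypothetical; `evalue` is the maximum e-value to report.
    """
    return ["diamond", "blastx",
            "--db", db,
            "--query", reads,
            "--out", out,
            "--outfmt", "6",
            "--evalue", str(evalue)]


def best_hits(tabular_text, max_evalue):
    """Return, for each read, its lowest-e-value hit at or below max_evalue."""
    best = {}
    for row in csv.reader(io.StringIO(tabular_text), delimiter="\t"):
        if not row:
            continue  # skip blank lines
        hit = dict(zip(COLUMNS, row))
        evalue = float(hit["evalue"])
        if evalue > max_evalue:
            continue  # too weak under the chosen cutoff
        read_id = hit["qseqid"]
        if read_id not in best or evalue < float(best[read_id]["evalue"]):
            best[read_id] = hit
    return best


def gene_counts(hits):
    """Count how many reads were assigned to each database sequence."""
    counts = {}
    for hit in hits.values():
        counts[hit["sseqid"]] = counts.get(hit["sseqid"], 0) + 1
    return counts


# Hypothetical cutoff chosen only for illustration.
ASSUMED_EVALUE = 1e-10

command = diamond_blastx_command("custom_db", "merged_reads.fasta",
                                 "hits.tsv", ASSUMED_EVALUE)

# Made-up example rows standing in for the contents of hits.tsv.
example = (
    "r1\tgeneA\t95.0\t60\t3\t0\t1\t180\t1\t60\t1e-20\t100\n"
    "r1\tgeneB\t70.0\t40\t12\t0\t1\t120\t5\t44\t1e-3\t40\n"
    "r2\tgeneB\t72.0\t45\t12\t0\t1\t135\t5\t49\t1e-4\t50\n"
)
counts = gene_counts(best_hits(example, ASSUMED_EVALUE))
```

With the made-up rows above, only the first hit passes the assumed cutoff, so read `r2` is left unassigned. Dividing such counts by the number of mapped reads would give one possible relative-abundance estimate, though the study's exact normalization is not described in the abstract.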