Evolutionary Applications

Optimization and performance testing of a sequence processing pipeline applied to detection of nonindigenous species



Abstract

Genetic taxonomic assignment can be more sensitive than morphological taxonomic assignment, particularly for small, cryptic or rare species. Sequence processing is essential to taxonomic assignment, but it can also introduce errors because optimal parameters are not known a priori. Here, we explored how sequence processing parameters influence taxonomic assignment of 18S sequences from bulk zooplankton samples produced by 454 pyrosequencing. We optimized a sequence processing pipeline for two common research goals, estimation of species richness and early detection of aquatic invasive species (AIS), and then tested the optimal models' performance through simulations. We tested 1,050 parameter sets on 18S sequences from 20 AIS to determine optimal parameters for each research goal. We tested the optimized pipelines' performance (detectability and sensitivity) by computationally inoculating sequences of 20 AIS into ten bulk zooplankton samples from ports across Canada. We found that optimal parameter selection generally depends on the research goal. However, regardless of research goal, we found that metazoan 18S sequences produced by 454 pyrosequencing should be trimmed to 375–400 bp and that sequence quality filtering should be relaxed (1.5 ≤ maximum expected error ≤ 3.0, Phred score = 10). Clustering and denoising were only viable for estimating species richness, because these processing steps made some species undetectable at low sequence abundances, which makes them unsuitable for early detection of AIS. With parameter sets optimized for early detection of AIS, 90% of AIS were detected with fewer than 11 target sequences, regardless of whether clustering or denoising was used. Despite developments in next‐generation sequencing, sequence processing remains an important issue owing to difficulties in balancing false‐positive and false‐negative errors in metabarcoding data.
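To make the "maximum expected error" threshold concrete, the following is a minimal Python sketch, not the authors' pipeline. It relies on the standard definition of a read's expected errors as the sum of per-base error probabilities, where a Phred score Q corresponds to an error probability of 10^(-Q/10). The function names, the default values, and the choice to discard reads shorter than the trim length are assumptions made for illustration. The sketch does not model how the Phred score of 10 was applied.

```python
def expected_errors(phred_scores):
    """Sum of per-base error probabilities, using p = 10 ** (-Q / 10)."""
    return sum(10 ** (-q / 10) for q in phred_scores)


def trim_and_filter(phred_scores, trim_length=400, max_ee=3.0):
    """Truncate a read to trim_length bases, then apply an expected-error filter.

    Returns the truncated quality scores if the read passes, otherwise None.
    Assumption (not from the abstract): reads shorter than trim_length
    are discarded rather than kept at their original length.
    """
    if len(phred_scores) < trim_length:
        return None
    trimmed = phred_scores[:trim_length]
    if expected_errors(trimmed) <= max_ee:
        return trimmed
    return None


if __name__ == "__main__":
    # Hypothetical reads with uniform quality, for illustration only.
    good_read = [30] * 450   # 400 bases at Q30: 400 * 0.001 = 0.4 expected errors
    noisy_read = [20] * 450  # 400 bases at Q20: 400 * 0.01 = 4.0 expected errors
    print(trim_and_filter(good_read) is not None)   # passes a threshold of 3.0
    print(trim_and_filter(noisy_read) is not None)  # fails a threshold of 3.0
```

A relaxed threshold such as 3.0 retains more reads than a strict one such as 0.5, which is the trade-off between false negatives and false positives that the abstract describes.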
