Journal: FEMS Microbiology Letters

V-REVCOMP: Automated high-throughput detection of reverse complementary 16S rRNA gene sequences in large environmental and taxonomic datasets

Abstract

Reverse complementary DNA sequences (sequences that are inadvertently given backwards with all purines and pyrimidines transposed) can affect sequence analysis detrimentally unless taken into account. We present an open-source, high-throughput software tool, v-revcomp, to detect and reorient reverse complementary entries of the small-subunit rRNA (16S) gene from sequencing datasets, particularly from environmental sources. The software supports sequence lengths ranging from full length down to the short reads that are characteristic of next-generation sequencing technologies. We evaluated the reliability of v-revcomp by screening all 406,781 16S sequences deposited in release 102 of the curated SILVA database and demonstrated that the tool has a detection accuracy of virtually 100%. We subsequently used v-revcomp to analyse 1,171,646 16S sequences deposited in the International Nucleotide Sequence Databases and found that about 1% of these user-submitted sequences were reverse complementary. In addition, a nontrivial proportion of the entries were otherwise anomalous, including reverse complementary chimeras, sequences associated with wrong taxa, nonribosomal genes, sequences of poor quality or otherwise erroneous sequences without a reasonable match to any other entry in the database. Thus, v-revcomp is highly efficient in detecting and reorienting reverse complementary 16S sequences of almost any length and can be used to detect various sequence anomalies.
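
To make the notion of "reverse complementary" concrete, the following is a minimal Python sketch of what reorienting a sequence involves. It is not the v-revcomp algorithm, which the abstract does not describe; the k-mer comparison against a known reference, the function names (`reverse_complement`, `kmers`, `orient`), and the default `k=8` are all assumptions chosen purely for illustration.

```python
# Illustrative sketch only: NOT the method used by v-revcomp.
# Assumption: orientation is guessed by counting k-mers shared with a
# reference sequence that is known to be in the correct orientation.

# Map each base to its complement; N (unknown base) maps to itself.
_COMPLEMENT = str.maketrans("ACGTNacgtn", "TGCANtgcan")


def reverse_complement(seq: str) -> str:
    """Complement every base, then reverse the whole sequence."""
    return seq.translate(_COMPLEMENT)[::-1]


def kmers(seq: str, k: int) -> set:
    """Return the set of all length-k substrings of seq."""
    return {seq[i:i + k] for i in range(len(seq) - k + 1)}


def orient(query: str, reference: str, k: int = 8):
    """Return (sequence, was_flipped).

    The query is flipped only if its reverse complement shares more
    k-mers with the reference than the query itself does.
    """
    ref_kmers = kmers(reference.upper(), k)
    forward = query.upper()
    reverse = reverse_complement(forward)
    forward_hits = len(kmers(forward, k) & ref_kmers)
    reverse_hits = len(kmers(reverse, k) & ref_kmers)
    if reverse_hits > forward_hits:
        return reverse, True
    return forward, False
```

Calling `orient(query, reference)` on a read that was submitted backwards would return the reoriented read together with `True`. A real screening tool would need a far more robust similarity model than exact k-mer matching, especially for the short and divergent reads the abstract mentions.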