Journal: BMC Bioinformatics

PTIGS-IdIt, a system for species identification by DNA sequences of the psbA-trnH intergenic spacer region



Abstract

Background: DNA barcoding, which uses a short DNA sequence to identify species, has a wide range of applications. To date, a universal DNA barcode marker for plants remains elusive. The rbcL and matK regions have been proposed as the “core barcode” for plants, and the ITS2 and psbA-trnH intergenic spacer (PTIGS) regions were later added as supplemental barcodes. The use of the PTIGS region as a supplemental barcode has been limited by the lack of computational tools that can handle the significant insertions and deletions in PTIGS sequences. Here, we compared the most commonly used alignment-based and alignment-free methods and developed a web server that allows biologists to carry out PTIGS-based DNA barcoding analyses.

Results: First, we compared several alignment-based methods (BLAST and methods that calculate P distance and Edit distance), the alignment-free Di-Nucleotide Frequency Profile (DNFP) method, and their combinations. We found that the DNFP and Edit-distance methods increased the identification success rate to ~80%, 20% higher than the most commonly used BLAST method. Second, the combined methods showed better overall success rates and performance. Last, we developed a web server that allows (1) retrieving various sub-regions and the consensus sequences of PTIGS, (2) annotating novel PTIGS sequences, (3) determining species identity from PTIGS sequences using eight methods, and (4) examining the identification efficiency and performance of the eight methods for various taxonomic groups.

Conclusions: The Edit distance and DNFP methods have the highest discrimination power. Hybrid methods can be used to achieve significant improvements in performance. These methods can be extended to applications that use the core barcodes and the other supplemental DNA barcode, ITS2. To our knowledge, the web server developed here is the only one that allows species determination based on PTIGS sequences. The web server can be accessed at http://psba-trnh-plantidit.dnsalias.org.
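
To make the two best-performing approaches named in the abstract more concrete, the following is a minimal sketch of nearest-reference identification using an edit distance and a dinucleotide frequency profile. The abstract does not define DNFP precisely, so this sketch assumes it is the vector of relative frequencies of the 16 overlapping dinucleotides, compared by Euclidean distance. The function names (`dnfp`, `dnfp_distance`, `edit_distance`, `identify`) and the toy sequences are hypothetical and are not taken from the PTIGS-IdIt server.

```python
from collections import Counter
from itertools import product
import math

# The 16 possible dinucleotides over the DNA alphabet, in a fixed order.
DINUCLEOTIDES = ["".join(p) for p in product("ACGT", repeat=2)]


def dnfp(seq):
    """Relative frequencies of the 16 overlapping dinucleotides in seq.

    Assumption: this is one plausible reading of a "Di-Nucleotide
    Frequency Profile"; the abstract does not give the exact definition.
    """
    seq = seq.upper()
    counts = Counter(seq[i:i + 2] for i in range(len(seq) - 1))
    # Ignore dinucleotides that contain ambiguous bases such as N.
    total = sum(counts[d] for d in DINUCLEOTIDES)
    if total == 0:
        return [0.0] * len(DINUCLEOTIDES)
    return [counts[d] / total for d in DINUCLEOTIDES]


def dnfp_distance(a, b):
    """Euclidean distance between two profiles (assumed metric)."""
    return math.dist(dnfp(a), dnfp(b))


def edit_distance(a, b):
    """Levenshtein distance with unit-cost insertion, deletion, substitution."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, 1):
        curr = [i]
        for j, cb in enumerate(b, 1):
            curr.append(min(prev[j] + 1,                # deletion from a
                            curr[j - 1] + 1,            # insertion into a
                            prev[j - 1] + (ca != cb)))  # substitution or match
        prev = curr
    return prev[-1]


def identify(query, references, distance):
    """Return the reference species whose sequence is closest to query.

    references maps a species name to one reference sequence.
    """
    return min(references, key=lambda sp: distance(query, references[sp]))


# Toy, made-up sequences for illustration only.
references = {"species_a": "ACGTACGTAC", "species_b": "TTTTGGGGCC"}
query = "ACGTACGAC"
best_by_edit = identify(query, references, edit_distance)
best_by_dnfp = identify(query, references, dnfp_distance)
```

Neither distance needs a multiple sequence alignment, so both tolerate the insertions and deletions that the abstract identifies as the main obstacle for PTIGS. A hybrid method in the spirit of the abstract could combine the two scores, but the abstract does not say how the authors combined them, so that step is left out here.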
