Background Adjusted Alignment-Free Dissimilarity Measures Improve the Detection of Horizontal Gene Transfer

Abstract

Horizontal gene transfer (HGT) plays an important role in the evolution of microbial organisms, including bacteria. Alignment-free methods based on the compositional information of a single genome have been used to detect HGT. Currently, Manhattan and Euclidean distances based on tetranucleotide frequencies are the most commonly used alignment-free dissimilarity measures for this task. By testing on simulated bacterial sequences and on real data sets with known horizontally transferred genomic regions, we found that more advanced alignment-free dissimilarity measures that take into account a Markov background model of the sequences, such as CVTree and d2*, detect HGT with significantly improved performance. We also studied how different factors, namely the evolutionary distance between host and donor sequences, the size of the sliding window, and the host genome composition, influence the performance of alignment-free methods in detecting HGT. Our study showed that alignment-free methods can predict HGT accurately when the host and donor genomes differ at the taxonomic order level. Among all methods, CVTree (word length 3), d2* (word length 3, Markov order 1), and d2* (word length 4, Markov order 1) outperform the others in terms of highest F1-score and robustness to these factors.
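The sliding-window idea behind the baseline measure mentioned in the abstract can be illustrated with a minimal sketch. It scores each window by the Manhattan distance between its tetranucleotide frequencies and those of the whole genome, so windows with atypical composition stand out. This is only the simple baseline, not CVTree or d2*, and it is not the authors' implementation. The function names, the window and step sizes, and the synthetic AT-rich host with a GC-rich insert are all assumptions made for illustration.

```python
import random
from itertools import product


def kmer_freqs(seq, k=4):
    """Relative frequencies of all 4**k words over ACGT, in a fixed order."""
    words = ["".join(p) for p in product("ACGT", repeat=k)]
    counts = dict.fromkeys(words, 0)
    total = 0
    for i in range(len(seq) - k + 1):
        word = seq[i:i + k]
        if word in counts:  # skip words containing N or other symbols
            counts[word] += 1
            total += 1
    if total == 0:
        return [0.0] * len(words)
    return [counts[w] / total for w in words]


def manhattan(p, q):
    """Manhattan (L1) distance between two frequency vectors."""
    return sum(abs(a - b) for a, b in zip(p, q))


def window_scores(genome, window=5000, step=1000, k=4):
    """Score each sliding window against the whole-genome composition."""
    genome_freqs = kmer_freqs(genome, k)
    scores = []
    for start in range(0, len(genome) - window + 1, step):
        win_freqs = kmer_freqs(genome[start:start + window], k)
        scores.append((start, manhattan(win_freqs, genome_freqs)))
    return scores


def make_demo_genome(seed=0):
    """Hypothetical test data: AT-rich host with a GC-rich insert at 30000."""
    rng = random.Random(seed)
    host = "".join(rng.choices("ACGT", weights=[35, 15, 15, 35], k=60000))
    donor = "".join(rng.choices("ACGT", weights=[15, 35, 35, 15], k=5000))
    return host[:30000] + donor + host[30000:]


if __name__ == "__main__":
    scores = window_scores(make_demo_genome())
    best_start, best_score = max(scores, key=lambda s: s[1])
    print(best_start, round(best_score, 3))
```

In practice a threshold on the window score (or a ranking of windows) would decide which regions are reported as putative transfers. The background-adjusted measures studied in the paper replace the raw frequency comparison with statistics centered on a Markov model of each sequence.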
