MGEScan-non-LTR: computational identification and classification of autonomous non-LTR retrotransposons in eukaryotic genomes

Abstract

Computational methods for genome-wide identification of mobile genetic elements (MGEs) have become increasingly necessary for both genome annotation and evolutionary studies. Non-long terminal repeat (non-LTR) retrotransposons are a class of MGEs that have been found in most eukaryotic genomes, sometimes in extremely high numbers. In this article, we present a computational tool, MGEScan-non-LTR, for the identification of non-LTR retrotransposons in genomic sequences, following a computational approach inspired by a generalized hidden Markov model (GHMM). Three different states represent two different protein domains and inter-domain linker regions encoded in the non-LTR retrotransposons, and their scores are evaluated by using profile hidden Markov models (for protein domains) and Gaussian Bayes classifiers (for linker regions), respectively. In order to classify the non-LTR retrotransposons into one of the 12 previously characterized clades using the same model, we defined separate states for different clades. MGEScan-non-LTR was tested on the genome sequences of four eukaryotic organisms, Drosophila melanogaster, Daphnia pulex, Ciona intestinalis and Strongylocentrotus purpuratus. For the D. melanogaster genome, MGEScan-non-LTR found all known ‘full-length’ elements and simultaneously classified them into the clades CR1, I, Jockey, LOA and R1. Notably, for the D. pulex genome, in which no non-LTR retrotransposon has been annotated, MGEScan-non-LTR found a significantly larger number of elements than did RepeatMasker, using the current version of the RepBase Update library. We also identified novel elements in the other two genomes, which have only been partially studied for non-LTR retrotransposons.
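The scoring idea in the abstract (profile-HMM scores for the two protein domains, a Gaussian score for the linker, and separate states per clade) can be illustrated with a deliberately simplified sketch. It is not the MGEScan-non-LTR implementation: it does not run a GHMM over a sequence, it only sums three per-clade log-scores and picks the best clade. The function names, the use of linker length as the Gaussian feature, and all numbers are assumptions made for illustration; the domain scores are stand-ins for profile-HMM output that the sketch does not compute.

```python
import math


def gaussian_log_density(x, mean, std):
    """Log of the normal density N(x; mean, std^2)."""
    return -0.5 * math.log(2 * math.pi * std ** 2) - (x - mean) ** 2 / (2 * std ** 2)


def classify_element(domain1_scores, domain2_scores, linker_length, linker_models):
    """Return (best_clade, totals) by summing per-clade log-scores.

    domain1_scores / domain2_scores: clade -> log-score (hypothetical
    stand-ins for profile-HMM scores of the two protein domains).
    linker_models: clade -> (mean, std) of a hypothetical linker feature.
    """
    totals = {}
    for clade, (mean, std) in linker_models.items():
        totals[clade] = (
            domain1_scores[clade]
            + gaussian_log_density(linker_length, mean, std)
            + domain2_scores[clade]
        )
    best = max(totals, key=totals.get)
    return best, totals


# Made-up parameters and scores, for illustration only.
linker_models = {"CR1": (900.0, 100.0), "Jockey": (1500.0, 150.0)}
domain1 = {"CR1": -10.0, "Jockey": -12.0}
domain2 = {"CR1": -8.0, "Jockey": -9.0}

best, totals = classify_element(domain1, domain2, 950.0, linker_models)
print(best)
```

With these invented inputs the candidate is assigned to the clade whose domain scores and linker model jointly fit best; a real tool would additionally have to locate the domains in the genomic sequence, which this sketch leaves out.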

Bibliographic details

  • Authors: Rho, Mina; Tang, Haixu
  • Year: 2009
  • Original format: PDF
  • Language: English
