Journal: Mobile DNA

The UCSC repeat browser allows discovery and visualization of evolutionary conflict across repeat families

Abstract

Nearly half the human genome consists of repeat elements, most of which are retrotransposons, and many of which play important biological roles. However, repeat elements pose several unique challenges to current bioinformatic analyses and visualization tools, because short repeat sequences can map to multiple genomic loci, resulting in their misclassification and misinterpretation. In fact, sequence data mapping to repeat elements are often discarded from analysis pipelines. Therefore, there is a continued need for standardized tools and techniques to interpret genomic data of repeats.

We present the UCSC Repeat Browser, which consists of a complete set of human repeat reference sequences derived from annotations made by the commonly used program RepeatMasker. The UCSC Repeat Browser also provides an alignment from the human genome to these references, uses it to map the standard human genome annotation tracks, and presents all of them as a comprehensive interface to facilitate work with repetitive elements. It also provides processed tracks of multiple publicly available datasets of particular interest to the repeat community, including ChIP-seq datasets for KRAB Zinc Finger Proteins (KZNFs), a family of proteins known to bind and repress certain classes of repeats. We used the UCSC Repeat Browser in combination with these datasets, as well as RepeatMasker annotations in several non-human primates, to trace the independent trajectories of species-specific evolutionary battles between LINE-1 retroelements and their repressors. Furthermore, we document at https://repeatbrowser.ucsc.edu how researchers can map their own human genome annotations to these reference repeat sequences.

The UCSC Repeat Browser allows easy and intuitive visualization of genomic data on consensus repeat elements, circumventing the problem of multi-mapping, in which sequencing reads of repeat elements map to multiple locations on the human genome. With a reference consensus in place, multiple datasets and annotation tracks can easily be overlaid to reveal complex evolutionary histories of repeats in a single interactive window. Specifically, we use this approach to retrace the history of several primate-specific LINE-1 families across apes, and discover several species-specific routes of evolution that correlate with the emergence and binding of KZNFs.
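
The idea of mapping genome annotations onto a consensus repeat can be illustrated with a minimal sketch. This is not the Repeat Browser's actual pipeline: the `RepeatCopy` record, the `lift_to_consensus` and `consensus_coverage` helpers, and the assumption that each genomic copy aligns to its consensus without gaps are all hypothetical simplifications chosen for illustration. The sketch shows how features that sit at different genomic loci pile up on shared consensus coordinates, which is what sidesteps the multi-mapping problem.

```python
from dataclasses import dataclass
from typing import List, Optional, Tuple


@dataclass(frozen=True)
class RepeatCopy:
    """One genomic copy of a repeat (hypothetical record).

    Assumes a gap-free alignment to the consensus, which real
    repeat alignments generally do not satisfy.
    """
    chrom: str
    g_start: int  # 0-based inclusive genomic start
    g_end: int    # 0-based exclusive genomic end
    strand: str   # '+' or '-' relative to the consensus
    c_start: int  # 0-based consensus position where the copy begins


def lift_to_consensus(copy: RepeatCopy, chrom: str, start: int,
                      end: int) -> Optional[Tuple[int, int]]:
    """Map a genomic interval onto consensus coordinates, or None if no overlap."""
    if chrom != copy.chrom:
        return None
    lo, hi = max(start, copy.g_start), min(end, copy.g_end)
    if lo >= hi:
        return None
    if copy.strand == "+":
        return (copy.c_start + lo - copy.g_start,
                copy.c_start + hi - copy.g_start)
    # Minus strand: the genomic end of the copy matches the consensus start.
    return (copy.c_start + copy.g_end - hi,
            copy.c_start + copy.g_end - lo)


def consensus_coverage(copies: List[RepeatCopy],
                       features: List[Tuple[str, int, int]],
                       consensus_len: int) -> List[int]:
    """Count, per consensus position, how many lifted features cover it."""
    cov = [0] * consensus_len
    for chrom, start, end in features:
        for copy in copies:
            hit = lift_to_consensus(copy, chrom, start, end)
            if hit is None:
                continue
            for pos in range(hit[0], min(hit[1], consensus_len)):
                cov[pos] += 1
    return cov


# Two copies of the same (hypothetical) 100 bp consensus at different loci.
copies = [
    RepeatCopy("chr1", 100, 200, "+", 0),
    RepeatCopy("chr2", 500, 560, "-", 40),
]
# Two features, e.g. ChIP-seq peaks, given as (chrom, start, end).
features = [("chr1", 150, 250), ("chr2", 500, 510)]
coverage = consensus_coverage(copies, features, consensus_len=100)
```

In this toy input, the two features come from different chromosomes, yet both land on the last ten bases of the consensus, so their signal accumulates in one place. A realistic version would need to handle gapped alignments, for example through chain files of the kind used by coordinate-lifting tools.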