Repeat or not repeat?—Statistical validation of tandem repeat prediction in genomic sequences

Abstract

Tandem repeats (TRs) represent one of the most prevalent features of genomic sequences. Due to their abundance and functional significance, a plethora of detection tools has been devised over the last two decades. Despite the longstanding interest, TR detection is still not resolved. Our large-scale tests reveal that current detectors produce different, often nonoverlapping inferences, reflecting characteristics of the underlying algorithms rather than the true distribution of TRs in genomic data. Our simulations show that the power of detecting TRs depends on the degree of their divergence and on repeat characteristics such as the length of the minimal repeat unit and the number of units in tandem. To reconcile the diverse predictions of current algorithms, we propose and evaluate several statistical criteria for measuring the quality of predicted repeat units. In particular, we propose a model-based phylogenetic classifier, entailing a maximum-likelihood estimation of the repeat divergence. Applied in conjunction with state-of-the-art detectors, our statistical classification scheme for inferred repeats allows false-positive predictions to be filtered out. Since different algorithms appear to specialize in predicting TRs with certain properties, we advise applying multiple detectors with subsequent filtering to obtain the most complete set of genuine repeats.
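The abstract does not spell out the substitution model or the form of the classifier, so the following is only an illustrative sketch of what a maximum-likelihood estimate of divergence can look like in the simplest case: two aligned, gap-free repeat units under the Jukes-Cantor (JC69) model. The function names `jc69_log_likelihood` and `jc69_ml_divergence` and the example sequences are hypothetical, and this pairwise calculation is an assumption made for illustration, not the phylogenetic classifier proposed in the paper.

```python
import math


def jc69_log_likelihood(d, n_sites, n_diff):
    """Log-likelihood of distance d (substitutions per site, d > 0 if n_diff > 0)
    for two aligned units with n_diff mismatches out of n_sites, under JC69."""
    e = math.exp(-4.0 * d / 3.0)
    p_same = 0.25 + 0.75 * e   # probability a site shows the same base
    p_diff = 0.25 - 0.25 * e   # probability of one specific different base
    ll = (n_sites - n_diff) * math.log(0.25 * p_same)
    if n_diff:
        ll += n_diff * math.log(0.25 * p_diff)
    return ll


def jc69_ml_divergence(unit_a, unit_b):
    """Closed-form ML estimate of the JC69 distance between two aligned units.

    Assumes equal-length, gap-free sequences (an illustrative simplification).
    Returns math.inf when the units are saturated (>= 75% mismatches).
    """
    if len(unit_a) != len(unit_b) or not unit_a:
        raise ValueError("units must be non-empty and of equal length")
    n_sites = len(unit_a)
    n_diff = sum(a != b for a, b in zip(unit_a, unit_b))
    p = n_diff / n_sites
    if p >= 0.75:
        return math.inf
    return -0.75 * math.log(1.0 - 4.0 * p / 3.0)


if __name__ == "__main__":
    # Two hypothetical 10-bp repeat units differing at one position.
    d_hat = jc69_ml_divergence("ACGTACGTAC", "ACGTTCGTAC")
    print(f"estimated divergence: {d_hat:.4f} substitutions per site")
```

A low estimated divergence is what one would expect from adjacent units of a genuine repeat, while values approaching saturation are what unrelated sequence tends to produce; turning that into a filter needs a null model and a threshold, which this sketch does not attempt.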


Bibliographic details

Journal: Nucleic Acids Research
Authors: Elke Schaper; Andrey V. Kajava; Alain Hauser; Maria Anisimova
Year (volume), issue: 2012 (40), 20
Pages: 10005–10017
Total pages: 13
Original format: PDF
CLC classification: Molecular biology

Similar literature

1. Repeat or not repeat?-Statistical validation of tandem repeat prediction in genomic sequences [J]. Schaper Elke, Kajava Andrey V., Hauser Alain. Nucleic Acids Research. 2012, No. 20
2. Repeat or not repeat?—Statistical validation of tandem repeat prediction in genomic sequences [J]. Alain Hauser, Andrey V. Kajava, Elke Schaper. Nucleic Acids Research. 2012, No. 20
3. Statistical Approaches to Detecting and Analyzing Tandem Repeats in Genomic Sequences [J]. Anisimova Maria, Pečerska Julija, Schaper Elke. Frontiers in Bioengineering and Biotechnology. 2015, No. 2
4. Global repeat map algorithm (GRM) reveals differences in alpha satellite number of tandem and higher order repeats (HORs) in human, Neanderthal and chimpanzee genomes – novel tandem repeat database [C]. I. Vlahović, M. Glunčić, K. Dekanić. International Convention on Information, Communication and Electronic Technology. 2020
5. Identification of tandem repeats: Simple and complex pattern structures in DNA sequences [D]. Hauth, Amy Michelle. 2002
6. Statistical Approaches to Detecting and Analyzing Tandem Repeats in Genomic Sequences [O]. Maria Anisimova, Julija Pečerska, Elke Schaper. 2015
7. Repeat or not repeat?—Statistical validation of tandem repeat prediction in genomic sequences [O]. Elke Schaper, Andrey V. Kajava, Alain Hauser. 2012
8. Identification of Novel Inverted Terminal Repeat (ITR) Deletions of Human Adenovirus (AD) From Infected Host: Virulent Ads Containing Mixed Populations of Genomic Sequences; Conference paper [R]. Houng, H. H., Binn, L., Kuschner, R. 2006
