ALL HITS ALL THE TIME: PARAMETER FREE CALCULATION OF SEED SENSITIVITY

Abstract

Standard search techniques for DNA repeats start by identifying seeds, that is, small matching words, that may inhabit larger repeats. Recent innovations in seed structure have led to the development of spaced seeds [8] and indel seeds [9], which are more sensitive than contiguous seeds (also known as k-mers, k-tuples, l-words, etc.). Evaluating seed sensitivity requires 1) specifying a homology model which describes the types of alignments that can occur between two copies of a repeat, and 2) assigning probabilities to those alignments. Optimal seed selection is a resource-intensive activity because essentially all alternative seeds must be tested [7]. Current methods require that the model and probability parameters be specified in advance. When the parameters change, the entire calculation has to be rerun. In this paper, we show how to eliminate the need for prior parameter specification. The ideas presented follow from a simple observation: given a homology model, the alignments hit by a particular seed remain the same regardless of the probability parameters. Only the weights assigned to those alignments change. Therefore, if we know all the hits, we can easily (and quickly) find optimal seeds. We describe a highly efficient preprocessing step, which is computed just once for each seed. In this calculation, strings which represent possible alignments are unweighted by any probability parameters. Then we show several increasingly efficient methods to find the optimal seed when given specific probability parameters. Indeed, we show how to determine exactly which seeds can never be optimal under any set of probability parameters. This leads to the startling observation that out of thousands of seeds, only a handful have any chance of being optimal. We then show how to find optimal seeds and the boundaries within probability space where they are optimal. We expect this method to greatly facilitate the study of seed space sensitivity, construction of multiple seed sets, and the use of alternative definitions of optimality.
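
The central observation can be illustrated with a small brute-force sketch. It is not the authors' preprocessing algorithm, which the abstract describes only as highly efficient. It assumes a simple Bernoulli homology model in which an alignment is a 0/1 string (1 = match) and each position matches independently with probability p. The function names `seed_hits`, `hit_counts`, and `sensitivity` are hypothetical. The hit counts are computed once per seed without any probability parameter; sensitivity for any p is then a cheap weighted sum over those counts.

```python
from itertools import product


def seed_hits(seed: str, alignment: str) -> bool:
    """Return True if the seed hits the 0/1 alignment at some offset.

    A '1' in the seed requires a match at that position; a '0' is a
    don't-care position (spaced seed).
    """
    span = len(seed)
    for start in range(len(alignment) - span + 1):
        if all(alignment[start + i] == "1"
               for i, c in enumerate(seed) if c == "1"):
            return True
    return False


def hit_counts(seed: str, length: int) -> list[int]:
    """Parameter-free step: counts[k] is the number of alignments of the
    given length with exactly k matches that the seed hits.

    Brute force over all 2**length alignments, so only for small lengths.
    """
    counts = [0] * (length + 1)
    for bits in product("01", repeat=length):
        alignment = "".join(bits)
        if seed_hits(seed, alignment):
            counts[alignment.count("1")] += 1
    return counts


def sensitivity(counts: list[int], p: float) -> float:
    """Weight the precomputed hit counts by a match probability p
    (assumed Bernoulli model): sum_k counts[k] * p^k * (1-p)^(L-k)."""
    length = len(counts) - 1
    return sum(c * p**k * (1 - p) ** (length - k)
               for k, c in enumerate(counts))


if __name__ == "__main__":
    # Counts are computed once per seed, then reused for every p.
    table = {s: hit_counts(s, 10) for s in ("111", "1101")}
    for p in (0.5, 0.7, 0.9):
        for s, counts in table.items():
            print(f"seed={s} p={p} sensitivity={sensitivity(counts, p):.4f}")
```

Changing p in this sketch only re-evaluates the weighted sum; the enumeration of hits is never repeated, which is the property the abstract relies on.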


Bibliographic details

Source: Asia-Pacific Bioinformatics Conference, 2007, 14 pages
Authors: DENISE Y.F. MAK; GARY BENSON
Original format: PDF
Chinese Library Classification: Bioinformatics

