Comparing Sequences with Segment Rearrangements

Abstract

Computational genomics involves comparing sequences based on "similarity" for detecting evolutionary and functional relationships. Until very recently, available portions of the human genome sequence (and that of other species) were fairly short and sparse. Most sequencing effort was focused on genes and other short units; similarity between such sequences was measured based on character-level differences. However, with the advent of whole genome sequencing technology, there is an emerging consensus that the measure of similarity between long genome sequences must capture the rearrangements of large segments found in abundance in the human genome. In this paper, we abstract the general problem of computing sequence similarity in the presence of segment rearrangements. This problem is closely related to computing the smallest grammar for a string or the block edit distance between two strings. Our problem, like these other problems, is NP-hard. Our main result here is a simple O(1) factor approximation algorithm for this problem. In contrast, the best known approximations for the related problems are a factor Ω(log n) off from the optimal. Our algorithm works in linear time, and in one pass. In proving our result, we relate sequence similarity measures based on different segment rearrangements to each other, tight up to constant factors.
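To make the idea of segment-level comparison concrete, here is a minimal sketch in Python. It is an illustration only and is not taken from the paper: the function name `greedy_segment_cover`, the "copy"/"insert" step labels, and the greedy longest-match rule are all assumptions chosen for the example, and this naive version runs in quadratic time or worse rather than the linear time claimed in the abstract. It covers a target string from left to right using segments copied from a source string, falling back to single-character inserts.

```python
def greedy_segment_cover(source: str, target: str) -> list[tuple[str, str]]:
    """Cover `target` left to right with segments copied from `source`.

    Hypothetical illustration, not the paper's algorithm. At each position,
    take the longest prefix of the remaining target that occurs somewhere in
    `source` (a "copy" step). If even the next single character is absent
    from `source`, emit it as an "insert" step.
    """
    steps: list[tuple[str, str]] = []
    i = 0
    while i < len(target):
        length = 0
        # Grow the match while the extended substring still occurs in source.
        # Substrings of a match are also matches, so this finds the longest one.
        while i + length < len(target) and target[i:i + length + 1] in source:
            length += 1
        if length == 0:
            steps.append(("insert", target[i]))
            i += 1
        else:
            steps.append(("copy", target[i:i + length]))
            i += length
    return steps


if __name__ == "__main__":
    # Two swapped blocks plus one new character.
    for step in greedy_segment_cover("abcdef", "defabcx"):
        print(step)
```

On the pair "abcdef" and "defabcx", the sketch needs only three steps (two copies and one insert), whereas a character-level comparison would charge for every shifted position. That gap is the kind of difference a rearrangement-aware measure is meant to capture.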

Bibliographic details

Source: 23rd Conference on Foundations of Software Technology and Theoretical Computer Science (FST TCS 2003); Dec 15-17, 2003; Mumbai, India | 2003 | pp. 183-194 | 12 pages
Conference location: Mumbai (IN)
Authors: Funda Ergun; S. Muthukrishnan; S. Cenk Sahinalp
Author affiliation: Department of EECS, CWRU
Original format: PDF
Text language: English
Chinese Library Classification: Automation technology, computer technology

Similar literature

1. Comparing genomes with rearrangements and segmental duplications [J]. Shao Mingfu, Moret Bernard M. E. Bioinformatics. 2015, no. 12
2. Complexity of genome evolution by segmental rearrangement in Brassica rapa revealed by sequence-level analysis [J]. Martin Trick, Soo-Jin Kwon, Su R Choi. BMC Genomics. 2009, no. 1
3. The genomic distribution of intraspecific and interspecific sequence divergence of human segmental duplications relative to human/chimpanzee chromosomal rearrangements [J]. Tomàs Marques-Bonet, Ze Cheng, Xinwei She. BMC Genomics. 2008, no. 1
4. Comparing Sequences with Segment Rearrangements [C]. Funda Ergun, S. Muthukrishnan, S. Cenk Sahinalp. Conference on Foundations of Software Technology and Theoretical Computer Science. 2003
5. Rearrangements of repeated DNA sequences in Escherichia coli [D]. Segal Morag, Aviv. 2004
6. Comparing genomes with rearrangements and segmental duplications [O]. Mingfu Shao, Bernard M.E. Moret
7. Comparing Sequences with Segment Rearrangements [O]. 2008
