Detecting overlapping coding sequences with pairwise alignments

Firth AE; Brown CM

Abstract

Motivation: Overlapping gene coding sequences (CDSs) are particularly common in viruses but also occur in more complex genomes. Detecting such genes with conventional gene-finding algorithms can be difficult for several reasons. If an overlapping CDS is on the same read-strand as a known CDS, then there may not be a distinct promoter or mRNA. Furthermore, the constraints imposed by double-coding can result in atypical codon biases. However, these same constraints lead to particular mutation patterns that may be detectable in sequence alignments.

Results: In this paper, we investigate several statistics for detecting double-coding sequences with pairwise alignments, including a new maximum-likelihood method. We also develop a model for double-coding sequence evolution. Using simulated sequences generated with the model, we characterize the distribution of each statistic as a function of sequence composition, length, divergence time and double-coding frame. Using these results, we develop several algorithms for detecting overlapping CDSs. The algorithms were tested on known overlapping CDSs and other overlapping open reading frames (ORFs) in the hepatitis B virus (HBV), Escherichia coli and Salmonella typhimurium genomes. The algorithms should prove useful for detecting novel overlapping genes, especially short coding ORFs in viruses.
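To make the idea of frame-specific mutation patterns concrete, the following is a minimal sketch of one very simple statistic: counting synonymous and nonsynonymous codon differences between two aligned sequences in a chosen reading frame. It is not the maximum-likelihood method or any of the algorithms from the paper. The function name `codon_changes`, the assumption of an ungapped alignment, and the use of the standard genetic code are all choices made here for illustration only.

```python
from itertools import product

# Standard genetic code, codons enumerated in TCAG order (assumption: no
# alternative translation table is needed for the sequences of interest).
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): a for c, a in zip(product(BASES, repeat=3), AMINO)}


def codon_changes(seq_a, seq_b, frame):
    """Return (synonymous, nonsynonymous) counts of differing codons.

    Hypothetical helper: seq_a and seq_b are equal-length, ungapped,
    upper-case DNA strings; frame is the codon offset (0, 1 or 2).
    Codons containing characters outside TCAG are skipped.
    """
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must be aligned to the same length")
    syn = nonsyn = 0
    for i in range(frame, len(seq_a) - 2, 3):
        codon_a, codon_b = seq_a[i:i + 3], seq_b[i:i + 3]
        if codon_a == codon_b:
            continue
        if codon_a not in CODON_TABLE or codon_b not in CODON_TABLE:
            continue
        if CODON_TABLE[codon_a] == CODON_TABLE[codon_b]:
            syn += 1
        else:
            nonsyn += 1
    return syn, nonsyn


if __name__ == "__main__":
    a = "ATGGCTAAA"
    b = "ATGGCCAAG"
    for frame in range(3):
        print(frame, codon_changes(a, b, frame))
```

In this toy pair the differences are silent in frame 0 but change the amino acid in the other two frames. A real analysis would need to account for sequence composition, length and divergence, which the abstract identifies as factors affecting the distribution of each statistic.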

Bibliographic details

Source: Bioinformatics, 2005, Issue 3, 11 pages
Authors: Firth AE; Brown CM
Original format: PDF
Language: English (eng)
CLC classification: Biological sciences
Keywords: GENES; VIRUS; SUBSTITUTION

Similar documents

1. Detecting overlapping coding sequences with pairwise alignments [J]. Firth AE, Brown CM. Bioinformatics. 2005, Issue 3.
2. CombAlign: a code for generating a one-to-many sequence alignment from a set of pairwise structure-based sequence alignments [J]. Carol L. Ecale Zhou. Source Code for Biology and Medicine. 2015, Issue 1.
3. Detecting overlapping coding sequences in virus genomes [J]. Andrew E Firth, Chris M Brown. BMC Bioinformatics. 2006, Issue 1.
4. Pairwise sequence alignment for very long sequences on GPUs [C]. Junjie Li, Ranka S., Sahni S. Computational Advances in Bio and Medical Sciences (ICCABS), 2012 IEEE 2nd International Conference on. 2012.
5. Generic C++ implementations of pairwise sequence alignment: Instantiation for global alignment [D]. Zhang, Yan. 2003.
6. CombAlign: a code for generating a one-to-many sequence alignment from a set of pairwise structure-based sequence alignments [O]. Carol L. Ecale Zhou. 2015.
7. CombAlign: a code for generating a one-to-many sequence alignment from a set of pairwise structure-based sequence alignments [O]. 2015.