Journal of Bioinformatics and Computational Biology

A novel pattern matching algorithm for genomic patterns related to protein motifs


Abstract

Patterns in protein and genomic sequences have been extensively analyzed, extracted, and collected in databases. Although protein patterns originate from genomic coding regions, very few studies have dealt, directly or indirectly, with coding-region patterns induced from protein patterns. Results: In this paper, we define a new genomic pattern structure suitable for representing patterns induced from proteins. This structure, called the "Consecutive Positions Scoring Matrix (CPSSM)", replaces protein patterns and profiles in the genomic context. CPSSMs can be identified, discovered, and searched for in genomes. We then present a novel algorithm, based on dynamic programming, for matching the defined genomic pattern against genomic sequences. In addition, we modify the algorithm to support intronic gaps and very large sequences. We have implemented the algorithm and tested it on real data. The results on the Saccharomyces cerevisiae genome show 132% more true positives and no false negatives, and the results on the human genome show no false negatives and 10 times as many true positives as those in previous works. Conclusion: CPSSM and the provided methods could be used for open reading frame detection and gene finding. The application, with source code, is available to run and download at http://app.foroughmand.ir/cpssm/.
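The abstract does not give the exact definition of a CPSSM or the details of the matching algorithm, so the following is only an illustrative sketch of the general idea, not the authors' method. It assumes that a protein pattern is a list of sets of allowed amino acids, that each position induces a score of +1 or -1 per codon under the standard genetic code, and that an intronic gap costs a flat penalty and can occur only between codons on the forward strand. The names `induce_codon_scores`, `best_match`, and `gap_penalty` are hypothetical.

```python
from itertools import product

# Standard genetic code, codons enumerated in TCAG order (TTT, TTC, TTA, ...).
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO)}


def induce_codon_scores(protein_pattern):
    """Turn a protein pattern into per-position codon scores.

    Assumption: a codon scores +1.0 if it encodes an allowed residue
    at that position, and -1.0 otherwise.
    """
    return [
        {codon: (1.0 if aa in allowed else -1.0) for codon, aa in CODON_TABLE.items()}
        for allowed in protein_pattern
    ]


def best_match(scores, seq, gap_penalty=2.0):
    """Dynamic programming over (pattern position, sequence position).

    Returns (best_score, end), where `end` is the exclusive end index of
    the last matched codon. Consecutive codons are either adjacent or
    separated by a gap of at least one base, which costs `gap_penalty`
    (a simplification: real introns can also split a codon).
    """
    n = len(seq)
    neg = float("-inf")
    # prev[j]: best score for the codons matched so far, last codon ending at j.
    prev = [0.0] * (n + 1)  # nothing matched yet, so a match may start anywhere
    for i, pos_scores in enumerate(scores):
        cur = [neg] * (n + 1)
        best_gapped = neg  # running max of prev[k] for k <= j - 4
        for j in range(3, n + 1):
            if i > 0 and j >= 4:
                best_gapped = max(best_gapped, prev[j - 4])
            start = prev[j - 3]  # previous codon ends right where this one begins
            if i > 0:
                start = max(start, best_gapped - gap_penalty)
            if start > neg:
                # Unknown triplets (for example containing N) count as mismatches.
                cur[j] = start + pos_scores.get(seq[j - 3:j], -1.0)
        prev = cur
    end = max(range(n + 1), key=lambda j: prev[j])
    return prev[end], end


if __name__ == "__main__":
    pattern = induce_codon_scores([{"M"}, {"K"}])
    print(best_match(pattern, "CCATGAAATT"))
    print(best_match(pattern, "ATGCCCCAAA", gap_penalty=0.5))
```

Keeping a running maximum over earlier end positions makes the gapped transition constant-time per cell, so the scan costs time proportional to the pattern length times the sequence length. A fuller treatment would also scan the reverse strand and handle sequences too large to hold in memory, which the abstract mentions but does not describe.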
