International Journal of Innovative Computing Information and Control

MINING SEQUENCE MOTIFS FROM PROTEIN DATABASES BASED ON A BIT PATTERN APPROACH

Abstract

Proteins are the structural components of living cells and tissues, and thus an important building block of all living organisms. Sequence motifs are subsequences that appear frequently in proteins. Motifs often denote important functional regions and can be used to characterize a protein family or to discover the function of a protein. The SP-index algorithm was proposed to find sequence motifs containing gaps of arbitrary size. To find motifs, it constructs B-trees that index the positions at which short segments occur. To check whether a long pattern composed of short segments appears frequently, the SP-index algorithm must then test a large number of nodes in those B-trees, which can be inefficient. In this paper, we therefore propose the BitPattern-based (BP) algorithm to improve on the efficiency of the SP-index algorithm. First, the BP algorithm transforms the protein sequences into bit patterns. Then, instead of testing a large number of B-tree nodes as the SP-index algorithm does, the BP algorithm uses bit operations (AND, OR, shifting, and masking) to find sequence motifs efficiently. The BP algorithm also performs a pruning step to reduce the processing time. Experimental results on biological and synthetic data sets show that the BP algorithm requires less processing time than the SP-index algorithm.
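The abstract does not spell out how the bit patterns are built or combined, so the following is only a minimal Python sketch of the general idea, not the BP algorithm itself. It assumes one bitmask per residue per sequence, locates short segments with shift and AND, and checks a two-segment pattern with an arbitrary gap by shifting away the positions before the earliest occurrence of the first segment. The function names (encode, segment_positions, occurs_with_gap, support) are hypothetical, and the pruning step is omitted.

```python
from collections import defaultdict


def encode(seq):
    """Assumed encoding: one integer per residue, bit i set when seq[i] is that residue."""
    masks = defaultdict(int)
    for i, residue in enumerate(seq):
        masks[residue] |= 1 << i
    return masks


def segment_positions(masks, segment):
    """Return an integer whose bit i is set when `segment` starts at position i."""
    result = masks.get(segment[0], 0)
    for offset, residue in enumerate(segment[1:], start=1):
        # Shift the next residue's mask so that it lines up with the segment start.
        result &= masks.get(residue, 0) >> offset
    return result


def occurs_with_gap(masks, first, second):
    """True when `second` starts at or after the end of some occurrence of `first`."""
    a = segment_positions(masks, first)
    b = segment_positions(masks, second)
    if a == 0 or b == 0:
        return False
    # Lowest set bit of `a` is the earliest start of `first`.
    earliest_end = (a & -a).bit_length() - 1 + len(first)
    # Discard start positions of `second` that fall before that end.
    return (b >> earliest_end) != 0


def support(sequences, first, second):
    """Count the sequences that contain `first`, any gap, then `second`."""
    return sum(occurs_with_gap(encode(s), first, second) for s in sequences)


if __name__ == "__main__":
    toy_sequences = ["ACDEFGACD", "GGACKKKFG", "FGAC"]  # made-up data
    print(support(toy_sequences, "AC", "FG"))  # 2
```

In the toy data, the third sequence contains both segments but in the wrong order, so it does not count toward the support. A pattern would be reported as a motif when its support reaches a user-chosen threshold.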
