Foreign conference: Proceedings of the 5th international workshop on Bioinformatics
Motif discovery for proteins using subsequence clustering

Abstract

We propose an algorithm for discovering motifs by clustering subsequences. In our previous approach, we successfully guided motif discovery by sampling subsequences and feeding them to MEME, an existing motif discovery tool. In this paper, we show that clustering subsequences can also detect motifs without relying on other motif discovery tools. Motif discovery algorithms generally perform poorly when the input set consists of non-homogeneous sequences, whereas clustering tools can inherently group non-homogeneous input sequences into clusters of homogeneous sequences. For this reason, we use our clustering algorithm to generate aligned subsequence clusters and then rank them by information content to produce the final motifs. The algorithm was tested on the PROSITE database, and the results suggest that it is very effective at finding motifs even when the input sequences come from different protein families.
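The ranking step mentioned in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the abstract does not say how information content is computed, so the code below assumes the common definition (per-column log2 of the alphabet size minus Shannon entropy, summed over columns) with a uniform background over the 20 standard amino acids and no small-sample correction. The function names `information_content` and `rank_clusters` are hypothetical, and the clustering step itself is not shown.

```python
import math
from collections import Counter

# Assumption: uniform background over the 20 standard amino acids.
ALPHABET_SIZE = 20


def information_content(cluster):
    """Total information content (bits) of equal-length aligned subsequences."""
    if not cluster:
        raise ValueError("cluster must not be empty")
    width = len(cluster[0])
    if any(len(s) != width for s in cluster):
        raise ValueError("all subsequences must have the same length")
    total = 0.0
    # zip(*cluster) walks the alignment one column at a time.
    for column in zip(*cluster):
        counts = Counter(column)
        n = len(column)
        entropy = -sum((c / n) * math.log2(c / n) for c in counts.values())
        total += math.log2(ALPHABET_SIZE) - entropy
    return total


def rank_clusters(clusters):
    """Sort clusters from highest to lowest information content."""
    return sorted(clusters, key=information_content, reverse=True)


if __name__ == "__main__":
    # Toy data, invented purely for illustration.
    conserved = ["ACDE", "ACDE", "ACDF"]
    mixed = ["ACDE", "WYHK", "MNPQ"]
    for cluster in rank_clusters([mixed, conserved]):
        print(cluster, round(information_content(cluster), 2))
```

Under this definition a fully conserved column contributes log2(20) bits and a column with many different residues contributes less, so tightly conserved clusters rank first. Very small clusters tend to score high by chance, which a real implementation would likely need to correct for.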
