IET International Conference on Information Science and Control Engineering

A NEW METHOD FOR CLUSTERING MULTI-DOMAIN PROTEIN SEQUENCES

Abstract

A new method for clustering multi-domain protein sequences is proposed. It revises the preference value of the classical affinity propagation (AP) algorithm using the Silhouette index of clustering validity. In addition, the classical substitution match similarity (SMS) between two protein sequences is generalized to meet the demands of clustering 'twilight zone' protein sequences. Experimental results on four test datasets show that the method yields a number of clusters closer to the number of families identified by phylogenetic trees, a more consistent clustering structure for a given protein dataset, and a comparative advantage in clustering multi-domain protein sequences.
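
The abstract does not state the exact rule used to revise the preference value, nor the generalized SMS formula. The following is only a minimal sketch of the general idea of choosing an AP preference with the Silhouette index. It assumes a precomputed pairwise similarity matrix with values in [0, 1], uses 1 - similarity as the dissimilarity, and tries a plain evenly spaced grid of candidate preferences. The function names, the grid, and the damping setting are illustrative assumptions, not the authors' procedure.

```python
def affinity_propagation(S, preference, damping=0.7, iters=200):
    """Plain affinity propagation on a square similarity matrix (list of lists)."""
    n = len(S)
    # Copy S and place the shared preference on the diagonal.
    S = [[preference if i == k else S[i][k] for k in range(n)] for i in range(n)]
    R = [[0.0] * n for _ in range(n)]  # responsibilities
    A = [[0.0] * n for _ in range(n)]  # availabilities
    for _ in range(iters):
        for i in range(n):  # responsibility update (damped)
            vals = [A[i][k] + S[i][k] for k in range(n)]
            best = max(range(n), key=vals.__getitem__)
            second = max(v for k, v in enumerate(vals) if k != best)
            for k in range(n):
                new = S[i][k] - (second if k == best else vals[best])
                R[i][k] = damping * R[i][k] + (1 - damping) * new
        for k in range(n):  # availability update (damped)
            pos = [max(0.0, R[i][k]) for i in range(n)]
            pos[k] = 0.0
            total = sum(pos)
            for i in range(n):
                new = total if i == k else min(0.0, R[k][k] + total - pos[i])
                A[i][k] = damping * A[i][k] + (1 - damping) * new
    exemplars = [k for k in range(n) if R[k][k] + A[k][k] > 0]
    if not exemplars:
        return None
    # Each point joins its most similar exemplar; exemplars label themselves.
    return [i if i in exemplars else max(exemplars, key=lambda k: S[i][k])
            for i in range(n)]


def silhouette(D, labels):
    """Mean Silhouette index from a dissimilarity matrix; None if undefined."""
    n = len(D)
    clusters = {}
    for i, lab in enumerate(labels):
        clusters.setdefault(lab, []).append(i)
    if len(clusters) < 2 or len(clusters) == n:
        return None
    total = 0.0
    for i, lab in enumerate(labels):
        own = clusters[lab]
        if len(own) == 1:
            continue  # a singleton contributes 0
        a = sum(D[i][j] for j in own if j != i) / (len(own) - 1)
        b = min(sum(D[i][j] for j in other) / len(other)
                for key, other in clusters.items() if key != lab)
        if max(a, b) > 0:
            total += (b - a) / max(a, b)
    return total / n


def select_preference(S, steps=5):
    """Try evenly spaced preferences (assumed grid); keep the best Silhouette."""
    n = len(S)
    off = [S[i][k] for i in range(n) for k in range(n) if i != k]
    lo, hi = min(off), max(off)
    # Assumption: similarities lie in [0, 1], so 1 - s is a dissimilarity.
    D = [[0.0 if i == k else 1.0 - S[i][k] for k in range(n)] for i in range(n)]
    best = (None, None, None)
    for t in range(steps):
        pref = lo + (hi - lo) * t / (steps - 1)
        labels = affinity_propagation(S, pref)
        score = silhouette(D, labels) if labels else None
        if score is not None and (best[2] is None or score > best[2]):
            best = (pref, labels, score)
    return best


# Toy usage: made-up similarities from positions on a line, not protein data.
positions = [0, 1, 2, 10, 11, 12]
toy_S = [[1.0 / (1.0 + abs(a - b)) for b in positions] for a in positions]
pref, labels, score = select_preference(toy_S)
print(pref, labels, score)
```

In practice the similarity matrix would come from a sequence similarity measure such as SMS, and the candidate preferences could be refined around the best-scoring value rather than fixed in advance.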