Venue: IEEE International Conference on Bioinformatics and Biomedicine

Cluster tree based multi-label classification for protein function prediction

Abstract

Automatically assigning functions to proteins of unknown function is a key task in computational biology. A protein can belong to multiple classes according to the functions it performs, so protein function prediction has often been cast as a multi-label learning problem. This paper proposes a novel Cluster Tree based Multi-label Learning algorithm (CTML) for protein function prediction. The main idea is to compute, at each node of the tree, a set of predictive labels for multi-label prediction, using k-means clustering and predictive functions learned from the training data at that node. Propagating the predictive labels from the root node to the leaf nodes preserves the correlations between labels. Experimental results on benchmark data (the genbase and yeast datasets) show that the proposed CTML algorithm is effective in predicting protein functions. Moreover, its classification performance is competitive with that of the baseline multi-label learning algorithms.
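The abstract does not spell out the algorithm, so the following is only a rough sketch of the general idea of a cluster tree for multi-label prediction, not the authors' CTML. Everything here is an assumption made for illustration: the names `two_means`, `node_labels`, `build`, and `predict`, the use of k = 2 at every split, the frequency threshold that stands in for the "predictive function" at a node, and the rule that a query descends to the nearest centroid and receives the label set of the leaf it reaches.

```python
from dataclasses import dataclass, field

def sq_dist(a, b):
    """Squared Euclidean distance between two feature vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))

def nearest(p, centroids):
    """Index of the centroid closest to p (ties go to the lower index)."""
    return min(range(len(centroids)), key=lambda k: sq_dist(p, centroids[k]))

def two_means(X, iters=20):
    """Plain k-means with k = 2 and deterministic seeding (an assumption)."""
    c0 = X[0]
    c1 = max(X, key=lambda p: sq_dist(p, c0))  # farthest point from the first one
    centroids = [list(c0), list(c1)]
    for _ in range(iters):
        assign = [nearest(p, centroids) for p in X]
        for k in (0, 1):
            members = [p for p, a in zip(X, assign) if a == k]
            if members:
                centroids[k] = [sum(col) / len(members) for col in zip(*members)]
    return centroids, [nearest(p, centroids) for p in X]

def node_labels(Y, threshold):
    """Hypothetical stand-in for a node's predictive function:
    keep labels carried by at least `threshold` of the node's samples."""
    counts = {}
    for labels in Y:
        for label in labels:
            counts[label] = counts.get(label, 0) + 1
    return {label for label, c in counts.items() if c / len(Y) >= threshold}

@dataclass
class Node:
    labels: set                                   # predictive labels at this node
    centroids: list = field(default_factory=list)
    children: list = field(default_factory=list)

def build(X, Y, min_size=3, depth=3, threshold=0.5):
    """Recursively split the training data with 2-means, storing labels per node."""
    node = Node(labels=node_labels(Y, threshold))
    if depth == 0 or len(X) <= min_size:
        return node
    centroids, assign = two_means(X)
    parts = [[i for i, a in enumerate(assign) if a == k] for k in (0, 1)]
    if not parts[0] or not parts[1]:              # degenerate split: stop here
        return node
    node.centroids = centroids
    node.children = [
        build([X[i] for i in idx], [Y[i] for i in idx], min_size, depth - 1, threshold)
        for idx in parts
    ]
    return node

def predict(node, x):
    """Walk from the root to a leaf via nearest centroids; return the leaf's labels."""
    while node.children:
        node = node.children[nearest(x, node.centroids)]
    return node.labels

# Toy data (made up for illustration, not from genbase or yeast).
X = [[0, 0], [0, 1], [1, 0], [10, 10], [10, 11], [11, 10]]
Y = [{"a", "b"}, {"a", "b"}, {"a"}, {"c"}, {"c", "b"}, {"c"}]
tree = build(X, Y)
print(sorted(predict(tree, [0.2, 0.1])))    # ['a', 'b']
print(sorted(predict(tree, [10.5, 10.2])))  # ['c']
```

In this sketch the label set narrows as a query moves down the tree: the root keeps every label that is frequent overall, while each leaf keeps only the labels that co-occur within its cluster. That is one simple way label co-occurrence could be carried from root to leaf; the paper's actual propagation rule may differ.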
