Journal: BMC Bioinformatics

Inferring latent task structure for Multitask Learning by Multiple Kernel Learning



Abstract

Background: The lack of sufficient training data is the limiting factor for many Machine Learning applications in Computational Biology. If data is available for several different but related problem domains, Multitask Learning algorithms can be used to learn a model based on all available information. In Bioinformatics, many problems can be cast into the Multitask Learning scenario by incorporating data from several organisms. However, combining information from several tasks requires careful consideration of the degree of similarity between tasks. Our proposed method simultaneously learns or refines the similarity between tasks along with the Multitask Learning classifier. This is done by formulating the Multitask Learning problem as Multiple Kernel Learning, using the recently published q-Norm MKL algorithm.

Results: We demonstrate the performance of our method on two problems from Computational Biology. First, we show that our method is able to improve performance on a splice site dataset with a given hierarchical task structure by refining the task relationships. Second, we consider an MHC-I dataset, for which we assume no knowledge about the degree of task relatedness. Here, we are able to learn the task similarities ab initio along with the Multitask classifiers. In both cases, our method outperforms the baseline methods we compare against.

Conclusions: We present a novel approach to Multitask Learning that is capable of learning task similarity along with the classifiers. The framework is very general: it allows prior knowledge about task relationships to be incorporated if available, but it is also able to identify task similarities in the absence of such prior information. Both variants show promising results in applications from Computational Biology.
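The abstract does not spell out how the Multitask Learning problem is written as Multiple Kernel Learning. The following is a minimal sketch of one plausible construction, not the authors' implementation. It assumes that each candidate group of tasks contributes one kernel that is non-zero only for pairs of examples whose tasks both belong to that group, and that the learned similarity is the vector of non-negative weights on these kernels, constrained to unit q-norm. The names `rbf_kernel`, `multitask_kernel`, `normalize_q`, `task_sets`, and `beta` are hypothetical, and the weights are simply fixed here rather than optimized by an MKL solver.

```python
import numpy as np

def rbf_kernel(X, Z, gamma=1.0):
    """Gaussian base kernel between the rows of X and the rows of Z."""
    sq = (X ** 2).sum(1)[:, None] + (Z ** 2).sum(1)[None, :] - 2.0 * X @ Z.T
    return np.exp(-gamma * np.maximum(sq, 0.0))

def normalize_q(beta, q):
    """Rescale kernel weights to unit q-norm (assumed MKL constraint)."""
    beta = np.asarray(beta, dtype=float)
    return beta / np.linalg.norm(beta, ord=q)

def multitask_kernel(X, tasks, task_sets, beta, gamma=1.0):
    """Weighted sum of task-group kernels.

    Entry (i, j) is sum_m beta[m] * [task_i in S_m] * [task_j in S_m] * k(x_i, x_j),
    so two examples share information only through groups containing both tasks.
    """
    base = rbf_kernel(X, X, gamma)
    K = np.zeros_like(base)
    for weight, group in zip(beta, task_sets):
        member = np.isin(tasks, list(group)).astype(float)
        K += weight * np.outer(member, member) * base
    return K

# Toy data: four examples, two tasks (for instance two organisms).
rng = np.random.default_rng(0)
X = rng.normal(size=(4, 3))
tasks = np.array([0, 0, 1, 1])

# Hypothetical candidate groups: each task alone, plus both tasks together.
task_sets = [{0}, {1}, {0, 1}]
beta = normalize_q([1.0, 1.0, 1.0], q=2)

K = multitask_kernel(X, tasks, task_sets, beta)
```

In this sketch a large weight on the group `{0, 1}` means the two tasks share most of their information, while a weight near zero makes them nearly independent. A kernel matrix built this way can be passed to any kernel classifier that accepts a precomputed Gram matrix; an MKL solver would additionally update `beta` during training.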
