Multi-instance clustering with applications to multi-instance prediction

Abstract

In multi-instance learning, each object is represented by a bag composed of multiple instances rather than by a single instance as in traditional learning. Previous work in this area has concerned only multi-instance prediction problems, where each bag is associated with a binary (classification) or real-valued (regression) label. Unsupervised multi-instance learning, where bags carry no labels, has not been studied. This paper addresses that problem and proposes a multi-instance clustering algorithm named Bamic. Briefly, by regarding bags as atomic data items and using some form of distance metric to measure distances between bags, Bamic adapts the popular k-Medoids algorithm to partition the unlabeled training bags into k disjoint groups of bags. Furthermore, based on the clustering results, a novel multi-instance prediction algorithm named Bartmip is developed. First, each bag is re-represented by a k-dimensional feature vector, where the value of the i-th feature is the distance between the bag and the medoid of the i-th group. With bags thus transformed into feature vectors, common supervised learners can then be used to learn from the transformed feature vectors, each associated with the original bag's label. Extensive experiments show that Bamic can effectively discover the underlying structure of the data set and that Bartmip works quite well on various kinds of multi-instance prediction problems.
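The two steps described in the abstract can be sketched in Python roughly as follows. This is a minimal illustration under stated assumptions, not the authors' implementation. The abstract only says "some form of distance metric", so the average Hausdorff-style distance used here is an assumed choice. The function names (`bag_distance`, `pairwise_bag_distances`, `k_medoids`, `to_feature_vectors`), the random initialization, and the toy data are all hypothetical.

```python
import numpy as np

def bag_distance(a, b):
    """Distance between two bags (arrays of shape [n_instances, n_features]).

    Assumption: an average Hausdorff-style distance. The abstract does not
    say which bag-level metric is used.
    """
    d = np.linalg.norm(a[:, None, :] - b[None, :, :], axis=2)
    # Average of each instance's distance to its nearest instance in the other bag.
    return (d.min(axis=1).sum() + d.min(axis=0).sum()) / (len(a) + len(b))

def pairwise_bag_distances(bags):
    """Symmetric matrix of distances between all pairs of bags."""
    n = len(bags)
    dist = np.zeros((n, n))
    for i in range(n):
        for j in range(i + 1, n):
            dist[i, j] = dist[j, i] = bag_distance(bags[i], bags[j])
    return dist

def k_medoids(dist, k, n_iter=100, seed=0):
    """Plain k-medoids on a precomputed distance matrix (bags are atomic items)."""
    rng = np.random.default_rng(seed)
    medoids = rng.choice(len(dist), size=k, replace=False)
    for _ in range(n_iter):
        labels = dist[:, medoids].argmin(axis=1)
        new_medoids = medoids.copy()
        for c in range(k):
            members = np.flatnonzero(labels == c)
            if members.size:
                # New medoid: the member with the smallest total distance to its group.
                within = dist[np.ix_(members, members)].sum(axis=1)
                new_medoids[c] = members[within.argmin()]
        if np.array_equal(new_medoids, medoids):
            break
        medoids = new_medoids
    labels = dist[:, medoids].argmin(axis=1)
    return medoids, labels

def to_feature_vectors(bags, medoid_bags):
    """Re-represent each bag as a k-vector of distances to the k medoid bags."""
    return np.array([[bag_distance(b, m) for m in medoid_bags] for b in bags])

# Toy data (made up for illustration): 12 bags of 2 to 5 three-dimensional instances.
rng = np.random.default_rng(1)
bags = [rng.normal(loc=0.0, size=(rng.integers(2, 6), 3)) for _ in range(6)]
bags += [rng.normal(loc=10.0, size=(rng.integers(2, 6), 3)) for _ in range(6)]

dist = pairwise_bag_distances(bags)
medoids, labels = k_medoids(dist, k=2)                            # clustering step
features = to_feature_vectors(bags, [bags[m] for m in medoids])   # re-representation step
```

Each row of `features` can then be paired with its bag's original label and passed to any ordinary supervised learner, which is the role the abstract assigns to the transformed feature vectors.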
