Venue: IEEE International Conference on Data Science and Advanced Analytics

Entity-Level Stream Classification: Exploiting Entity Similarity to Label the Future Observations Referring to an Entity


Abstract

Stream classification algorithms traditionally treat arriving observations as independent. However, in many applications the arriving examples may depend on the "entity" that generated them, e.g., in product reviews or in the interactions of users with an application server. In this study, we investigate the potential of this dependency by partitioning the original stream of observations into entity-centric substreams and by incorporating entity-specific information into the learning model. We propose a k-Nearest-Neighbour-inspired stream classification approach (kNN), in which the label of an arriving observation is predicted by exploiting knowledge about the observations belonging to this entity and to entities similar to it. For the computation of entity similarity, we consider knowledge about the observations and knowledge about the entity, potentially transferred from another domain. To distinguish between cases where this kind of knowledge transfer is beneficial for stream classification and cases where the knowledge about the entities does not contribute to classifying the observations, we also propose a heuristic approach based on random sampling of substreams using k Random Entities (kRE). Our learning scenario is not fully supervised: after acquiring labels for the initial few observations of each entity, we assume that no additional labels arrive, and attempt to predict the labels of near-future and far-future observations from that initial seed. We report on our findings from three datasets.
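The idea described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation: the abstract does not specify the similarity measure, the entity representation, or the voting rule, so the choices here are assumptions made for illustration only. Each entity is summarised by the mean feature vector of its labelled seed observations, entity similarity is cosine similarity between those profiles, and an arriving observation is labelled by majority vote over its nearest seed observations pooled from the entity itself and its k neighbour entities. All function names and the `n_obs` parameter are hypothetical. `k_random_entities` mimics the kRE idea by drawing the neighbour entities at random instead of by similarity.

```python
import random
from collections import Counter, defaultdict
from math import dist, sqrt


def partition_by_entity(stream):
    """Split a stream of (entity_id, features, label) into per-entity substreams."""
    substreams = defaultdict(list)
    for entity_id, features, label in stream:
        substreams[entity_id].append((features, label))
    return dict(substreams)


def entity_profile(seed):
    """Assumed entity representation: mean feature vector of the seed observations."""
    vectors = [features for features, _ in seed]
    return [sum(column) / len(vectors) for column in zip(*vectors)]


def cosine(u, v):
    """Cosine similarity; returns 0.0 if either vector has zero norm."""
    norm = sqrt(sum(a * a for a in u)) * sqrt(sum(b * b for b in v))
    return sum(a * b for a, b in zip(u, v)) / norm if norm else 0.0


def k_similar_entities(target, profiles, k):
    """The k entities whose profiles are most similar to the target's profile."""
    others = [e for e in profiles if e != target]
    others.sort(key=lambda e: cosine(profiles[target], profiles[e]), reverse=True)
    return others[:k]


def k_random_entities(target, profiles, k, rng):
    """kRE-style baseline: k entities drawn at random, ignoring similarity."""
    others = [e for e in profiles if e != target]
    return rng.sample(others, min(k, len(others)))


def predict_label(features, target, neighbours, seeds, n_obs=3):
    """Majority vote over the n_obs nearest seed observations (Euclidean distance),
    pooled from the target entity and its neighbour entities."""
    pool = [obs for e in [target, *neighbours] for obs in seeds[e]]
    nearest = sorted(pool, key=lambda obs: dist(features, obs[0]))[:n_obs]
    return Counter(label for _, label in nearest).most_common(1)[0][0]


# Toy labelled seed: two observations per entity, interleaved as in a stream.
stream = [
    ("a", [1.0, 0.0], "pos"), ("c", [0.0, 1.0], "neg"), ("b", [0.8, 0.2], "pos"),
    ("a", [0.9, 0.1], "pos"), ("c", [0.1, 0.9], "neg"), ("b", [1.0, 0.1], "pos"),
]
seeds = partition_by_entity(stream)
profiles = {entity: entity_profile(seed) for entity, seed in seeds.items()}

neighbours = k_similar_entities("a", profiles, k=1)
prediction = predict_label([0.85, 0.15], "a", neighbours, seeds)
baseline = k_random_entities("a", profiles, 1, random.Random(0))
```

Comparing predictions made with `neighbours` against predictions made with `baseline` is one way to check, in the spirit of the kRE heuristic, whether entity similarity contributes anything beyond pooling observations from arbitrary entities.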
