Exploiting Knowledge Graph to Improve Text-based Prediction



Abstract

As a special kind of "big data," text data can be regarded as data reported by human sensors. Since humans are far more intelligent than physical sensors, text data contains useful information and knowledge about the real world, making it possible to make predictions about real-world phenomena based on text. As all application domains involve humans, text-based prediction has widespread applications, especially for optimization of decision making. While the problem of text-based prediction resembles text classification when formulated as a supervised learning problem, it is more challenging because the variable to be predicted may not be directly derivable from the text, and thus there is a semantic gap between the target variable and the surface features that are often used for representing text data in conventional approaches. In this paper, we propose to bridge this gap by using a knowledge graph to construct more effective features for text representation. We propose a two-step filtering algorithm to enhance such a knowledge-aware text representation for a family of entity-centric text regression tasks where the response variable can be treated as an attribute of a group of central entities. We evaluate the proposed algorithm on two revenue prediction tasks based on reviews. The results show that the proposed algorithm can effectively leverage knowledge graphs to construct interpretable features, leading to a significant improvement in prediction accuracy over traditional features.
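The abstract does not spell out the two-step filtering algorithm, so the following is only a minimal sketch of the general idea it describes: replacing surface word counts with features derived from a knowledge graph, aggregated over the reviews of one central entity. It is not the authors' method. The toy graph `TOY_KG`, the entity names, the movie-review setting, the substring-based entity matching, and all function names are assumptions made for illustration.

```python
from collections import Counter

# Hypothetical toy knowledge graph: entity name -> list of (relation, object) facts.
# The entities and facts are invented for this sketch.
TOY_KG = {
    "ava stone": [("occupation", "director"), ("known_for", "sci-fi")],
    "leo park": [("occupation", "composer")],
    "imax": [("type", "film_format")],
}


def surface_features(text):
    """Conventional bag-of-words counts over lowercase whitespace tokens."""
    return Counter(text.lower().split())


def knowledge_features(text, kg=TOY_KG):
    """Count knowledge-graph facts attached to entities mentioned in the text.

    Entity linking is naive substring matching, an assumption made only
    to keep the sketch short.
    """
    lowered = text.lower()
    feats = Counter()
    for entity, facts in kg.items():
        mentions = lowered.count(entity)
        if mentions == 0:
            continue
        feats[f"entity={entity}"] += mentions
        # Each fact about a mentioned entity becomes a readable feature name.
        for relation, obj in facts:
            feats[f"{relation}={obj}"] += mentions
    return feats


def entity_centric_features(reviews, kg=TOY_KG):
    """Aggregate knowledge features over all reviews of one central entity."""
    total = Counter()
    for review in reviews:
        total.update(knowledge_features(review, kg))
    return dict(total)


# Two invented reviews of the same (hypothetical) movie.
reviews = [
    "Ava Stone delivers again, and the Leo Park score is huge.",
    "Saw it in IMAX. Ava Stone knows how to use the format.",
]
features = entity_centric_features(reviews)
print(features)
```

Feature names such as `occupation=director` stay human-readable, which is one way to picture the interpretability the abstract mentions. A regression model for the response variable (revenue, in the paper's evaluation) would then be trained on vectors like `features`; that step, and any filtering of noisy graph facts, is left out here because the abstract gives no details.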


Bibliographic details

Source
IEEE International Conference on Big Data, 2018, pp. 1407-1416 (10 pages)
Authors
Shan Jiang; Chengxiang Zhai; Qiaozhu Mei
Original format: PDF
Keywords
Task analysis; Motion pictures; Feature extraction; Semantics; Bridges; Prediction algorithms; Data mining;


