Mining Semantic Relationships between Concepts across Documents Incorporating Wikipedia Knowledge

Abstract

The ongoing astounding growth of text data has created an enormous need for fast and efficient text mining algorithms. Traditional approaches to document representation are mostly based on the Bag of Words (BOW) model, which treats a document as an unordered collection of words. However, in fine-grained information discovery tasks such as mining semantic relationships between concepts, relying solely on the BOW representation may not be sufficient to identify all potential relationships, because the resulting associations are limited to concepts that appear literally in the document collection. In this paper, we attempt to complement the existing information in the corpus by proposing a new hybrid approach that mines semantic associations between concepts across multiple text units by incorporating extensive knowledge from Wikipedia. The experimental evaluation demonstrates that search performance is significantly enhanced in terms of accuracy and coverage compared with a purely BOW-based approach and with alternative solutions that consider only Wikipedia article contents or only category information.
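To make the limitation described in the abstract concrete, the following is a minimal Python sketch, not the authors' method. It builds BOW vectors, shows that two texts about related concepts score zero similarity when they share no literal words, and then shows how adding background terms can surface an association. The `BACKGROUND` dictionary and the `enrich` helper are hypothetical, hand-written stand-ins for external knowledge such as Wikipedia; the abstract does not specify how the paper's hybrid approach works.

```python
import math
import re
from collections import Counter


def bag_of_words(text):
    """Represent a text as unordered word counts (word order is discarded)."""
    return Counter(re.findall(r"[a-z]+", text.lower()))


def cosine(a, b):
    """Cosine similarity between two word-count vectors."""
    dot = sum(a[t] * b[t] for t in a.keys() & b.keys())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


# Hypothetical background knowledge: a toy stand-in for an external
# resource such as Wikipedia. These links are assumptions for illustration.
BACKGROUND = {
    "aspirin": ["drug", "pain"],
    "ibuprofen": ["drug", "pain"],
}


def enrich(bow, background):
    """Add related background terms to a BOW vector (illustrative only)."""
    enriched = Counter(bow)
    for term, count in bow.items():
        for related in background.get(term, []):
            enriched[related] += count
    return enriched


doc_a = bag_of_words("Aspirin reduced the fever")
doc_b = bag_of_words("Ibuprofen eased swelling")

# No literal word overlap, so the plain BOW similarity is zero.
plain_score = cosine(doc_a, doc_b)

# With background terms added, the two texts share "drug" and "pain".
enriched_score = cosine(enrich(doc_a, BACKGROUND), enrich(doc_b, BACKGROUND))

print(plain_score, enriched_score)
```

The sketch only illustrates why literal term matching misses related concepts; the weighting and knowledge sources in any real system would differ.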

Bibliographic details

Source: Industrial Conference on Data Mining, 2013, pp. 70-84 (15 pages)
Authors: Peng Yan; Wei Jin
Original format: PDF
Keywords: Knowledge Discovery; Semantic Relatedness; Cross-Document Knowledge Discovery; Document Representation

Similar literature

1. Semantic Annotation of Documents Based on Wikipedia Concepts [J]. Janez Brank, Gregor Leban, Marko Grobelnik. Informatica: An International Journal of Computing and Informatics. 2018, No. 1.
2. A Bag of Concepts Approach for Biomedical Document Classification Using Wikipedia Knowledge: Spanish-English Cross-language Case Study [J]. Mourino-Garcia Marcos A., Perez-Rodriguez Roberto, Anido-Rifon Luis E. Methods of Information in Medicine. 2017, No. 5.
3. Automatic Construction Method for Domain Concepts Based on Wikipedia Semantic Knowledge Base [J]. Qiaoyan Zhang, Min Lin, Shujun Zhang. Journal of Computer and Communications. 2017, No. 1.
4. Mining Semantic Relationships between Concepts across Documents Incorporating Wikipedia Knowledge [C]. Peng Yan, Wei Jin. Industrial Conference on Data Mining. 2013.
5. Mining semantic relationships between concepts across documents using Wikipedia knowledge [D]. Yan, Peng. 2013.
6. Experimental data for computing semantic similarity between concepts using multiple inheritances in Wikipedia category graph [O]. Muhammad Jawad Hussain, Shahbaz Hassan Wasti, Guangjian Huang. 2020.
7. Dragon Toolkit: Incorporating Auto-learned Semantic Knowledge into Large-Scale Text Retrieval and Mining [O]. Xiaohua Zhou, Xiaodan Zhang, Xiaohua Hu. 2007.
