
A semi-explicit short text retrieval method combining Wikipedia features



Abstract

Owing to their openness, interactivity, immediacy, and simplicity, short texts now appear in large numbers in the Web information space. Because short texts are brief, carry little information, have sparse features, and often use irregular grammar, traditional information analysis and retrieval techniques cannot handle them effectively. To address these problems, this paper proposes a new short text retrieval method based on Wikipedia, the current mainstream semantic knowledge source. Specifically, a semantic feature selection algorithm is proposed that returns the top k most relevant Wikipedia concepts as the whole vector space for a given short text. Then, by analyzing the topic information of the semantic features contained in the Wikipedia concepts, we propose formulas to determine the list of association coefficients between the components at corresponding positions in two different feature vectors. On this basis, a new semantic relatedness assessment method is designed for this lower-dimensional semantic space. By computing and sorting the semantic relatedness between user queries and the target short texts, we obtain a novel semi-explicit short text retrieval method that combines Wikipedia concept features with the corresponding topic information. Finally, experimental results on Twitter subsets verify that our proposal has advantages over some other current retrieval methods on MAP, P@k, and R-Prec, and can return more valid results.
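The abstract names three evaluation measures (MAP, P@k, and R-Prec) without defining them. As a reading aid, the sketch below computes them using their standard information retrieval definitions for binary relevance. The function names and the toy tweet identifiers are illustrative assumptions; they do not come from the paper, and the paper's actual evaluation setup may differ.

```python
from typing import Iterable, Sequence, Set, Tuple


def precision_at_k(ranked: Sequence[str], relevant: Set[str], k: int) -> float:
    """P@k: fraction of the top-k ranked items that are relevant."""
    if k <= 0:
        raise ValueError("k must be positive")
    hits = sum(1 for doc in ranked[:k] if doc in relevant)
    return hits / k


def r_precision(ranked: Sequence[str], relevant: Set[str]) -> float:
    """R-Prec: precision at rank R, where R is the number of relevant items."""
    if not relevant:
        return 0.0
    return precision_at_k(ranked, relevant, len(relevant))


def average_precision(ranked: Sequence[str], relevant: Set[str]) -> float:
    """AP: mean of the precision values at each rank holding a relevant item.

    Relevant items that are never retrieved contribute zero.
    """
    if not relevant:
        return 0.0
    hits = 0
    total = 0.0
    for rank, doc in enumerate(ranked, start=1):
        if doc in relevant:
            hits += 1
            total += hits / rank
    return total / len(relevant)


def mean_average_precision(
    runs: Iterable[Tuple[Sequence[str], Set[str]]]
) -> float:
    """MAP: average of AP over all queries, one (ranking, relevant set) each."""
    runs = list(runs)
    if not runs:
        return 0.0
    return sum(average_precision(r, rel) for r, rel in runs) / len(runs)


if __name__ == "__main__":
    # Hypothetical ranking for one query; "t5" is relevant but not retrieved.
    ranked = ["t3", "t1", "t7", "t2", "t9"]
    relevant = {"t1", "t2", "t5"}
    print(precision_at_k(ranked, relevant, 3))
    print(r_precision(ranked, relevant))
    print(mean_average_precision([(ranked, relevant)]))
```

All three measures reward rankings that place relevant short texts near the top. MAP is sensitive to the whole ranking, while P@k and R-Prec look only at a fixed prefix.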

Bibliographic record

  • Source
    Engineering Applications of Artificial Intelligence | 2020, Issue 9 | 103809.1-103809.12 | 12 pages
  • Author affiliations

    Software Engineering College, Zhengzhou University of Light Industry, Zhengzhou 450000, China;

    Software Engineering College, Zhengzhou University of Light Industry, Zhengzhou 450000, China;

    Software Engineering College, Zhengzhou University of Light Industry, Zhengzhou 450000, China;

    Software Engineering College, Zhengzhou University of Light Industry, Zhengzhou 450000, China;

    School of Computer Science, South China Normal University, Guangzhou 510631, China;

    School of Computer Science, South China Normal University, Guangzhou 510631, China;

  • Indexing information
  • Original format: PDF
  • Text language: English
  • Chinese Library Classification
  • Keywords

    Semi-explicit semantic; Feature selection; Short text retrieval; Wikipedia;

