Proceedings of the Twenty-Third International Joint Conference on Artificial Intelligence

Computing Text Semantic Relatedness using the Contents and Links of a Hypertext Encyclopedia:Extended Abstract



Abstract

We propose methods for computing semantic relatedness between words or texts by using knowledge from hypertext encyclopedias such as Wikipedia. A network of concepts is built by filtering the encyclopedia's articles, each concept corresponding to an article. A random walk model based on the notion of Visiting Probability (VP) is employed to compute the distance between nodes, and then between sets of nodes. To transfer learning from the network of concepts to text analysis tasks, we develop two common representation approaches. In the first approach, the shared representation space is the set of concepts in the network, and every text is represented in this space. In the second approach, a latent space is used as the shared representation, and a transformation from words to the latent space is trained over VP scores. We applied our methods to four important tasks in natural language processing: word similarity, document similarity, document clustering and classification, and ranking in information retrieval. The performance is state-of-the-art or close to it for each task, thus demonstrating the generality of the proposed knowledge resource and the associated methods.
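The Visiting Probability idea in the abstract can be illustrated with a small sketch: a random walk starts at a source concept, survives each step with some continuation probability, and VP is the probability it ever reaches the target concept. The toy graph, the stopping probability, and the truncation at a fixed number of steps below are all illustrative assumptions, not the authors' actual implementation over Wikipedia.

```python
import numpy as np

# Toy concept graph: adjacency among 4 hypothetical "articles".
adj = np.array([
    [0, 1, 1, 0],
    [1, 0, 1, 1],
    [1, 1, 0, 1],
    [0, 1, 1, 0],
], dtype=float)

# Row-normalize the adjacency matrix into a transition matrix.
P = adj / adj.sum(axis=1, keepdims=True)

def visiting_probability(P, source, target, stop=0.1, n_steps=500):
    """Estimate the probability that a random walk from `source`
    visits `target` before stopping (per-step stop prob. `stop`)."""
    n = P.shape[0]
    v = np.zeros(n)
    v[source] = 1.0          # all probability mass starts at the source
    vp = 0.0
    for _ in range(n_steps):
        step = (1 - stop) * (v @ P)  # one step, surviving w.p. (1 - stop)
        vp += step[target]           # mass arriving at the target is a
        step[target] = 0.0           # first visit; remove it from the walk
        v = step
    return vp

print(visiting_probability(P, 0, 3))
```

Because arriving mass is removed from circulation, each unit of probability is counted only on its first visit, which matches the "visit at least once" semantics; VP is asymmetric in general, which is why the paper computes distances between sets of nodes rather than relying on a symmetric edge weight.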


