Journal: Computer Science & Information Technology

A Topological Method for Comparing Document Semantics

Abstract

Comparing document semantics is one of the hardest tasks in both Natural Language Processing and Information Retrieval. To date, tools for this task remain rare, and most relevant methods are devised from a statistical or vector space model perspective, with almost none taking a topological perspective. In this paper, we take a different approach: we propose a novel algorithm based on topological persistence for comparing the semantic similarity of two documents. Our experiments are conducted on a document dataset annotated with human judges' results, and a collection of state-of-the-art methods is selected for comparison. The experimental results show that our algorithm produces results highly consistent with human judgments and beats most state-of-the-art methods, although it ties with NLTK.
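The abstract does not describe the algorithm itself, so the following is only a minimal sketch of what "topological persistence" can mean for a set of points, not the authors' method. It assumes each document has already been turned into a small point cloud (for example, one vector per word; that step is an assumption and is not shown). The function names `zero_dim_persistence` and `persistence_gap` are hypothetical, and the final comparison is a crude stand-in rather than a standard diagram distance such as the bottleneck distance.

```python
from itertools import combinations
from math import dist


def zero_dim_persistence(points):
    """Death times of connected components in a Vietoris-Rips filtration.

    Every point starts as its own component (born at scale 0). As the
    distance threshold grows, components merge; each merge "kills" one
    component at the length of the edge that caused it.
    """
    parent = list(range(len(points)))

    def find(i):
        # Union-find lookup with path halving.
        while parent[i] != i:
            parent[i] = parent[parent[i]]
            i = parent[i]
        return i

    # All pairwise edges, shortest first.
    edges = sorted(
        (dist(points[i], points[j]), i, j)
        for i, j in combinations(range(len(points)), 2)
    )
    deaths = []
    for length, i, j in edges:
        root_i, root_j = find(i), find(j)
        if root_i != root_j:
            parent[root_i] = root_j
            deaths.append(length)
    # n - 1 finite death times; the last surviving component never dies.
    return deaths


def persistence_gap(deaths_a, deaths_b):
    """Illustrative comparison (an assumption, not the bottleneck distance):
    pad the shorter list with zeros, sort both in descending order, and take
    the largest pairwise difference."""
    n = max(len(deaths_a), len(deaths_b))
    a = sorted(deaths_a, reverse=True) + [0.0] * (n - len(deaths_a))
    b = sorted(deaths_b, reverse=True) + [0.0] * (n - len(deaths_b))
    return max((abs(x - y) for x, y in zip(a, b)), default=0.0)


# Hypothetical toy point clouds standing in for two documents.
doc_a = [(0, 0), (1, 0), (3, 0)]
doc_b = [(0, 0), (0, 1)]
gap = persistence_gap(zero_dim_persistence(doc_a), zero_dim_persistence(doc_b))
```

A smaller gap here means the two point clouds merge into a single component at similar scales. A real pipeline would typically also use higher-dimensional features (loops, voids) and an established persistence library.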
