Journal: International Journal of Enterprise Network Management

A document similarity approach using grammatical linkages with graph databases


Abstract

Document similarity has become essential in many applications, such as document retrieval, recommendation systems, and plagiarism checking. Many similarity evaluation approaches rely on word-based document representations because they are very fast. However, these approaches are not accurate when the documents use different language and vocabulary. Graph-based document representations draw on relational knowledge, but they are not feasible in many applications because graph operations are expensive. In this work, a novel approach to document similarity computation that utilises verbal intent has been developed. It improves similarity by using verbs to increase the number of linkages between two documents. Graph databases were used for faster performance. The performance of the system is evaluated on different review datasets using metrics such as cosine similarity, Jaccard similarity, and the Dice coefficient. The verbal intent-based approach has registered promising results based on the links between two documents.
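The three evaluation metrics named in the abstract can be illustrated with a minimal sketch. It assumes each document has already been reduced to a set of verb terms; the abstract does not say how verbs are extracted or how linkages are stored in the graph database, so the sets `doc_a` and `doc_b` and the function names below are hypothetical and are not the authors' implementation.

```python
import math


def jaccard(a: set, b: set) -> float:
    """Jaccard similarity: |A ∩ B| / |A ∪ B|."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def dice(a: set, b: set) -> float:
    """Dice coefficient: 2|A ∩ B| / (|A| + |B|)."""
    total = len(a) + len(b)
    return 2 * len(a & b) / total if total else 0.0


def cosine(a: set, b: set) -> float:
    """Cosine similarity for binary (set-valued) vectors:
    |A ∩ B| / sqrt(|A| * |B|)."""
    denom = math.sqrt(len(a) * len(b))
    return len(a & b) / denom if denom else 0.0


# Hypothetical verb sets taken from two review documents (assumption).
doc_a = {"buy", "return", "recommend", "love"}
doc_b = {"buy", "recommend", "hate", "break"}

print(f"jaccard = {jaccard(doc_a, doc_b):.3f}")
print(f"dice    = {dice(doc_a, doc_b):.3f}")
print(f"cosine  = {cosine(doc_a, doc_b):.3f}")
```

All three scores grow with the number of shared verbs, which is consistent with the abstract's claim that adding verb-based linkages between two documents raises their measured similarity.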
