Conference: International Conference on Web Information Systems Engineering

An Effective TF/IDF-Based Text-to-Text Semantic Similarity Measure for Text Classification



Abstract

In recent years, the use of semantics in information retrieval tasks has become a vast field of research. In supervised text classification, the main interest of this work, semantics can be involved at different steps of text processing: the indexing step, the training step, and the class prediction step. At the class prediction step, new text-to-text semantic similarity measures can replace the classical similarity measures that some classification methods traditionally use for decision-making. In this paper we propose a new measure for assessing semantic similarity between texts, based on TF/IDF, with a new function that aggregates the pairwise semantic similarities between the concepts representing the compared text documents. Experimental results demonstrate that our measure outperforms other semantic and classical measures with significant improvements.
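The abstract does not state the aggregation function itself, so the following is only a minimal sketch of the general idea, not the paper's measure. It assumes a smoothed TF/IDF weighting, a hypothetical lookup table `SIM_TABLE` standing in for a real concept-to-concept similarity, and a swapped-in "best match" aggregation (each concept is paired with its most similar concept in the other text, weighted by TF/IDF, and the two directions are averaged). All function names and the toy data are assumptions made for illustration.

```python
import math
from collections import Counter


def tf_idf(docs):
    """Return one {concept: weight} dict per tokenized document.

    Uses a smoothed IDF, log((1 + n) / (1 + df)) + 1, so that concepts
    occurring in every document still get a non-zero weight (an assumption).
    """
    n = len(docs)
    df = Counter(c for doc in docs for c in set(doc))
    weights = []
    for doc in docs:
        tf = Counter(doc)
        weights.append({
            c: (count / len(doc)) * (math.log((1 + n) / (1 + df[c])) + 1)
            for c, count in tf.items()
        })
    return weights


def concept_sim(a, b, table):
    """Toy concept similarity: 1.0 for identical concepts, else a table lookup."""
    if a == b:
        return 1.0
    return table.get(frozenset((a, b)), 0.0)


def directed_sim(src, dst, table):
    """TF/IDF-weighted average of each source concept's best match in dst."""
    total = sum(src.values())
    if total == 0 or not dst:
        return 0.0
    matched = sum(
        w * max(concept_sim(c, d, table) for d in dst)
        for c, w in src.items()
    )
    return matched / total


def text_similarity(a, b, table):
    """Symmetric text-to-text similarity in [0, 1]."""
    return 0.5 * (directed_sim(a, b, table) + directed_sim(b, a, table))


# Hypothetical similarity scores; a real system would derive these
# from an ontology or thesaurus.
SIM_TABLE = {
    frozenset(("car", "automobile")): 0.9,
    frozenset(("engine", "motor")): 0.8,
}

DOCS = [
    ["car", "engine"],
    ["automobile", "motor"],
    ["banana", "fruit"],
]

WEIGHTS = tf_idf(DOCS)
```

With this toy data, `text_similarity(WEIGHTS[0], WEIGHTS[1], SIM_TABLE)` scores the two vehicle-related texts as similar even though they share no surface terms, while a purely term-based cosine over the same vectors would give zero. In a classifier that decides by nearest neighbours, such a function could stand in for the classical similarity at the class prediction step.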
