Published in: International Conference on Web Information Systems Engineering

An Effective TF/IDF-Based Text-to-Text Semantic Similarity Measure for Text Classification

Abstract

In recent years, the use of semantics in information retrieval tasks has become a vast field of research. In supervised text classification, which is the main interest of this work, semantics can be involved at different steps of text processing: the indexing step, the training step, and the class prediction step. At the class prediction step, new text-to-text semantic similarity measures can replace the classical similarity measures that some classification methods traditionally use for decision-making. In this paper, we propose a new TF/IDF-based measure for assessing semantic similarity between texts, with a new function that aggregates the pairwise semantic similarities between the concepts representing the compared text documents. Experimental results show that our measure significantly outperforms other semantic and classical measures.
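The abstract does not state the measure's formula, so the following is only a minimal sketch of the general idea it describes: weight each concept by TF/IDF, then aggregate concept-to-concept similarities over all pairs. It is not the authors' aggregation function. The normalization used here is the well-known soft cosine form, swapped in as a stand-in. The names `TOY_CONCEPT_SIM`, `concept_sim`, `tf_idf_weights`, `pairwise_sum`, and `semantic_similarity`, the toy similarity values, and the smoothed IDF `log(1 + N/df)` are all assumptions made for illustration.

```python
import math
from collections import Counter

# Hypothetical concept-to-concept similarities in [0, 1]. A real system would
# derive these from an ontology or thesaurus; the values here are made up.
TOY_CONCEPT_SIM = {
    frozenset(["car", "automobile"]): 0.9,
    frozenset(["car", "truck"]): 0.6,
    frozenset(["automobile", "truck"]): 0.6,
}


def concept_sim(a, b):
    """Symmetric concept similarity: 1.0 for identical concepts, table lookup otherwise."""
    if a == b:
        return 1.0
    return TOY_CONCEPT_SIM.get(frozenset([a, b]), 0.0)


def tf_idf_weights(docs):
    """One {concept: weight} dict per document, with weight = tf * log(1 + N/df).

    The smoothed IDF is an assumption; it keeps weights positive for concepts
    that occur in every document.
    """
    n = len(docs)
    df = Counter(c for doc in docs for c in set(doc))
    return [
        {c: tf * math.log(1 + n / df[c]) for c, tf in Counter(doc).items()}
        for doc in docs
    ]


def pairwise_sum(w1, w2, sim):
    """Aggregate sim(a, b) over every concept pair, weighted by both TF/IDF weights."""
    return sum(x * y * sim(a, b) for a, x in w1.items() for b, y in w2.items())


def semantic_similarity(w1, w2, sim=concept_sim):
    """Soft-cosine-style text-to-text similarity; returns 0.0 if either text is empty."""
    denom = math.sqrt(pairwise_sum(w1, w1, sim) * pairwise_sum(w2, w2, sim))
    return pairwise_sum(w1, w2, sim) / denom if denom else 0.0


# Each document is represented as a list of concepts (toy data).
docs = [
    ["car", "engine", "road"],
    ["automobile", "engine", "highway"],
    ["banana", "fruit", "market"],
]
w = tf_idf_weights(docs)


def exact(a, b):
    """Exact-match similarity, which reduces the measure to classical cosine."""
    return 1.0 if a == b else 0.0


print(semantic_similarity(w[0], w[1]))         # semantic: credits car ~ automobile
print(semantic_similarity(w[0], w[1], exact))  # classical cosine: exact matches only
print(semantic_similarity(w[0], w[2]))         # texts with no related concepts
```

With the exact-match similarity, the score reduces to the classical TF/IDF cosine, which only rewards the shared concept `engine`. With the toy concept table, the first two documents score higher because `car` and `automobile` are treated as related. This is the kind of substitution at the class prediction step that the abstract describes.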