International Conference on Web Information Systems Engineering

A Dynamic-Static Approach of Model Fusion for Document Similarity Computation



Abstract

The semantic similarity of text document pairs supports valuable applications. Various basic models have been proposed for representing document content and computing document similarity, and each performs differently in different scenarios. Existing model selection or fusion approaches build improved models from these basic models at the granularity of the document collection. Such improved models are static across all document pairs and may suit only some of them. We propose a dynamic idea of model fusion, and an approach based on a Dynamic-Static Fusion Model (DSFM) that operates at the granularity of document pairs and is therefore dynamic for each pair. The dynamic module in DSFM learns to rank the basic models in order to predict the best basic model for a given document pair. We propose a model categorization method that constructs ideal model labels for document pairs, which serve as training targets for this dynamic module. The static module in DSFM is based on linear regression. We also propose a model selection method that selects appropriate candidate basic models for fusion and improves performance. Experiments on public document collections containing paragraph pairs and sentence pairs with human-rated similarity illustrate the effectiveness of our approach.
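
The contrast between collection-level and pair-level model choice can be illustrated with a minimal Python sketch. It is not the authors' implementation: the abstract does not specify how ideal model labels are constructed, so the sketch assumes the ideal label of a pair is the basic model whose score lies closest to the human rating. The function names, the two hypothetical basic models, and the toy scores are all illustrative assumptions. The learned ranker and the regression-based static module are not modeled here; the sketch only shows why a per-pair choice can beat a single static choice.

```python
from typing import List, Sequence


def ideal_model_labels(scores: Sequence[Sequence[float]],
                       ratings: Sequence[float]) -> List[int]:
    """Assumed labeling rule: for each document pair, return the index of the
    basic model whose similarity score is closest to the human rating."""
    labels = []
    for pair_scores, rating in zip(scores, ratings):
        errors = [abs(s - rating) for s in pair_scores]
        labels.append(errors.index(min(errors)))
    return labels


def mean_abs_error(predictions: Sequence[float],
                   ratings: Sequence[float]) -> float:
    """Mean absolute error between predicted and human-rated similarity."""
    return sum(abs(p - r) for p, r in zip(predictions, ratings)) / len(ratings)


def best_static_model(scores: Sequence[Sequence[float]],
                      ratings: Sequence[float]) -> int:
    """Collection-level selection: the single basic model with the lowest
    mean absolute error over all pairs."""
    n_models = len(scores[0])
    errors = [mean_abs_error([row[m] for row in scores], ratings)
              for m in range(n_models)]
    return errors.index(min(errors))


# Toy data (hypothetical): one row per document pair, one column per basic model.
scores = [
    [0.9, 0.5],
    [0.2, 0.6],
    [0.4, 0.1],
    [0.7, 0.3],
]
ratings = [0.8, 0.7, 0.3, 0.2]  # hypothetical human-rated similarity

labels = ideal_model_labels(scores, ratings)
static_idx = best_static_model(scores, ratings)

# Error when one model is used for every pair (static choice).
static_mae = mean_abs_error([row[static_idx] for row in scores], ratings)
# Error when each pair uses its ideal model (upper bound for a dynamic selector).
oracle_mae = mean_abs_error([row[k] for row, k in zip(scores, labels)], ratings)

print(labels, static_idx, static_mae, oracle_mae)
```

On this toy data the per-pair oracle has a lower error than the best single model, which is the gap a dynamic module would try to close by predicting the label from features of the document pair.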
