Conference: International Conference on Fuzzy Systems and Knowledge Discovery

A Hybrid Statistical Language Model Applied to the Domain Specific Information Retrieval

Abstract

Traditional language models take a multi-topic document corpus as their research target. To avoid the interference caused by the multi-topic problem, this paper focuses on domain-specific Information Retrieval (IR). In domain-specific IR, different terms are considered to contribute to different degrees to the final query result, so the terms in a document can be divided into categories according to their degree of contribution. The statistical information of a term, mainly its probabilities, is computed with different methods and smoothing strategies according to its category. This paper proposes an improved hybrid statistical language model for domain-specific IR. In the experiments, the new model shows a performance improvement of about 9% to 10%. Finally, some challenges and research directions for statistical language modeling are presented.
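
The idea of computing term probabilities with a smoothing strategy chosen by term category can be illustrated with a minimal sketch. The abstract does not name the categories or the smoothing methods, so everything below is an assumption made for illustration: two categories (domain terms and general terms), Jelinek-Mercer smoothing for the first, Dirichlet-prior smoothing for the second, and the names `score`, `domain_terms`, `lam`, and `mu` are hypothetical. It is not the paper's model.

```python
import math
from collections import Counter


def jelinek_mercer(tf, doc_len, p_coll, lam=0.7):
    """Interpolate the document estimate with the collection estimate."""
    p_doc = tf / doc_len if doc_len else 0.0
    return lam * p_doc + (1.0 - lam) * p_coll


def dirichlet(tf, doc_len, p_coll, mu=2000.0):
    """Dirichlet-prior smoothing with pseudo-count mass mu."""
    return (tf + mu * p_coll) / (doc_len + mu)


def score(query, doc, coll_counts, coll_len, domain_terms):
    """Query log-likelihood of `doc`, smoothing each term by its category.

    Assumption: terms in `domain_terms` use Jelinek-Mercer smoothing and
    all other terms use Dirichlet smoothing. The source abstract does not
    specify this split.
    """
    tf = Counter(doc)
    doc_len = len(doc)
    vocab_size = len(coll_counts)
    log_p = 0.0
    for term in query:
        # Add-one collection estimate keeps unseen terms at nonzero probability.
        p_coll = (coll_counts.get(term, 0) + 1) / (coll_len + vocab_size)
        if term in domain_terms:
            p = jelinek_mercer(tf[term], doc_len, p_coll)
        else:
            p = dirichlet(tf[term], doc_len, p_coll)
        log_p += math.log(p)
    return log_p


# Toy data, invented for this example only.
d1 = ["fuzzy", "retrieval", "model", "the"]
d2 = ["the", "weather", "is", "nice"]
coll_counts = Counter(d1 + d2)
coll_len = len(d1) + len(d2)
domain_terms = {"fuzzy", "retrieval"}
query = ["fuzzy", "the"]

s1 = score(query, d1, coll_counts, coll_len, domain_terms)
s2 = score(query, d2, coll_counts, coll_len, domain_terms)
print(s1 > s2)
```

In this toy setup the document that contains the domain term ranks above the one that does not, while the general term "the" contributes the same amount to both scores.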