An Improved Approach to Bengali Keyphrase Extraction


Abstract

This paper presents a new approach for automatically extracting key phrases from a Bengali document. The approach has two steps: (1) candidate key phrase identification based on shallow parsing, which uses lexical information and case markers, and (2) selection of the best candidates using a ranking method that combines statistical and linguistic features. The feature set includes term frequency, the position of the phrase's first occurrence, named entity information and lexical information. The proposed system has been tested on a collection of Bengali news documents. The experimental results show that it performs better than the existing approaches to which it is compared.
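
The ranking step described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract does not say how the features are combined, so the linear combination, the weights `w_tf`, `w_pos` and `w_ne`, the `Candidate` class and the `is_named_entity` flag are all assumptions made for illustration. Candidate identification by shallow parsing is not shown, the lexical-information feature is left out, and English tokens stand in for Bengali text.

```python
from dataclasses import dataclass


@dataclass
class Candidate:
    phrase: str
    # Hypothetical flag; a real system would obtain it from a named-entity tagger.
    is_named_entity: bool = False


def score(candidate, tokens, w_tf=1.0, w_pos=1.0, w_ne=0.5):
    """Combine term frequency, first-occurrence position and an NE flag.

    The weights and the linear form are illustrative assumptions.
    """
    words = candidate.phrase.split()
    n = len(words)
    if n == 0 or not tokens:
        return 0.0
    # Start offsets of every occurrence of the phrase in the token list.
    starts = [i for i in range(len(tokens) - n + 1) if tokens[i:i + n] == words]
    if not starts:
        return 0.0
    tf = len(starts) / len(tokens)              # normalized term frequency
    earliness = 1.0 - starts[0] / len(tokens)   # earlier first occurrence scores higher
    return w_tf * tf + w_pos * earliness + w_ne * float(candidate.is_named_entity)


def rank(candidates, tokens, top_k=5):
    """Return the top_k candidate phrases, best first."""
    ordered = sorted(candidates, key=lambda c: score(c, tokens), reverse=True)
    return [c.phrase for c in ordered[:top_k]]


tokens = "dhaka city flood relief reaches dhaka city after flood".split()
candidates = [
    Candidate("after flood"),
    Candidate("flood relief"),
    Candidate("dhaka city", is_named_entity=True),
]
top_two = rank(candidates, tokens, top_k=2)
```

Here `top_two` puts "dhaka city" first because it occurs twice, appears at the very start of the text and carries the named-entity flag. In practice the weights would have to be tuned on annotated documents.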
