Journal: Knowledge-Based Systems

A comparative study on feature reduction approaches in Hindi and Bengali named entity recognition


Abstract

Features used for named entity recognition (NER) are often high-dimensional. This causes overfitting when training data is insufficient, and dimensionality reduction leads to better performance in such situations. There are a number of approaches to dimensionality reduction, based on either feature selection or feature extraction. In this paper we perform a comprehensive comparative study of different dimensionality reduction approaches applied to the NER task. To compare the performance of the various approaches, we consider two Indian languages, namely Hindi and Bengali. NER accuracies achieved in these languages are still comparatively poor, primarily due to the scarcity of annotated corpora. For both languages, dimensionality reduction is found to improve the performance of the classifiers. A comparative study of the effectiveness of several dimensionality reduction techniques is presented in detail in this paper.
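To make the idea of feature selection concrete, the following is a minimal sketch that ranks binary token features by information gain and keeps the top k. It is an illustration only: the abstract does not say which selection criteria the paper evaluates, and the feature names (`in_gazetteer`, `suffix=pur`, and so on), the labels, and the toy data are all hypothetical.

```python
import math
from collections import Counter

def entropy(labels):
    """Shannon entropy (in bits) of a list of class labels."""
    total = len(labels)
    counts = Counter(labels)
    return -sum((c / total) * math.log2(c / total) for c in counts.values())

def information_gain(samples, labels, feature):
    """Entropy reduction from splitting tokens on presence of a binary feature."""
    with_f = [y for x, y in zip(samples, labels) if feature in x]
    without_f = [y for x, y in zip(samples, labels) if feature not in x]
    total = len(labels)
    # Weighted entropy of the two partitions (empty partitions contribute nothing).
    remainder = sum(len(part) / total * entropy(part)
                    for part in (with_f, without_f) if part)
    return entropy(labels) - remainder

def select_top_k(samples, labels, k):
    """Return the k features with the highest information gain."""
    vocabulary = sorted(set().union(*samples))
    ranked = sorted(vocabulary,
                    key=lambda f: information_gain(samples, labels, f),
                    reverse=True)
    return ranked[:k]

def project(samples, kept):
    """Drop every feature that was not selected."""
    kept = set(kept)
    return [x & kept for x in samples]

# Hypothetical toy data: each token is the set of binary features that fire on it.
samples = [
    {"in_gazetteer", "suffix=pur", "len>3"},
    {"in_gazetteer", "prev=shri", "len>3"},
    {"len>3"},
    {"suffix=e", "len>3"},
    {"in_gazetteer", "len>3"},
    {"suffix=e"},
]
labels = ["NE", "NE", "O", "O", "NE", "O"]  # named entity vs. other

kept = select_top_k(samples, labels, k=2)
reduced = project(samples, kept)
```

Here `kept` holds the two most informative features and `reduced` is the same data restricted to them, which is what a classifier would then be trained on. Feature extraction differs in that it builds new features as combinations of the original ones rather than keeping a subset.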
