A Hidden Markov Model Based Named Entity Recognition System: Bengali and Hindi as Case Studies

Abstract

Named Entity Recognition (NER) plays an important role in almost all Natural Language Processing (NLP) application areas, including information retrieval, machine translation, question answering, and automatic summarization. This paper reports the development of a statistical Hidden Markov Model (HMM) based NER system. The system was initially developed for Bengali using a tagged Bengali news corpus built from the web archive of a leading Bengali newspaper. It is trained on a corpus of 150,000 wordforms, initially tagged with an HMM-based part-of-speech (POS) tagger. A 10-fold cross-validation test yields average Recall, Precision, and F-Score values of 90.2%, 79.48%, and 84.5%, respectively. The same HMM-based NER system is then trained and tested on Hindi data to assess how well the approach carries over to another language. On 27,151 Hindi wordforms, a 10-fold cross-validation test yields average Recall, Precision, and F-Score values of 82.5%, 74.6%, and 78.35%, respectively.
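
The reported figures can be sanity-checked against the usual balanced F-score, the harmonic mean of precision and recall. The sketch below assumes that this is the F-score definition the authors used, which the abstract does not state; the function name `f_score` and the `reported` dictionary are illustrative only. Because the abstract gives averages over 10 folds, the F-score recomputed from the averaged precision and recall need not match the averaged F-score exactly, so close agreement is a plausibility check rather than a proof.

```python
def f_score(precision: float, recall: float) -> float:
    """Balanced F-score: harmonic mean of precision and recall (same units as inputs)."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


# Average values as reported in the abstract, in percent.
reported = {
    "Bengali": {"recall": 90.2, "precision": 79.48, "f_score": 84.5},
    "Hindi": {"recall": 82.5, "precision": 74.6, "f_score": 78.35},
}

for language, r in reported.items():
    computed = f_score(r["precision"], r["recall"])
    print(f"{language}: computed F = {computed:.2f}, reported F = {r['f_score']}")
```

For both languages the recomputed value agrees with the reported F-score to two decimal places, so the three numbers in each result are mutually consistent under this assumption.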

Bibliographic details

Source
Pattern Recognition and Machine Intelligence | 2007 | pp. 545-552 | 8 pages
Conference location: Kolkata (IN)
Authors
Asif Ekbal; Sivaji Bandyopadhyay
Author affiliation
Computer Science and Engineering Department, Jadavpur University, Kolkata, India (both authors)
Original format: PDF
Language: English
CLC classification: Computer networks
Keywords
named entity (NE); named entity recognition (NER); hidden Markov model (HMM); named entity recognition in Bengali

Similar literature

1. A comparative study on feature reduction approaches in Hindi and Bengali named entity recognition [J]. Sujan Kumar Saha, Pabitra Mitra, Sudeshna Sarkar. Knowledge-Based Systems, 2012.
2. A Conditional Random Field Approach for Named Entity Recognition in Bengali and Hindi [J]. Asif Ekbal, Sivaji Bandyopadhyay. Linguistic Issues in Language Technology, 2009, no. 1.
3. A deep neural network-based model for named entity recognition for Hindi language [J]. Sharma Richa, Morwal Sudha, Agarwal Basant. Neural Computing & Applications, 2020, no. 20.
4. Named Entity Recognition in Bengali Text Using Merged Hidden Markov Model and Rule Base Approach [C]. Mah Dian Drovo, Moithri Chowdhury, Saiful Islam Uday. International Conference on Smart Computing Communications, 2019.
5. Adaptive systems for hidden Markov model-based pattern recognition systems [D]. Cavalin, Paulo Rodrigo. 2011.
6. TaggerOne: joint named entity recognition and normalization with semi-Markov Models [O]. Robert Leaman, Zhiyong Lu.
7. Product Named Entity Recognition Based on Hierarchical Hidden Markov Model [O]. Feifan Liu, Jun Zhao, Bibo Lv. 2013.
