ACM Transactions on Asian Language Information Processing

Urdu Named Entity Recognition and Classification System Using Artificial Neural Network


Abstract

Named Entity Recognition and Classification (NERC) is the process of identifying named entities in text and classifying them into categories such as person names, location names, and organization names. In this article, we discuss the development of an Urdu Named Entity (NE) corpus, called the Kamran-PU-NE (KPU-NE) corpus, annotated for three entity types (Person, Organization, and Location), with all remaining tokens marked as Others (O). We use two supervised learning algorithms, Hidden Markov Model (HMM) and Artificial Neural Network (ANN), to develop the Urdu NERC system. The annotated corpus contains 652,852 tokens drawn from 15 different genres, with a total of 44,480 NEs. The inter-annotator agreement between the two annotators, measured by the kappa (κ) statistic, is 73.41%. With HMM, the highest recorded precision, recall, and F-measure are 55.98%, 83.11%, and 66.90%, respectively; with ANN, they are 81.05%, 87.54%, and 84.17%, respectively.
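
The reported F-measure values can be checked against the reported precision and recall. The sketch below assumes the F-measure is the balanced F1 score (the harmonic mean of precision and recall) and that each triple comes from a single configuration; the abstract states neither, and the function name `f_measure` is illustrative only.

```python
def f_measure(precision: float, recall: float) -> float:
    """Balanced F-measure (F1): harmonic mean of precision and recall.

    Inputs and output share the same units (here, percentages).
    Assumption: the abstract's "f-measure" is F1, i.e. beta = 1.
    """
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


# (precision, recall, reported F-measure) as given in the abstract, in percent.
reported = {
    "HMM": (55.98, 83.11, 66.90),
    "ANN": (81.05, 87.54, 84.17),
}

for model, (p, r, f_reported) in reported.items():
    f_computed = f_measure(p, r)
    print(f"{model}: computed F1 = {f_computed:.2f}%, reported = {f_reported:.2f}%")
```

Under these assumptions, both computed values agree with the reported figures to two decimal places.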
