Named Entity Recognition in Urdu: A Progress Report

Abstract

We are interested in properly identifying named entities in Hindi and Urdu text for natural language processing purposes, including information extraction. We explore two approaches to processing Hindi and Urdu text and designing named entity recognition algorithms. In the first approach, using the Unicode character set, we consider processing Hindi in Devanagari script and Urdu in Arabic script instead of transcribing the languages to a Roman-based script. In the second approach, we consider transcribing the Hindi and Urdu text to a common script, the International Phonetic Alphabet (IPA). As part of the project, we built an Urdu corpus marked up with Extensible Markup Language (XML). We consider both statistical and rule-based approaches to the named entity recognition algorithms.

Bibliographic record

Source
International Conference on Internet Computing IC'02, Vol. 3, Jun 24-27, 2002, Las Vegas, Nevada, USA | 2002 | pp. 757-761 | 5 pages
Conference location: Las Vegas, NV (US)
Authors
D. Becker; K. Riaz; B. Bennett; E. Davis; D. Panton;
Author affiliation

Program in Software Engineering, University of St. Thomas, St. Paul, MN, U.S.A.

Original format: PDF
Language: English
Chinese Library Classification: Automation technology; computer technology
Keywords
named entity recognition; information extraction; Unicode; Hindi; Urdu

Indexed: 2022-08-26 14:15:13

Similar literature

1. Urdu Named Entity Recognition: Corpus Generation and Deep Learning Applications [J]. Kanwal Safia, Malik Kamran, Shahzad Khurram. ACM Transactions on Asian Language Information Processing, 2020, No. 1.
2. Deep recurrent neural networks with word embeddings for Urdu named entity recognition [J]. Wahab Khan, Ali Daud, Fahd Alotaibi. ETRI Journal, 2020, No. 1.
3. Urdu Named Entity Recognition and Classification System Using Artificial Neural Network [J]. Muhammad Kamran Malik. ACM Transactions on Asian Language Information Processing, 2018, No. 1.
4. Named Entity Recognition in Urdu: A Progress Report [C]. D. Becker, K. Riaz, B. Bennett. International Conference on Internet Computing, 2002.
5. Improving Search via Named Entity Recognition in Morphologically Rich Languages: A Case Study in Urdu [D]. Riaz, Kashif H. 2018.
6. De-identifying Spanish medical texts - named entity recognition applied to radiology reports [O]. Irene Pérez-Díez, Raúl Pérez-Moraga, Adolfo López-Cerdán. 2021.
7. Named Entity Recognition and Named Entity Linking on Esports Contents [O]. Ziyu Liu, Yifan Leng, Meiqi Wang. 2020.
