Journal: ACM Transactions on Asian Language Information Processing

RENAR: A Rule-Based Arabic Named Entity Recognition System


Abstract

Named entity recognition has served many natural language processing tasks such as information retrieval, machine translation, and question answering systems. Many researchers have addressed the name identification issue in a variety of languages and recently some research efforts have started to focus on named entity recognition for the Arabic language. We present a working Arabic information extraction (IE) system that is used to analyze large volumes of news texts every day to extract the named entity (NE) types person, organization, location, date, and number, as well as quotations (direct reported speech) by and about people. The named entity recognition (NER) system was not developed for Arabic, but instead a multilingual NER system was adapted to also cover Arabic. The Semitic language Arabic substantially differs from the Indo-European and Finno-Ugric languages currently covered. This article thus describes what Arabic language-specific resources had to be developed and what changes needed to be made to the rule set in order to be applicable to the Arabic language. The achieved evaluation results are generally satisfactory, but could be improved for certain entity types.
