2015 First International Conference on Arabic Computational Linguistics

A Named Entities Recognition System for Modern Standard Arabic using Rule-Based Approach

Abstract

Named Entity Recognition (NER) is a task in Information Extraction (IE) that has become very important for Natural Language Processing (NLP). In this paper, we design a system that improves named entity recognition for Arabic by extracting both Arabic nouns and entities. The noun extraction component is based on Arabic morphology and Arabic grammar rules, many of which have not been used before. It uses no gazetteers and is combined with an entity extraction component that depends on gazetteers. The system extracts nouns according to Arabic morphology and classifies them into proper noun, title, currency, percentage, country, city, nationality, number, place, date, and time entities. It applies algorithms to generate nationality entities from country entities, and it uses regular expressions (RE) to extract numbers in digit format. The system does not require the text to be normalized before extraction. It was tested on an open-text corpus in Modern Standard Arabic (MSA) and achieves an average recall of 85%.
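The abstract does not give the actual rules, so the following is only a minimal Python sketch of the two rule-based steps it names: extracting numbers in digit format with a regular expression, and deriving a nationality form from a country name. The names `NUMBER_RE`, `extract_numbers`, and `nationality_from_country` are hypothetical. The suffix rules are a simplified version of Arabic nisba formation assumed for illustration, not the authors' algorithm, and they ignore the many irregular cases in real Arabic.

```python
import re

# Assumption: "numbers in digit format" covers Western digits (0-9) and
# Arabic-Indic digits, optionally grouped by '.', ',' or the Arabic
# decimal (U+066B) and thousands (U+066C) separators.
# In Python str patterns, \d matches any Unicode decimal digit.
NUMBER_RE = re.compile(r"\d+(?:[.,\u066B\u066C]\d+)*")


def extract_numbers(text):
    """Return every digit-format number found in the text, in order."""
    return NUMBER_RE.findall(text)


def nationality_from_country(country):
    """Derive a masculine nationality form from a country name.

    Hypothetical simplification: drop the definite article, drop one
    common ending, then append the nisba suffix.
    """
    stem = country
    if stem.startswith("ال"):
        stem = stem[2:]
    # Longer endings are checked first so that only one of them is removed.
    for ending in ("ية", "يا", "ا", "ة"):
        if stem.endswith(ending):
            stem = stem[: -len(ending)]
            break
    return stem + "ي"


if __name__ == "__main__":
    print(extract_numbers("السعر 250 دولار و 3.5 مليون"))
    for name in ("مصر", "العراق", "فرنسا", "سوريا", "السعودية"):
        print(name, nationality_from_country(name))
```

A production system would probably pair rules like these with an exception list, since a purely suffix-based rule cannot cover irregular nationality forms.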
