Journal: Journal of computer sciences

Rule Based Shallow Parser for Arabic Language | Science Publications


Abstract

Problem statement: Shallow syntactic parsing is a language processing approach that computes a basic analysis of sentence structure rather than attempting a full syntactic analysis. It identifies the constituents of a sentence (noun groups, verb groups, prepositional groups) but specifies neither their internal structure nor their role in the main sentence. The only technique previously used for Arabic shallow parsing is a Support Vector Machine (SVM) based approach. The main problem faced by shallow parser developers is boundary identification, which must be accurate for the system to perform well.

Approach: The specific objective of the research was to identify the boundaries of all Noun Phrases (NPs), Verb Phrases (VPs) and Prepositional Phrases (PPs) in Arabic text. The study was based on a critical analysis of Arabic sentence structure and discussed various idiosyncrasies of Arabic sentences in order to derive more accurate rules for detecting the start and end boundaries of each clause in an Arabic sentence. New rules were proposed for the shallow parser, covering up to two levels of the full parse tree. The study described the implementation and evaluation of a rule-based shallow parser that handles chunking of Arabic sentences.

Results: The system was tested manually on 70 Arabic sentences comprising 1776 words, with sentence lengths between 4 and 50 words. The result obtained, an F-score of 97%, was significantly better than published state-of-the-art results for Arabic.

Conclusion: The main achievement is the development of an Arabic shallow parser based on a rule-based approach. Chunking, which constitutes the main contribution, is achieved in two successive stages that group sequences of adjacent words on the basis of linguistic properties.
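The abstract does not list the actual rules, so the following is only a minimal sketch of what two-stage rule-based chunking and F-score evaluation could look like. Everything in it is an assumption made for illustration: the coarse POS tags (`DET`, `NOUN`, `ADJ`, `VERB`, `AUX`, `PREP`), the two grouping rules, and the function names `chunk` and `f_score` are hypothetical and are not taken from the paper.

```python
from typing import List, Optional, Tuple

# A chunk is (label, start index, end index exclusive) over the token sequence.
Chunk = Tuple[str, int, int]

# Hypothetical coarse POS classes; NOT the tagset or rules used in the paper.
NOUN_LIKE = {"DET", "NOUN", "ADJ"}
VERB_LIKE = {"VERB", "AUX"}


def phrase_label(tag: str) -> Optional[str]:
    """Map a POS tag to the phrase type it can belong to, if any."""
    if tag in NOUN_LIKE:
        return "NP"
    if tag in VERB_LIKE:
        return "VP"
    return None


def chunk(tags: List[str]) -> List[Chunk]:
    """Two-stage chunking over a POS tag sequence (illustrative rules only)."""
    # Stage 1: group runs of adjacent noun-like or verb-like tags into NP / VP.
    # Any other tag becomes a single-token span labelled with the tag itself.
    base: List[Chunk] = []
    i = 0
    while i < len(tags):
        label = phrase_label(tags[i])
        j = i + 1
        if label is not None:
            while j < len(tags) and phrase_label(tags[j]) == label:
                j += 1
        base.append((label if label is not None else tags[i], i, j))
        i = j

    # Stage 2: a preposition immediately followed by an NP becomes one PP.
    merged: List[Chunk] = []
    k = 0
    while k < len(base):
        label, start, _ = base[k]
        if label == "PREP" and k + 1 < len(base) and base[k + 1][0] == "NP":
            merged.append(("PP", start, base[k + 1][2]))
            k += 2
        else:
            merged.append(base[k])
            k += 1
    return merged


def f_score(predicted: List[Chunk], gold: List[Chunk]) -> float:
    """Chunk-level F-score: a chunk counts only if label and both boundaries match."""
    correct = len(set(predicted) & set(gold))
    if correct == 0:
        return 0.0
    precision = correct / len(predicted)
    recall = correct / len(gold)
    return 2 * precision * recall / (precision + recall)
```

With the hypothetical tag sequence `["VERB", "DET", "NOUN", "PREP", "DET", "NOUN", "ADJ"]`, `chunk` returns `[("VP", 0, 1), ("NP", 1, 3), ("PP", 3, 7)]`. Because a chunk is scored as correct only when both of its boundaries match the gold annotation, a single misplaced boundary lowers precision and recall together, which is why boundary identification dominates the reported score.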
