Source: The Arabian Journal for Science and Engineering

PARSING ARABIC TEXTS USING REAL PATTERNS OF SYNTACTIC TREES


Abstract

In order to parse Arabic texts, we have chosen to use a machine learning approach that learns from an Arabic Treebank. The knowledge enclosed in this Treebank is structured as patterns of syntactic trees. These patterns are representative models of the Arabic syntactic components. They are layered and rich both structurally and contextually, and they serve as an informational source for guiding the parsing process. Our parser is progressive: it processes a sentence in a number of stages equal to the number of its words. At every step, the parser assigns to the target word the most likely patterns that represent it in the context where it occurs. Then, it joins the selected patterns with those collected in the previous parsing steps in order to construct the representative syntactic tree(s) of the whole sentence. If more than one tree is proposed, all the analysis trees are sorted according to their frequencies of appearance in the Treebank. The preliminary tests have yielded an accuracy of 84.8% and an F-score of 77.5%.
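The word-by-word procedure described in the abstract can be illustrated with a small sketch. This is not the authors' implementation: the abstract does not specify how patterns are represented or scored, so everything below is an assumption made for illustration. A pattern is reduced to a plain string label (a real pattern would be a layered subtree), the context is reduced to the previous word, the toy treebank and its transliterated words are invented, and candidate analyses are ranked by the sum of their pattern frequencies. The names `TOY_TREEBANK`, `learn_patterns`, and `parse` are hypothetical.

```python
from collections import Counter

# Hypothetical toy "treebank": (context, word, pattern_label) observations.
# Assumption: the context is just the previous word, and a string label
# stands in for what would really be a layered syntactic subtree.
TOY_TREEBANK = [
    ("<s>", "kataba", "VP-head"),
    ("<s>", "kataba", "VP-head"),
    ("<s>", "kataba", "NP-head"),
    ("kataba", "alwaladu", "NP-subj"),
    ("kataba", "alwaladu", "NP-subj"),
    ("kataba", "alwaladu", "NP-obj"),
    ("alwaladu", "risalatan", "NP-obj"),
]


def learn_patterns(treebank):
    """Count how often each pattern occurs for a (context, word) pair."""
    counts = Counter(treebank)
    table = {}
    for (context, word, pattern), freq in counts.items():
        table.setdefault((context, word), []).append((pattern, freq))
    return table


def parse(words, table, top_k=2):
    """Process one word per step and return analyses ranked by frequency.

    Each analysis is a (patterns, score) pair, where score is the sum of
    the treebank frequencies of the selected patterns (an assumed scoring).
    """
    analyses = [((), 0)]  # start with one empty partial analysis
    context = "<s>"
    for word in words:  # exactly one parsing step per word
        candidates = sorted(
            table.get((context, word), []), key=lambda c: -c[1]
        )[:top_k]
        if not candidates:
            return []  # no known pattern for this word in this context
        # Join each selected pattern with every partial analysis so far.
        analyses = [
            (patterns + (pattern,), score + freq)
            for patterns, score in analyses
            for pattern, freq in candidates
        ]
        context = word
    # If several analyses survive, sort them by descending frequency score.
    return sorted(analyses, key=lambda a: -a[1])


if __name__ == "__main__":
    table = learn_patterns(TOY_TREEBANK)
    for patterns, score in parse(["kataba", "alwaladu", "risalatan"], table):
        print(score, patterns)
```

Keeping every combination of candidate patterns mirrors the abstract's statement that several trees may be proposed and then sorted. The `top_k` cut-off is an added assumption that keeps the number of partial analyses from growing too quickly.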
