
Capturing the Common Syntactical Rules for the Holy Quran: A Data Mining Approach

Abstract

This paper presents a novel approach to capturing the common syntactical rules of the Holy Quran. By syntactical rules, we mean the relationships between word tags that occur most frequently in the Quran. Arabic, like other languages, has a number of tags, including nouns, verbs, and pronouns, each with a number of sub-types. In this paper we use a data mining approach to extract the common syntactical rules, which can then be offered to natural language processing applications. The Stanford part-of-speech tagger (29 tags) is used to tag the Quran's words. Then the data mining tool WEKA (PredictiveApriori algorithm) is used to find the most common syntactical rules. The extracted rules do not require the tagged words to be adjacent; that is, they can capture long-distance relations. The most common syntactical rule found is: tag1=RP tag2=NN tag3=WP 91 ==> tag4=VBD 90 acc:(0.97912), which can be seen in the Quran sentence: ?? ???? ????? ????? . This phrase ?? ???? ????? ???? (which is part of an ayah) appears in 89 ayahs in 20 different surahs. The study used the Mushaf Al-Madinah Al-Munawwarah (published by the King Fahd Complex for Printing the Holy Quran).
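To make the rule format above concrete, the following is a minimal sketch of how rules of the form "tag1, tag2, tag3 ==> tag4" could be counted over tagged text. It is not the authors' pipeline: the function name `mine_tag_rules`, the sliding four-tag window, the `min_support` threshold, and the toy tag sequences are all assumptions made for illustration. The abstract does not say how instances were built for WEKA, and this sketch only covers contiguous windows, not the long-distance case. It also ranks rules by plain confidence (hits divided by antecedent count), whereas PredictiveApriori ranks by an estimated predictive accuracy, which is presumably why the reported acc of 0.97912 is lower than 90/91 ≈ 0.989.

```python
from collections import Counter

def mine_tag_rules(tag_sequences, window=4, min_support=2):
    """Count rules 'first (window-1) tags ==> last tag' over sliding windows.

    Hypothetical helper for illustration; returns tuples of
    (antecedent, consequent, antecedent_count, rule_count, confidence),
    sorted by confidence and then by antecedent count, both descending.
    """
    antecedent_counts = Counter()
    rule_counts = Counter()
    for tags in tag_sequences:
        # Slide a fixed-size window over each tagged sentence.
        for i in range(len(tags) - window + 1):
            *antecedent, consequent = tags[i:i + window]
            antecedent = tuple(antecedent)
            antecedent_counts[antecedent] += 1
            rule_counts[(antecedent, consequent)] += 1

    rules = []
    for (antecedent, consequent), hits in rule_counts.items():
        total = antecedent_counts[antecedent]
        if total >= min_support:
            # Plain confidence, not PredictiveApriori's accuracy estimate.
            rules.append((antecedent, consequent, total, hits, hits / total))
    rules.sort(key=lambda r: (-r[4], -r[2]))
    return rules

# Toy tag sequences (invented, not taken from the Quran corpus).
toy_sequences = [
    ["RP", "NN", "WP", "VBD", "NN"],
    ["RP", "NN", "WP", "VBD"],
    ["RP", "NN", "WP", "VBP"],
    ["NN", "VBD", "NN"],
]

for antecedent, consequent, total, hits, conf in mine_tag_rules(toy_sequences):
    lhs = " ".join(f"tag{i + 1}={t}" for i, t in enumerate(antecedent))
    print(f"{lhs} {total} ==> tag{len(antecedent) + 1}={consequent} {hits} conf:({conf:.5f})")
```

Read against the reported rule, the two integers play the same roles as `total` and `hits` here: the antecedent RP, NN, WP was seen 91 times, and in 90 of those cases the fourth tag was VBD.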
