Journal: ACM Transactions on Asian Language Information Processing

On the Usage of a Classical Arabic Corpus as a Language Resource: Related Research and Key Challenges


Abstract

This article presents a literature review of computer science research applied to hadith, a kind of Arabic narration that appeared in the 7th century. We study and compare existing work in several fields of Natural Language Processing (NLP), Information Retrieval (IR), and Knowledge Extraction (KE). We thereby elicit their main drawbacks and identify some perspectives that the research community may consider. We also study the characteristics of this type of document by enumerating the advantages and limits of using hadith as a language resource. Moreover, our study shows that previous studies used different collections of hadiths, which makes it hard to compare their results objectively. In addition, many preprocessing steps recur across these applications, which wastes a lot of time. Consequently, we discuss the key issues in building generic language resources from hadiths, taking into account the relevance of the related literature and the wide community of researchers interested in these narrations. The ultimate goal is to structure hadith books for multiple usages, thus building common collections that may be exploited in future applications.
