Journal: Current Organic Synthesis

Direct Text Classifier for Thematic Arabic Discourse Documents

Abstract

Maintaining topical coherence while writing a discourse is a major challenge for novice and experienced writers alike. The challenge is greater for Arabic discourse because of the complex morphology of Arabic and its abundance of synonyms. In this research, we present a direct classification of Arabic discourse documents during writing. The proposed prescriptive framework consists of the following stages: data collection, pre-processing, construction of a Language Model (LM), topic identification, topic classification, and topic notification. To demonstrate the proposed framework, we designed a system and applied it to a corpus of 2800 Arabic discourse documents grouped into four predefined topics: Culture, Economy, Sport, and Religion. System performance was analysed in terms of accuracy, recall, precision, and F-measure. The results demonstrated that the proposed topic-modeling-based decision framework can classify topics while a discourse is being written, with an accuracy of 91.0%.
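The four evaluation measures named in the abstract can be illustrated with a minimal Python sketch. It is not the authors' evaluation code. The function name `evaluate`, the macro averaging over the four topics, and the toy labels are assumptions made for illustration only, since the abstract does not say how the per-topic scores were combined.

```python
from collections import Counter

# The four predefined topics named in the abstract.
TOPICS = ["Culture", "Economy", "Sport", "Religion"]


def evaluate(gold, predicted, labels=TOPICS):
    """Return accuracy and macro-averaged precision, recall, and F-measure.

    Macro averaging is an assumption; the abstract does not specify it.
    """
    if len(gold) != len(predicted) or not gold:
        raise ValueError("gold and predicted must be non-empty and equal in length")

    tp, fp, fn = Counter(), Counter(), Counter()
    for g, p in zip(gold, predicted):
        if g == p:
            tp[g] += 1   # correct prediction for topic g
        else:
            fp[p] += 1   # p was predicted but is wrong
            fn[g] += 1   # g was the true topic but was missed

    precisions, recalls, f_scores = [], [], []
    for label in labels:
        p_den = tp[label] + fp[label]
        r_den = tp[label] + fn[label]
        precision = tp[label] / p_den if p_den else 0.0
        recall = tp[label] / r_den if r_den else 0.0
        pr_sum = precision + recall
        f_score = 2 * precision * recall / pr_sum if pr_sum else 0.0
        precisions.append(precision)
        recalls.append(recall)
        f_scores.append(f_score)

    n = len(labels)
    return {
        "accuracy": sum(tp.values()) / len(gold),
        "precision": sum(precisions) / n,
        "recall": sum(recalls) / n,
        "f_measure": sum(f_scores) / n,
    }


if __name__ == "__main__":
    # Hypothetical toy labels, not data or results from the paper.
    gold = ["Culture", "Economy", "Sport", "Religion",
            "Sport", "Economy", "Culture", "Religion"]
    predicted = ["Culture", "Economy", "Sport", "Religion",
                 "Sport", "Culture", "Culture", "Religion"]
    print(evaluate(gold, predicted))
```

Accuracy is the share of documents assigned the correct topic, while precision, recall, and F-measure are computed per topic and then averaged, so a topic the classifier handles poorly lowers the macro scores even when overall accuracy stays high.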