Journal: Digital Investigation

Automatic categorization of Arabic articles based on their political orientation

Abstract

The ability to automatically determine the political orientation of an article can be of great benefit in many areas, from academia to security. However, this problem remains largely understudied for Arabic texts. This work makes two contributions. First, a corpus of articles and comments from different political orientations in the Arab world was collected and manually labeled, and several versions of it were produced. Second, the performance of various feature reduction methods and various classifiers was studied on these synthesized datasets. The two most popular feature extraction approaches for this problem were compared: the traditional Text Categorization (TC) approach and the Stylometric Features (SF) approach. Although the experimental results show that the TC approach outperforms the SF approach, they also indicate that the latter can be significantly improved by adding new and more discriminating features. The results further show that feature selection techniques generally reduce the accuracy of the considered classifiers under both the TC and SF approaches. The only exception is the Partition Membership (PM) technique, which has the opposite effect. The highest accuracies are obtained when the PM feature selection method is used with the Support Vector Machine (SVM) classifier. © 2018 Elsevier Ltd. All rights reserved.
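
To make the difference between the two feature extraction approaches concrete, the following is a minimal sketch in Python. It is not the authors' implementation: the abstract does not list the features used, so the four stylometric measurements, the function names `tc_features` and `sf_features`, and the English sample text are all illustrative assumptions. Arabic-specific preprocessing (normalization, diacritic handling, stemming), feature selection such as PM, and the SVM classifier are omitted.

```python
import re
from collections import Counter


def tc_features(text):
    """TC-style view (assumed): the article as a bag of lowercase word counts."""
    tokens = re.findall(r"\w+", text.lower())
    return Counter(tokens)


def sf_features(text):
    """SF-style view (assumed): a few content-independent style measurements.

    These four features are hypothetical examples of stylometric features,
    not the feature set used in the paper.
    """
    tokens = re.findall(r"\w+", text)
    # Split on Latin sentence punctuation and the Arabic question mark.
    sentences = [s for s in re.split(r"[.!?؟]+", text) if s.strip()]
    punctuation = re.findall(r"[^\w\s]", text)
    n_tokens = max(len(tokens), 1)  # avoid division by zero on empty text
    return {
        "avg_word_length": sum(len(t) for t in tokens) / n_tokens,
        "avg_sentence_length": len(tokens) / max(len(sentences), 1),
        "punctuation_per_word": len(punctuation) / n_tokens,
        "type_token_ratio": len({t.lower() for t in tokens}) / n_tokens,
    }


sample = "The vote was close. The vote was contested!"
tc = tc_features(sample)  # what the text says: one dimension per word
sf = sf_features(sample)  # how the text is written: a short fixed vector
```

The TC representation grows with the vocabulary, which is why feature reduction matters there, while the SF representation stays small and can only improve by adding more discriminating measurements, as the abstract suggests.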