Journal: International Journal of Computer Science & Information Technology (IJCSIT)

Gender and Authorship Categorisation of Arabic Text from Twitter Using PPM


Abstract

In this paper we present gender and authorship categorisation using the Prediction by Partial Matching (PPM) compression scheme for text from Twitter written in Arabic. The PPMD variant of the compression scheme with different orders was used to perform the categorisation. We also applied different machine learning algorithms such as Multinomial Naïve Bayes (MNB), K-Nearest Neighbours (KNN), and an implementation of Support Vector Machine (LIBSVM), applying the same processing steps for all the algorithms. PPMD shows significantly better accuracy in comparison to all the other machine learning algorithms, with order 11 PPMD working best, achieving 90% and 96% accuracy for gender and authorship respectively.
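The abstract does not spell out how a compression scheme yields a category, so the following is only a minimal sketch of one common approach, not the authors' implementation: train one character model per class, then assign a text to the class whose model encodes it in the fewest bits. The class `StaticPPMD`, the function `classify`, the `alphabet_size` value, and the toy strings are all hypothetical names and data chosen for illustration. The model is a simplified static variant that uses the PPMD probability estimates (symbol: (2c - 1) / 2n, escape: t / 2n) but omits exclusions and adaptive updating.

```python
import math
from collections import defaultdict


class StaticPPMD:
    """Frozen order-k character model with PPMD-style estimates.

    Simplified for illustration: no exclusions, no updates while coding.
    """

    def __init__(self, order=2, alphabet_size=256):
        self.order = order
        # Assumed size of the symbol alphabet for the order -1 fallback.
        self.alphabet_size = alphabet_size
        # counts[k][context][symbol] -> frequency, for context lengths 0..order
        self.counts = [defaultdict(lambda: defaultdict(int))
                       for _ in range(order + 1)]

    def train(self, text):
        """Accumulate symbol counts for every context length up to `order`."""
        for i, sym in enumerate(text):
            for k in range(self.order + 1):
                if i >= k:
                    self.counts[k][text[i - k:i]][sym] += 1

    def code_length(self, text):
        """Return the number of bits needed to encode `text` with this model."""
        bits = 0.0
        for i, sym in enumerate(text):
            # Try the longest available context first, then back off.
            for k in range(min(self.order, i), -1, -1):
                ctx = self.counts[k].get(text[i - k:i])
                if not ctx:
                    continue  # context never seen: shorten it at no cost
                n = sum(ctx.values())
                c = ctx.get(sym, 0)
                if c:
                    bits -= math.log2((2 * c - 1) / (2 * n))
                    break
                # Symbol unseen here: pay for an escape (t distinct symbols).
                bits -= math.log2(len(ctx) / (2 * n))
            else:
                # Escaped from every order: uniform cost over the alphabet.
                bits += math.log2(self.alphabet_size)
        return bits


def classify(text, models):
    """Pick the label whose model gives the shortest code length."""
    return min(models, key=lambda label: models[label].code_length(text))


# Toy usage with placeholder strings (not data from the paper).
models = {"class_a": StaticPPMD(order=2), "class_b": StaticPPMD(order=2)}
models["class_a"].train("abababab")
models["class_b"].train("cdcdcdcd")
print(classify("abab", models))
```

The `order` parameter corresponds to the maximum context length; the abstract reports that order 11 worked best in the authors' experiments, whereas the toy call above uses a small order only because the sample strings are short.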