Conference: International conference on computer, information and telecommunication systems

Authorship attribution of ancient texts written by ten Arabic travelers using character N-Grams


Abstract

This paper investigates the authorship of old Arabic books written by ten ancient Arabic travelers. Several authorship attribution experiments are conducted on these Arabic texts using different features such as characters, character bigrams, character trigrams and character tetragrams. Four classifiers are employed: Stamatatos distance, Manhattan distance, Multi-Layer Perceptron (MLP) and Support Vector Machines (SVM). The experiments are evaluated on an Arabic dataset (called AAAT), which contains 3 short texts from every book. Results show good authorship attribution performance, with a best score of 90% correct attribution. The investigation also revealed interesting results concerning the Arabic language.
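To make the feature and classifier combination concrete, here is a minimal Python sketch of character n-gram profiles compared with the Manhattan distance. It is an illustration under stated assumptions, not the authors' implementation: the abstract does not say how frequencies are normalized or how reference profiles are built, so relative frequencies and one reference text per author are assumed here, and the function names `char_ngram_profile`, `manhattan_distance` and `attribute` are hypothetical.

```python
from collections import Counter


def char_ngram_profile(text, n):
    """Return relative frequencies of overlapping character n-grams.

    Assumption: frequencies are normalized by the total n-gram count;
    the abstract does not specify the normalization actually used.
    """
    counts = Counter(text[i:i + n] for i in range(len(text) - n + 1))
    total = sum(counts.values())
    if total == 0:
        return {}
    return {gram: count / total for gram, count in counts.items()}


def manhattan_distance(p, q):
    """Sum of absolute frequency differences over the union of n-grams."""
    grams = set(p) | set(q)
    return sum(abs(p.get(g, 0.0) - q.get(g, 0.0)) for g in grams)


def attribute(unknown_text, reference_texts, n=3):
    """Return the author whose reference profile is closest to the unknown text.

    reference_texts maps an author name to one reference text
    (a hypothetical layout chosen for this sketch).
    """
    unknown = char_ngram_profile(unknown_text, n)
    return min(
        reference_texts,
        key=lambda author: manhattan_distance(
            unknown, char_ngram_profile(reference_texts[author], n)
        ),
    )
```

Setting `n` to 1, 2, 3 or 4 gives the character, bigram, trigram and tetragram features named in the abstract. Python strings are Unicode, so the same slicing works on Arabic text without modification.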
