Foreign-language journal: Quality Control, Transactions

CAG: Stylometric Authorship Attribution of Multi-Author Documents Using a Co-Authorship Graph



Abstract

Stylometry has been successfully applied to perform authorship identification of single-author documents (AISD). The AISD task is concerned with identifying the original author of an anonymous document from a group of candidate authors. However, AISD techniques are not applicable to the authorship identification of multi-author documents (AIMD). Unlike AISD, where each document is written by one single author, AIMD focuses on handling multi-author documents. Due to the combinatoric nature of documents, AIMD lacks ground truth information (that is, information on writing and non-writing authors in a multi-author document), which makes this problem more challenging to solve. Previous AIMD solutions have a number of limitations: (i) the best stylometry-based AIMD solution has a low accuracy, less than 30%; (ii) increasing the number of co-authors of papers adversely affects the performance of AIMD solutions; and (iii) AIMD solutions were not designed to handle non-writing authors (NWAs). However, NWAs exist in real-world cases: there are papers for which not every co-author listed has contributed as a writer. This paper proposes an AIMD framework called the Co-Authorship Graph that can be used to (i) capture the stylistic information of each author in a corpus of multi-author documents and (ii) make a multi-label prediction for a multi-author query document. We conducted extensive experimental studies on one synthetic and three real-world corpora. Experimental results show that our proposed framework (i) significantly outperformed competitive techniques; (ii) can effectively handle a larger number of co-authors in comparison with competitive techniques; and (iii) can effectively handle NWAs in multi-author documents.
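The abstract does not say how the Co-Authorship Graph is built or how predictions are made, so the following is only a toy illustration of the two general ideas it names: a graph over co-authors, and a multi-label prediction for a query document. Everything in it is an assumption made for illustration and is not the paper's method: the corpus layout (a list of `(authors, text)` pairs), the character-trigram style profiles, the cosine-similarity scoring, the `threshold` parameter, and all function names are hypothetical.

```python
from collections import Counter, defaultdict
from itertools import combinations
import math


def build_coauthorship_graph(corpus):
    """Return edge weights {frozenset({a, b}): number of documents a and b share}.

    `corpus` is assumed to be a list of (authors, text) pairs.
    """
    edges = Counter()
    for authors, _text in corpus:
        for a, b in combinations(sorted(set(authors)), 2):
            edges[frozenset((a, b))] += 1
    return edges


def char_ngrams(text, n=3):
    """Count character n-grams, a common (assumed) stylometric feature."""
    text = text.lower()
    return Counter(text[i:i + n] for i in range(len(text) - n + 1))


def author_profiles(corpus):
    """Pool the n-gram counts of every document listing an author.

    This naive pooling credits each listed author with the whole text,
    so it cannot tell writing authors from non-writing authors.
    """
    profiles = defaultdict(Counter)
    for authors, text in corpus:
        grams = char_ngrams(text)
        for author in set(authors):
            profiles[author].update(grams)
    return profiles


def cosine(p, q):
    """Cosine similarity between two sparse count vectors."""
    dot = sum(p[k] * q[k] for k in p.keys() & q.keys())
    norm_p = math.sqrt(sum(v * v for v in p.values()))
    norm_q = math.sqrt(sum(v * v for v in q.values()))
    return dot / (norm_p * norm_q) if norm_p and norm_q else 0.0


def predict_authors(query_text, profiles, threshold=0.5):
    """Multi-label prediction: every author whose profile scores >= threshold."""
    query = char_ngrams(query_text)
    return sorted(a for a, p in profiles.items() if cosine(p, query) >= threshold)
```

A caller would build the graph and profiles once from the training corpus, then pass each query document to `predict_authors`. Because the result is a list rather than a single label, it can contain zero, one, or several candidate authors, which is what distinguishes the multi-label setting from single-author identification.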
