CAG: Stylometric Authorship Attribution of Multi-Author Documents Using a Co-Authorship Graph

Sarwar Raheem; Urailertprasert Norawit; Vannaboot Nattapol; Yu Chenyun; Rakthanmanon Thanawin; Chuangsuwanich Ekapol; Nutanong Sarana


Abstract

Stylometry has been successfully applied to perform authorship identification of single-author documents (AISD). The AISD task is concerned with identifying the original author of an anonymous document from a group of candidate authors. However, AISD techniques are not applicable to the authorship identification of multi-author documents (AIMD). Unlike AISD, where each document is written by one single author, AIMD focuses on handling multi-author documents. Due to the combinatoric nature of documents, AIMD lacks the ground truth information (that is, information on writing and non-writing authors in a multi-author document), which makes this problem more challenging to solve. Previous AIMD solutions have a number of limitations: (i) the best stylometry-based AIMD solution has a low accuracy, less than 30%; (ii) increasing the number of co-authors of papers adversely affects the performance of AIMD solutions; and (iii) AIMD solutions were not designed to handle the non-writing authors (NWAs). However, NWAs exist in real-world cases; that is, there are papers for which not every co-author listed has contributed as a writer. This paper proposes an AIMD framework called the Co-Authorship Graph that can be used to (i) capture the stylistic information of each author in a corpus of multi-author documents and (ii) make a multi-label prediction for a multi-author query document. We conducted extensive experimental studies on one synthetic and three real-world corpora. Experimental results show that our proposed framework (i) significantly outperformed competitive techniques; (ii) can effectively handle a larger number of co-authors in comparison with competitive techniques; and (iii) can effectively handle NWAs in multi-author documents.
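To make the AIMD task setting concrete, here is a minimal sketch of a multi-label authorship prediction over a toy corpus. It is not the Co-Authorship Graph framework described in the abstract, whose construction is not given on this page. It is a naive baseline assumed purely for illustration: character trigram profiles, cosine similarity, and a top-k cutoff. All function names, the toy corpus, and the parameter `k` are hypothetical.

```python
from collections import Counter
from math import sqrt


def char_ngrams(text, n=3):
    """Count character n-grams, a common stylometric feature (assumed here)."""
    text = text.lower()
    return Counter(text[i:i + n] for i in range(len(text) - n + 1))


def cosine(a, b):
    """Cosine similarity between two n-gram count vectors."""
    dot = sum(a[g] * b[g] for g in a.keys() & b.keys())
    norm = sqrt(sum(v * v for v in a.values())) * sqrt(sum(v * v for v in b.values()))
    return dot / norm if norm else 0.0


def build_author_profiles(corpus):
    """Aggregate n-gram counts per author over every document listing them.

    Each multi-author document is credited in full to every listed author,
    because the corpus does not say who actually wrote which part. This is
    the missing-ground-truth problem the abstract points out.
    """
    profiles = {}
    for text, authors in corpus:
        grams = char_ngrams(text)
        for author in authors:
            profiles.setdefault(author, Counter()).update(grams)
    return profiles


def predict_authors(query, profiles, k=2):
    """Multi-label prediction: return the k candidates most similar to the query."""
    q = char_ngrams(query)
    ranked = sorted(profiles, key=lambda a: cosine(q, profiles[a]), reverse=True)
    return set(ranked[:k])


# Hypothetical toy corpus: (document text, set of listed authors).
corpus = [
    ("the graph model captures style", {"alice", "bob"}),
    ("the graph model ranks authors", {"alice"}),
    ("zebra quartz jumps vividly", {"carol"}),
]
profiles = build_author_profiles(corpus)
print(sorted(predict_authors("the graph model", profiles, k=2)))  # ['alice', 'bob']
```

In this baseline a listed author who wrote nothing still inherits the full style profile of the shared document, which illustrates why non-writing authors are hard to handle without a more careful model.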


Bibliographic details

Source
Quality Control, Transactions, 2020, issue 2020, pp. 18374-18393 (20 pages)
Authors
Sarwar Raheem; Urailertprasert Norawit; Vannaboot Nattapol; Yu Chenyun; Rakthanmanon Thanawin; Chuangsuwanich Ekapol; Nutanong Sarana
Author affiliations

Vidyasirimedhi Inst Sci & Technol Sch Informat Sci & Technol Rayong 21210 Thailand;

Vidyasirimedhi Inst Sci & Technol Sch Informat Sci & Technol Rayong 21210 Thailand;

Vidyasirimedhi Inst Sci & Technol Sch Informat Sci & Technol Rayong 21210 Thailand;

Natl Univ Singapore Dept Comp Sci Singapore 119077 Singapore;

Vidyasirimedhi Inst Sci & Technol Sch Informat Sci & Technol Rayong 21210 Thailand|Kasetsart Univ Dept Comp Engn Bangkok 10900 Thailand;

Chulalongkorn Univ Dept Comp Engn Fac Engn Bangkok 10330 Thailand;

Vidyasirimedhi Inst Sci & Technol Sch Informat Sci & Technol Rayong 21210 Thailand;

Original format: PDF
Language: English
Keywords
Set similarity search; multi-author documents; co-authorship graph; authorship identification; stylometry; scientometrics

Date added to database: 2022-08-18 21:58:48

Similar literature

1. Dropping down the Maximum Item Set: Improving the Stylometric Authorship Attribution Algorithm in the Text Mining for Authorship Investigation [J]. Tareef Kamil Mustafa, Norwati Mustapha, Masrah Azrifah Azmi, Journal of computer sciences. 2010, No. 3

2. Dropping down the Maximum Item Set: Improving the Stylometric Authorship Attribution Algorithm in the Text Mining for Authorship Investigation | Science Publications [J]. Masrah A. Azmi, Nasir B. Sulaiman, Norwati Mustapha, Journal of computer sciences. 2010, No. 3

3. Authorship Attribution of Short Historical Arabic Texts using Stylometric Features and a KNN Classifier with Limited Training Data [J]. Fatma Howedi, Masnizah Mohd, Zahra Aborawi Aborawi, Journal of computer sciences. 2020, No. 10

4. Stylometric Authorship Attribution of Collaborative Documents [C]. Edwin Dauber, Rebekah Overdorf, Rachel Greenstadt, Cyber Security Cryptography and Machine Learning. 2017

5. Stylometric Authorship Attribution Techniques and Analysis for Collaborative Platforms [D]. Dauber, Edwin George, Jr. 2020

6. Co-Authorship and Bibliographic Coupling Network Effects on Citations [O]. Claudio Biscaro, Carlo Giupponi -1

7. $CAG$: Stylometric Authorship Attribution of Multi-Author Documents Using a Co-Authorship Graph [O]. Raheem Sarwar, Norawit Urailertprasert, Nattapol Vannaboot, 2020
