What You Submit Is Who You Are: A Multimodal Approach for Deanonymizing Scientific Publications

Payer M.; Huang L.; Gong N.Z.; Borgolte K.; Frank M.

Abstract

The peer-review system of most academic conferences relies on the anonymity of both the authors and reviewers of submissions. In particular, with respect to the authors, the anonymity requirement is heavily disputed and pros and cons are discussed exclusively on a qualitative level. In this paper, we contribute a quantitative argument to this discussion by showing that it is possible for a machine to reveal the identity of authors of scientific publications with high accuracy. We attack the anonymity of authors using statistical analysis of multiple heterogeneous aspects of a paper, such as its citations, its writing style, and its content. We apply several multilabel, multiclass machine learning methods to model the patterns exhibited in each feature category for individual authors and combine them into a single ensemble classifier to deanonymize authors with high accuracy. To the best of our knowledge, this is the first approach that exploits multiple categories of discriminative features and uses multiple, partially complementing classifiers in a single, focused attack on the anonymity of the authors of an academic publication. We evaluate our author identification framework, deAnon, based on a real-world data set of 3894 papers. From these papers, we target 1405 productive authors that each have at least three publications in our data set. Our approach returns a ranking of probable authors for anonymous papers, that is, an ordering in which to guess the authors of a paper. In our experiments, following this ranking, the first guess corresponds to one of the authors of a paper in 39.7% of the cases, and at least one of the authors is among the top 10 guesses in 65.6% of all cases. Thus, deAnon significantly outperforms current state-of-the-art techniques for automatic deanonymization.
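The ranking and the top-k figures described in the abstract can be illustrated with a minimal Python sketch. This is not deAnon's actual implementation: the abstract does not say how the per-category classifiers are combined, so the weighted sum below is an assumption, and all names (`rank_authors`, `top_k_hit_rate`, the author labels, the scores, and the weights) are hypothetical.

```python
from typing import Dict, List, Set

# Hypothetical input: feature category -> {candidate author: classifier score}.
Scores = Dict[str, Dict[str, float]]


def rank_authors(scores: Scores, weights: Dict[str, float]) -> List[str]:
    """Combine per-category scores by weighted sum and rank candidate authors.

    The weighted sum is an assumed combination rule, not the paper's method.
    """
    combined: Dict[str, float] = {}
    for category, per_author in scores.items():
        w = weights.get(category, 0.0)
        for author, score in per_author.items():
            combined[author] = combined.get(author, 0.0) + w * score
    # Highest combined score first; ties broken alphabetically for determinism.
    return sorted(combined, key=lambda a: (-combined[a], a))


def top_k_hit_rate(rankings: List[List[str]], truths: List[Set[str]], k: int) -> float:
    """Fraction of papers with at least one true author among the first k guesses."""
    hits = sum(1 for ranking, truth in zip(rankings, truths) if truth & set(ranking[:k]))
    return hits / len(rankings)


# Toy scores for two anonymous papers (made-up values, for illustration only).
paper_a: Scores = {
    "citations": {"alice": 0.9, "bob": 0.2, "carol": 0.1},
    "style": {"alice": 0.3, "bob": 0.6, "carol": 0.2},
    "content": {"alice": 0.5, "bob": 0.4, "carol": 0.7},
}
paper_b: Scores = {
    "citations": {"alice": 0.1, "bob": 0.3, "carol": 0.4},
    "style": {"alice": 0.2, "bob": 0.5, "carol": 0.3},
    "content": {"alice": 0.2, "bob": 0.3, "carol": 0.6},
}
weights = {"citations": 1.0, "style": 1.0, "content": 1.0}

rankings = [rank_authors(paper_a, weights), rank_authors(paper_b, weights)]
truths = [{"alice"}, {"bob"}]  # true author sets of the two toy papers

print(rankings)
print(top_k_hit_rate(rankings, truths, k=1))
print(top_k_hit_rate(rankings, truths, k=2))
```

In this toy setup, a paper counts as a hit at k when any of its true authors appears among the first k guesses, which mirrors how the abstract reports the first-guess and top-10 percentages.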

Bibliographic record

Source
Information Forensics and Security, IEEE Transactions on | 2015, Issue 1 | pp. 200-212 | 13 pages
Authors
Payer M.; Huang L.; Gong N.Z.; Borgolte K.; Frank M.
Affiliation
Purdue University, West Lafayette, IN, USA
Original format: PDF
Language: English
Keywords
Accuracy; Data mining; Feature extraction; Portable document format; Support vector machines; Training; Writing; Data privacy; text analysis; text mining

Similar documents

1. Predicting future citation counts of scientific manuscripts submitted for publication: a cohort study in transplantology [J]. Kossmeier Michael, Heinze Georg. Transplant International. 2019, Issue 1
2. Does direction of results of abstracts submitted to scientific conferences on drug addiction predict full publication? [J]. Simona Vecchi, Valeria Belleudi, Laura Amato. BMC Medical Research Methodology. 2009, Issue 1
3. Publication bias in gastroenterological research – a retrospective cohort study based on abstracts submitted to a scientific meeting [J]. Antje Timmer, Robert J Hilsden, John Cole. BMC Medical Research Methodology. 2002, Issue 1
4. Coner: A Collaborative Approach for Long-Tail Named Entity Recognition in Scientific Publications [C]. Daniel Vliegenthart, Sepideh Mesbah, Christoph Lofi. International Conference on Theory and Practice of Digital Libraries. 2019
5. Capturing and Exploiting Citation Knowledge for the Recommendation of Scientific Publications [D]. Khadka, Anita. 2020
6. Does direction of results of abstracts submitted to scientific conferences on drug addiction predict full publication? [O]. Simona Vecchi, Valeria Belleudi, Laura Amato. 2009
7. Abstract not submitted at time of publication: 2011 SCMR/Euro CMR Joint Scientific Sessions [O]. 2011
8. Preparing and Submitting Scientific and Technical Manuscripts and Other Documents for Publication [R]. 2005
