Strong Baselines for Author Name Disambiguation with and Without Neural Networks

Abstract

Author name disambiguation (AND) is one of the most vital problems in scientometrics, and it has become a great challenge with the rapid growth of academic digital libraries. Existing approaches to this task rely substantially on complex clustering-like architectures. They usually either assume the number of clusters is known beforehand or predict it with a separate model, which makes the architectures increasingly complex and time-consuming. In this paper, we combine simple neural networks with two sets of heuristic rules to explore strong baselines for the author name disambiguation problem without any prior knowledge or estimation of cluster size, which frees the model from unnecessary complexity. On the popular AMiner benchmark dataset, our solution significantly outperforms several state-of-the-art methods in both performance and efficiency, and it still achieves performance comparable to many complex models when using only a group of rules. Experimental results also indicate that gains from sophisticated deep learning techniques are quite modest in the author name disambiguation problem.

Bibliographic details

Source
Pacific-Asia Conference on Knowledge Discovery and Data Mining, 2020, pp. 369-381 (13 pages)
Authors
Zhenyu Zhang; Bowen Yu; Tingwen Liu; Dong Wang;
Original format: PDF
Keywords
Author name disambiguation; Heuristic rules; Clustering problem; Baseline methods;

Similar literature

1. Scale-Free Collaboration Networks: An Author Name Disambiguation Perspective [J]. Kim Jinseok. Journal of the American Society for Information Science and Technology, 2019, No. 7.
2. Distortive Effects of Initial-Based Name Disambiguation on Measurements of Large-Scale Coauthorship Networks [J]. Jinseok Kim, Jana Diesner. Journal of the American Society for Information Science and Technology, 2016, No. 6.
3. Strong Baselines for Simple Question Answering over Knowledge Graphs with and without Neural Networks [C]. Salman Mohammed, Peng Shi, Jimmy Lin. Conference on the North American Chapter of the Association for Computational Linguistics: Human Language Technologies, 2018.
4. Author Name Disambiguation Using Co-Training [D]. Gao, Yan. 2020.
5. Biomedical word sense disambiguation with bidirectional long short-term memory and attention-based neural networks [O]. Canlin Zhang, Daniel Biś, Xiuwen Liu. 2019.
6. Strong Baselines for Author Name Disambiguation with and Without Neural Networks [O]. Zhenyu Zhang, Bowen Yu, Tingwen Liu. 2020.
