Combining domain-specific heuristics for author name disambiguation

Abstract

Author name disambiguation has been one of the hardest problems faced by digital libraries since their early days. Historically, supervised solutions have empirically outperformed those based on heuristics, but with the burden of having to rely on manually labelled training sets for the learning process. Moreover, most supervised solutions just apply some type of generic machine learning solution and do not exploit specific knowledge about the problem. In this paper, we follow a similar reasoning, but in the opposite direction. Instead of extending an existing supervised solution, we propose a set of carefully designed heuristics and similarity functions and apply supervision only to optimize such parameters for each particular dataset. As our experiments show, the result is a very effective, efficient and practical author name disambiguation method that can be used in many different scenarios.
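
The abstract describes hand-designed heuristics and similarity functions whose parameters are tuned with supervision on each dataset. The following is a minimal sketch of that general idea only, not the authors' method: the name-compatibility rule, the coauthor Jaccard similarity, the toy records, and every function name here are hypothetical choices made for illustration.

```python
from itertools import combinations


def names_compatible(a: str, b: str) -> bool:
    """Hypothetical heuristic: same surname and same first initial."""
    ta, tb = a.lower().split(), b.lower().split()
    return ta[-1] == tb[-1] and ta[0][0] == tb[0][0]


def jaccard(x: set, y: set) -> float:
    """Jaccard similarity of two sets; defined as 0.0 when both are empty."""
    if not x and not y:
        return 0.0
    return len(x & y) / len(x | y)


def same_author(r1: dict, r2: dict, threshold: float) -> bool:
    """Heuristic decision: compatible names and enough shared coauthors."""
    return (names_compatible(r1["name"], r2["name"])
            and jaccard(r1["coauthors"], r2["coauthors"]) >= threshold)


def tune_threshold(records, labels, candidates):
    """Supervision is used only here: pick the candidate threshold with the
    highest pairwise accuracy on a small labelled sample."""
    pairs = list(combinations(range(len(records)), 2))
    best_t, best_acc = None, -1.0
    for t in candidates:
        correct = sum(
            same_author(records[i], records[j], t) == (labels[i] == labels[j])
            for i, j in pairs
        )
        acc = correct / len(pairs)
        if acc > best_acc:  # ties keep the earliest candidate
            best_t, best_acc = t, acc
    return best_t, best_acc


# Toy labelled sample (invented data): labels identify the real author.
records = [
    {"name": "A. Santos", "coauthors": {"m. lima", "j. costa"}},
    {"name": "Ana Santos", "coauthors": {"m. lima", "j. costa", "p. rocha"}},
    {"name": "A. Santos", "coauthors": {"k. ito"}},
    {"name": "Andre Santos", "coauthors": {"k. ito", "m. lima"}},
]
labels = [0, 0, 1, 1]

print(tune_threshold(records, labels, [0.2, 0.4, 0.6]))  # (0.4, 1.0)
```

In this sketch the labelled data never trains a generic classifier; it only selects one parameter of a fixed heuristic, which is the division of labour the abstract outlines.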

Bibliographic details

Source
2014 IEEE/ACM Joint Conference on Digital Libraries | 2014 | pp. 173-182 | 10 pages
Conference location: London (GB)
Authors
Santana A.F.; Goncalves M.A.; Laender A.H.F.; Ferreira A.;
Author affiliation

Dept. de Cienc. da Comput., Univ. Fed. de Minas Gerais, Belo Horizonte, Brazil;

Conference organizer
Original format: PDF
Language of text: English
Chinese Library Classification
Keywords
data analysis; digital libraries; learning (artificial intelligence); author name disambiguation; dataset; digital libraries; domain-specific heuristics; generic machine learning solution; heuristics; similarity functions; supervised solutions; Electronic mail; Equations; Mathematical model; Measurement; Training; Training data; Vectors; Name Disambiguation; Supervised Methods;

Date added to database: 2022-08-26 13:55:29

Similar literature

1. Incremental Author Name Disambiguation by Exploiting Domain-Specific Heuristics [J] . Alan Filipe Santana, Marcos Andre Goncalves, Alberto H. F. Laender, Journal of the American Society for Information Science and Technology . 2017, No. 4

2. On the combination of domain-specific heuristics for author name disambiguation: the nearest cluster method [J] . Alan Filipe Santana, Marcos Andre Goncalves, Alberto H. F. Laender, International Journal on Digital Libraries . 2015, No. 3a4

3. On the combination of domain-specific heuristics for author name disambiguation: the nearest cluster method [J] . Alan Filipe Santana, Marcos André Gonçalves, Alberto H. F. Laender, International Journal on Digital Libraries . 2015, No. 3a4

4. Combining domain-specific heuristics for author name disambiguation [C] . Santana A.F., Goncalves M.A., Laender A.H.F., IEEE/ACM Joint Conference on Digital Libraries . 2014

5. Author Name Disambiguation Using Co-Training [D] . Gao, Yan. 2020

6. Author Disambiguation in PubMed: Evidence on the Precision and Recall of Author-ity among NIH-Funded Scientists [O] . Marc J. Lerchenmueller, Olav Sorenson

7. refsplitr: Author name disambiguation, author georeferencing, and mapping of coauthorship networks with Web of Science data [O] . Auriel Fournier, Matthew Boone, Forrest Stevens, 2020

