Conference: 2014 IEEE/ACM Joint Conference on Digital Libraries
Combining domain-specific heuristics for author name disambiguation

Abstract

Author name disambiguation has been one of the hardest problems faced by digital libraries since their early days. Historically, supervised solutions have empirically outperformed those based on heuristics, but with the burden of having to rely on manually labelled training sets for the learning process. Moreover, most supervised solutions just apply some type of generic machine learning solution and do not exploit specific knowledge about the problem. In this paper, we follow a similar reasoning, but in the opposite direction. Instead of extending an existing supervised solution, we propose a set of carefully designed heuristics and similarity functions and apply supervision only to optimize such parameters for each particular dataset. As our experiments show, the result is a very effective, efficient and practical author name disambiguation method that can be used in many different scenarios.
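
The abstract gives no implementation details, so the following is only a minimal sketch of the general idea it describes: hand-written heuristics and similarity functions whose parameters are tuned with a small labelled set. Every name here (`names_compatible`, `same_author`, `tune`), the choice of Jaccard similarity over coauthors and title words, the threshold grid, and the toy records are assumptions made for illustration and are not taken from the paper.

```python
from itertools import product


def jaccard(a, b):
    """Jaccard similarity between two collections, treated as sets."""
    a, b = set(a), set(b)
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def names_compatible(n1, n2):
    """Hypothetical name heuristic: same last name, and first names are
    equal or one is an initial of the other ("J." vs "John")."""
    p1, p2 = n1.lower().split(), n2.lower().split()
    if not p1 or not p2 or p1[-1] != p2[-1]:
        return False
    f1, f2 = p1[0].rstrip("."), p2[0].rstrip(".")
    if not f1 or not f2:
        return False
    return f1 == f2 or (min(len(f1), len(f2)) == 1 and f1[0] == f2[0])


def same_author(r1, r2, t_co, t_title):
    """Rule: compatible names AND enough coauthor or title overlap."""
    if not names_compatible(r1["name"], r2["name"]):
        return False
    co_sim = jaccard(r1["coauthors"], r2["coauthors"])
    title_sim = jaccard(r1["title"].lower().split(), r2["title"].lower().split())
    return co_sim >= t_co or title_sim >= t_title


def pairwise_f1(pairs, t_co, t_title):
    """F1 over labelled (record, record, is_same_author) pairs."""
    tp = fp = fn = 0
    for r1, r2, label in pairs:
        pred = same_author(r1, r2, t_co, t_title)
        tp += pred and label
        fp += pred and not label
        fn += (not pred) and label
    denom = 2 * tp + fp + fn
    return 2 * tp / denom if denom else 0.0


def tune(pairs, grid=(0.1, 0.3, 0.5, 0.7, 0.9)):
    """Supervision is used only here: pick the thresholds with the best F1."""
    return max(product(grid, grid), key=lambda t: pairwise_f1(pairs, *t))


# Toy labelled data (invented for this sketch).
r1 = {"name": "J. Smith", "coauthors": ["A. Lee", "B. Chen"],
      "title": "graph mining for digital libraries"}
r2 = {"name": "John Smith", "coauthors": ["A. Lee"],
      "title": "scalable graph mining"}
r3 = {"name": "J. Smith", "coauthors": ["C. Diaz"],
      "title": "protein folding dynamics"}
pairs = [(r1, r2, True), (r1, r3, False), (r2, r3, False)]

best_t_co, best_t_title = tune(pairs)
```

In this sketch the labelled pairs never train a generic classifier; they only select two thresholds for rules that are fixed in advance, which is the division of labour the abstract argues for. A real system would need richer heuristics and a far larger labelled sample.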