Journal of Machine Learning Research

Effective String Processing and Matching for Author Disambiguation

Abstract

Track 2 of KDD Cup 2013 aims at determining duplicated authors in a data set from Microsoft Academic Search. This type of problem appears in many large-scale applications that compile information from different sources. This paper describes our solution, developed at National Taiwan University, which won first prize in the competition. We propose an effective name-matching framework and realize two implementations. An important strategy in our approach is to consider Chinese and non-Chinese names separately because of their different naming conventions. Post-processing, including merging the results of two predictions, further boosts the performance. Our approach achieves an F1-score of 0.99202 on the private leaderboard and 0.99195 on the public leaderboard.
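
The abstract does not say how the results of the two predictions are merged. As a rough illustration only, one plausible reading is to take the union of the duplicate pairs reported by both predictions and close the result transitively, so that if A matches B in one prediction and B matches C in the other, all three end up in one group. The sketch below assumes each prediction is a dictionary from an author id to the set of ids predicted as duplicates; the function name merge_predictions and this input layout are hypothetical and are not taken from the paper.

```python
from collections import defaultdict


def merge_predictions(pred_a, pred_b):
    """Union two duplicate-author predictions and close them transitively.

    Assumed (hypothetical) input layout: {author_id: set_of_duplicate_ids}.
    """
    parent = {}

    def find(x):
        # Return the representative of x's group, registering x if new.
        parent.setdefault(x, x)
        while parent[x] != x:
            parent[x] = parent[parent[x]]  # path halving
            x = parent[x]
        return x

    def union(x, y):
        root_x, root_y = find(x), find(y)
        if root_x != root_y:
            # Keep the smaller id as the representative.
            parent[max(root_x, root_y)] = min(root_x, root_y)

    for pred in (pred_a, pred_b):
        for author_id, duplicates in pred.items():
            find(author_id)
            for dup_id in duplicates:
                union(author_id, dup_id)

    # Collect every id under its group representative.
    groups = defaultdict(set)
    for author_id in list(parent):
        groups[find(author_id)].add(author_id)

    # Map each author id to the full group it belongs to.
    return {a: set(members) for members in groups.values() for a in members}


if __name__ == "__main__":
    first = {1: {1, 2}, 2: {1, 2}, 3: {3}}
    second = {1: {1}, 2: {2, 4}, 3: {3}, 4: {2, 4}}
    merged = merge_predictions(first, second)
    for author_id in sorted(merged):
        print(author_id, sorted(merged[author_id]))
```

A union of this kind favors recall: any pair found by either prediction is kept, so a single false match can join two otherwise separate groups. Whether the authors merged their predictions this way is not stated in the abstract.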