Clustering Technique in Multi-Document Personal Name Disambiguation

Abstract

Focusing on multi-document personal name disambiguation, this paper develops an agglomerative clustering approach to the problem. We start from an analysis of the point-wise mutual information between a feature and the ambiguous name, which leads to a novel method for computing feature weights in clustering. We then propose a trade-off measure between within-cluster compactness and among-cluster separation as the criterion for stopping clustering. After that, we apply a labeling method to find representative features for each cluster. Finally, experiments on word-based clustering over a Chinese dataset show good results.
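
The pipeline outlined in the abstract can be illustrated with a minimal sketch. It is not the paper's implementation: the abstract gives no formulas, so the sketch assumes that raw point-wise mutual information serves directly as the feature weight, that clusters are merged by average-link cosine similarity, and that a silhouette-style score stands in for the compactness/separation trade-off. The function names (`pmi`, `tradeoff`, `agglomerate`) and all document counts are hypothetical toy values.

```python
import math
from itertools import combinations

def pmi(n_joint, n_feature, n_name, n_total):
    """PMI = log(P(feature, name) / (P(feature) * P(name))), from document counts."""
    if n_joint == 0:
        return 0.0  # assumption: features never seen with the name get zero weight
    return math.log((n_joint * n_total) / (n_feature * n_name))

def cosine(u, v):
    """Cosine similarity between two sparse feature-weight dicts."""
    dot = sum(w * v.get(f, 0.0) for f, w in u.items())
    nu = math.sqrt(sum(w * w for w in u.values()))
    nv = math.sqrt(sum(w * w for w in v.values()))
    return dot / (nu * nv) if nu and nv else 0.0

def tradeoff(clusters, sim):
    """Stand-in measure (assumption): mean over documents of
    (similarity to own cluster) - (similarity to nearest other cluster)."""
    def mean_sim(i, members):
        others = [sim[i][j] for j in members if j != i]
        return sum(others) / len(others) if others else 0.0
    scores = []
    for c in clusters:
        for i in c:
            compact = mean_sim(i, c)
            separate = max((mean_sim(i, o) for o in clusters if o is not c), default=0.0)
            scores.append(compact - separate)
    return sum(scores) / len(scores)

def agglomerate(vectors):
    """Average-link agglomerative clustering; stops when a merge would lower the score."""
    n = len(vectors)
    sim = [[cosine(vectors[i], vectors[j]) for j in range(n)] for i in range(n)]
    clusters = [{i} for i in range(n)]
    while len(clusters) > 1:
        def link(pair):
            a, b = clusters[pair[0]], clusters[pair[1]]
            return sum(sim[i][j] for i in a for j in b) / (len(a) * len(b))
        i, j = max(combinations(range(len(clusters)), 2), key=link)
        candidate = [c for k, c in enumerate(clusters) if k not in (i, j)]
        candidate.append(clusters[i] | clusters[j])
        if tradeoff(candidate, sim) < tradeoff(clusters, sim):
            break  # merging further would hurt the trade-off, so stop here
        clusters = candidate
    return clusters

# Toy data (hypothetical): feature -> (docs with feature and name, docs with feature)
N_NAME, N_TOTAL = 20, 1000
counts = {"professor": (8, 40), "university": (6, 60), "lecture": (3, 30),
          "striker": (5, 10), "football": (7, 50), "goal": (4, 80)}
weights = {f: pmi(j, n, N_NAME, N_TOTAL) for f, (j, n) in counts.items()}

# Four documents mentioning the same ambiguous name, as bags of words
docs = [{"professor", "university"}, {"professor", "lecture"},
        {"striker", "football"}, {"football", "goal"}]
vectors = [{f: weights[f] for f in d} for d in docs]

clusters_found = agglomerate(vectors)
print(sorted(sorted(c) for c in clusters_found))
```

On this toy input the merging stops with the two academic documents in one cluster and the two football documents in the other. Using PMI as the weight means a word that co-occurs with the name more often than chance (here "striker") counts for more than a common word such as "goal".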
