Journal: Information Processing & Management

Cross-document event clustering using knowledge mining from co-reference chains

Abstract

Unifying terminology usage in a way that captures more term semantics is useful for event clustering. This paper proposes a metric, normalized chain edit distance, to incrementally mine a controlled vocabulary from cross-document co-reference chains. The controlled vocabulary is then used to unify terms across different co-reference chains. A novel threshold model that incorporates both a time decay function and a spanning window uses the controlled vocabulary for event clustering on streaming news. Given correct co-reference chains, the proposed system achieves a 15.97% performance increase over the baseline system and a 5.93% performance increase over the system without the controlled vocabulary. Furthermore, a Chinese co-reference resolution system with a chain filtering mechanism is used to test the robustness of the proposed event clustering system. With noisy co-reference chains, the clustering system still achieves a 10.55% performance increase over the baseline system. These results suggest that the approach is promising. (c) 2006 Elsevier Ltd. All rights reserved.
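The abstract names a normalized chain edit distance but does not define it. As a rough illustration only, the sketch below assumes one plausible reading: a Levenshtein distance computed over whole mentions in two co-reference chains, divided by the length of the longer chain. The function names, the unit edit costs, and the normalization by the longer chain are assumptions for illustration, not the paper's definition.

```python
from typing import Sequence


def chain_edit_distance(a: Sequence[str], b: Sequence[str]) -> int:
    """Levenshtein distance where each edit unit is a whole mention.

    Assumption: insertions, deletions, and substitutions all cost 1.
    """
    # prev[j] holds the distance between the first i-1 mentions of a
    # and the first j mentions of b.
    prev = list(range(len(b) + 1))
    for i, mention_a in enumerate(a, start=1):
        curr = [i]
        for j, mention_b in enumerate(b, start=1):
            cost = 0 if mention_a == mention_b else 1
            curr.append(min(prev[j] + 1,          # delete from a
                            curr[j - 1] + 1,      # insert into a
                            prev[j - 1] + cost))  # substitute or match
        prev = curr
    return prev[-1]


def normalized_chain_edit_distance(a: Sequence[str], b: Sequence[str]) -> float:
    """Scale the distance into [0, 1] by the longer chain's length (assumed)."""
    longest = max(len(a), len(b))
    if longest == 0:
        return 0.0  # two empty chains are treated as identical
    return chain_edit_distance(a, b) / longest


# Hypothetical chains from two news documents about the same event.
chain_1 = ["the typhoon", "Typhoon Nari", "the storm"]
chain_2 = ["the typhoon", "Nari", "the storm"]
score = normalized_chain_edit_distance(chain_1, chain_2)
```

Under this reading, a low score marks two chains as near-identical, so the differing mentions ("Typhoon Nari" and "Nari" in the hypothetical chains) become candidates for one controlled-vocabulary entry. How the paper actually selects and merges such terms is not stated in the abstract.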