Journal: 《计算机应用与软件》 (Computer Applications and Software)

Online New Event Detection Based on News Elements (基于新闻要素的在线新事件检测)

Abstract

The main task of online new event detection (ONED) is to identify previously unseen events in a stream of news reports that arrive in chronological order. We propose an automatic ONED method based on news elements. First, the method builds a news-element-based representation model for reports and events. The model covers elements of a news report such as place, people, and content; using multi-dimensional elements has the advantage of distinguishing similar events. The method provides a similarity algorithm for the feature corresponding to each element: a toponym similarity algorithm based on a geographical ontology tree computes place similarity, and a Wikipedia-based semantic similarity method computes the similarity between report contents. To weigh the importance of each element, the weight of each element is obtained by training an SVM model. Finally, building on the single-pass clustering algorithm, the method continually updates each event's representation vector during clustering to prevent the event centre from drifting, and it uses a sliding time window to reduce the time cost of processing large numbers of inactive events. Experimental results show that the method effectively reduces the system's miss rate and false-alarm rate and improves event detection performance.
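The pipeline described in the abstract (per-element similarity, weighted combination, single-pass clustering with an updated event representation, and a sliding time window) can be sketched roughly as follows. This is a minimal illustration and not the authors' implementation: Jaccard overlap stands in for the geographical-ontology toponym similarity, cosine similarity over term weights stands in for the Wikipedia-based semantic similarity, and the `WEIGHTS`, `threshold`, and `window` values are placeholder assumptions (the abstract obtains the element weights by training an SVM). All names (`Report`, `Event`, `detect`) are hypothetical.

```python
import math
from dataclasses import dataclass


@dataclass
class Report:
    time: float      # arrival time, e.g. hours since the stream started
    places: set      # place names mentioned in the report
    people: set      # person names mentioned in the report
    content: dict    # term -> weight


@dataclass
class Event:
    places: set
    people: set
    content: dict
    last_update: float
    size: int = 1


def jaccard(a, b):
    """Set overlap; a stand-in for ontology-based place similarity."""
    if not a and not b:
        return 0.0
    return len(a & b) / len(a | b)


def cosine(u, v):
    """Cosine of two sparse vectors; a stand-in for semantic similarity."""
    dot = sum(w * v.get(t, 0.0) for t, w in u.items())
    nu = math.sqrt(sum(w * w for w in u.values()))
    nv = math.sqrt(sum(w * w for w in v.values()))
    return dot / (nu * nv) if nu and nv else 0.0


# Placeholder weights for (place, people, content); assumed, not learned.
WEIGHTS = (0.3, 0.2, 0.5)


def similarity(report, event, weights=WEIGHTS):
    wp, wh, wc = weights
    return (wp * jaccard(report.places, event.places)
            + wh * jaccard(report.people, event.people)
            + wc * cosine(report.content, event.content))


def detect(reports, threshold=0.5, window=48.0):
    """Single pass over reports; returns (is_new flags, events)."""
    events, is_new = [], []
    for r in sorted(reports, key=lambda x: x.time):
        # Sliding time window: only recently updated events are compared.
        active = [e for e in events if r.time - e.last_update <= window]
        best = max(active, key=lambda e: similarity(r, e), default=None)
        if best is None or similarity(r, best) < threshold:
            events.append(Event(set(r.places), set(r.people),
                                dict(r.content), r.time))
            is_new.append(True)
            continue
        # Update the event representation as a running mean of its reports.
        n = best.size
        for term in set(best.content) | set(r.content):
            old = best.content.get(term, 0.0)
            best.content[term] = (old * n + r.content.get(term, 0.0)) / (n + 1)
        best.places |= r.places
        best.people |= r.people
        best.size = n + 1
        best.last_update = r.time
        is_new.append(False)
    return is_new, events


reports = [
    Report(0.0, {"Beijing"}, {"Li"}, {"flood": 1.0, "rain": 1.0}),
    Report(1.0, {"Beijing"}, {"Li"}, {"flood": 1.0, "rescue": 1.0}),
    Report(2.0, {"Paris"}, {"Dupont"}, {"election": 1.0}),
    Report(100.0, {"Beijing"}, {"Li"}, {"flood": 1.0, "rain": 1.0}),
]
flags, events = detect(reports)
```

In this toy run the second report joins the first event, while the last report opens a new event because every earlier event has fallen outside the 48-hour window. A larger `window` would instead merge it into the first event, which illustrates the trade-off between time cost and recall that the window controls.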
