
Detecting Events in a Million New York Times Articles

Abstract

We present a demonstration of a newly developed text stream event detection method on over a million articles from the New York Times corpus. The event detection is designed to operate in a predominantly on-line fashion, reporting new events within a specified timeframe. The event detection is achieved by detecting significant changes in the statistical properties of the text where those properties are efficiently stored and updated in a suffix tree. This particular demonstration shows how our method is effective at discovering both short- and long-term events (which are often denoted topics), and how it automatically copes with topic drift on a corpus of 1 035 263 articles.
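
The abstract gives no implementation details, so the following is only a minimal sketch of the general idea: keep n-gram statistics in a tree and flag n-grams whose frequency in a new time window departs significantly from their history. It assumes a depth-limited word trie as a stand-in for a true suffix tree and a simple binomial z-score as the significance test. The names `NGramTrie` and `detect_events`, and the parameters `max_len`, `z_threshold`, and `min_count`, are hypothetical and do not come from the paper.

```python
from math import sqrt


class Node:
    """One trie node; `count` is how often the n-gram ending here was seen."""
    __slots__ = ("children", "count")

    def __init__(self):
        self.children = {}
        self.count = 0


class NGramTrie:
    """Depth-limited word trie (an assumed simplification of a suffix tree)."""

    def __init__(self, max_len=3):
        self.max_len = max_len
        self.root = Node()
        self.total = 0  # number of tokens inserted so far

    def add(self, tokens):
        # Insert every suffix of the document, truncated to max_len words.
        for start in range(len(tokens)):
            node = self.root
            for tok in tokens[start:start + self.max_len]:
                node = node.children.setdefault(tok, Node())
                node.count += 1
        self.total += len(tokens)

    def count(self, ngram):
        node = self.root
        for tok in ngram:
            node = node.children.get(tok)
            if node is None:
                return 0
        return node.count

    def ngrams(self):
        # Depth-first walk yielding (ngram, count) for every stored n-gram.
        stack = [((), self.root)]
        while stack:
            prefix, node = stack.pop()
            for tok, child in node.children.items():
                gram = prefix + (tok,)
                yield gram, child.count
                stack.append((gram, child))


def detect_events(history, window, z_threshold=3.0, min_count=3):
    """Return (ngram, z) pairs whose rate in `window` is well above `history`."""
    events = []
    for gram, observed in window.ngrams():
        if observed < min_count:
            continue
        # Add-one smoothing gives unseen n-grams a small historical rate.
        p = (history.count(gram) + 1) / (history.total + 1)
        expected = p * window.total
        std = sqrt(window.total * p * (1 - p))
        if std == 0:
            continue
        z = (observed - expected) / std
        if z >= z_threshold:
            events.append((gram, round(z, 2)))
    return sorted(events, key=lambda e: -e[1])


# Usage: build one trie from past articles and one from the current window.
history, window = NGramTrie(), NGramTrie()
for doc in ["the mayor spoke about the budget", "the budget vote is today"] * 5:
    history.add(doc.split())
for doc in ["hurricane katrina hits the coast",
            "hurricane katrina floods the city",
            "the mayor spoke about hurricane katrina"]:
    window.add(doc.split())
for gram, z in detect_events(history, window):
    print(" ".join(gram), z)
```

In a streaming setting, each window's documents could be added to `history` after they are scored, so that the baseline follows gradual changes in vocabulary; whether the paper handles topic drift this way is not stated in the abstract.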
