
Near real-time atrocity event coding


Abstract

In recent years, mass atrocities, terrorism, and political unrest have caused much human suffering, and thousands of innocent lives have been lost to these events. With the help of advanced technologies, we can now envision a tool that uses machine learning and natural language processing (NLP) techniques to warn of such events. Detecting atrocities requires structured event data containing metadata with multiple fields and values (e.g., event date, victim, perpetrator). Traditionally, human coders apply common sense to encode events from news stories, but this process is slow, expensive, and ambiguous. To accelerate it, we use machine coding to generate encoded events. In this paper, we develop a near-real-time supervised machine coding technique that uses an external knowledge base, WordNet, to generate structured events. We design a Spark-based distributed framework with a web scraper that periodically gathers news reports, processes them, and generates events. We use Spark to reduce the performance bottleneck that arises when processing raw news text with CoreNLP.
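As a rough illustration of what a "structured event" with multiple fields might look like, the following minimal Python sketch defines a record holding the three example fields named in the abstract (event date, victim, perpetrator) and serializes it to JSON. The class name `CodedEvent`, the field names, and the sample values are all hypothetical and chosen only for illustration; the paper's actual schema and output format are not given here.

```python
import json
from dataclasses import dataclass, asdict
from datetime import date
from typing import Optional


@dataclass
class CodedEvent:
    """One structured event record; class and field names are assumptions."""
    event_date: Optional[date]
    victim: Optional[str]
    perpetrator: Optional[str]

    def to_json(self) -> str:
        record = asdict(self)
        # Dates are not JSON-serializable by default, so store ISO 8601 text.
        if record["event_date"] is not None:
            record["event_date"] = record["event_date"].isoformat()
        return json.dumps(record, sort_keys=True)


# Hypothetical sample values, not taken from any real news report.
event = CodedEvent(event_date=date(2016, 1, 15), victim="civilians", perpetrator=None)
print(event.to_json())
```

Fields are optional because a news report may not state every value; a missing perpetrator, for example, is kept as an explicit null rather than being guessed.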
