Conference: IEEE Ecuador Technical Chapters Meeting

Thesaurus-based named entity recognition system for detecting spatio-temporal crime events in Spanish language from Twitter

Abstract

Social networks offer a vast amount of data from which useful information can be obtained on major issues in society, among which crime stands out. Research on extracting information about criminal events from social networks has been done primarily for English, while for Spanish the problem has not been addressed. This paper proposes a system for extracting spatio-temporally tagged tweets about crime events in Spanish. To do so, it uses a thesaurus of criminality terms and a named entity recognition (NER) system to process the tweets and extract the relevant information. The NER system is based on the OSU Twitter NLP Tools implementation, which has been enhanced for Spanish. Our results indicate improved performance relative to the most relevant tools, such as Stanford NER and OSU Twitter NLP Tools, achieving 80.95% precision, 59.65% recall and 68.69% F-measure. The end result presents the crime information, broken down by place, date and crime committed, through a web service.
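As a rough illustration of the two ideas in the abstract, the sketch below filters tweets against a thesaurus of crime terms and checks that the reported F-measure is consistent with the balanced harmonic mean of the reported precision and recall. The thesaurus entries, category names, and function names are hypothetical; the abstract does not list the actual thesaurus or describe the matching rules, and the NER step is not modelled here.

```python
import re
import unicodedata

# Hypothetical mini-thesaurus: crime category -> Spanish trigger terms.
# The real thesaurus used by the paper is not given in the abstract.
CRIME_THESAURUS = {
    "robo": {"robo", "asalto", "hurto"},
    "homicidio": {"homicidio", "asesinato"},
}


def normalize(text):
    """Lowercase the text and strip combining accents (e.g. 'día' -> 'dia')."""
    decomposed = unicodedata.normalize("NFD", text.lower())
    return "".join(ch for ch in decomposed if unicodedata.category(ch) != "Mn")


def crime_categories(tweet):
    """Return the sorted categories whose terms occur as whole tokens in the tweet."""
    tokens = set(re.findall(r"\w+", normalize(tweet)))
    return sorted(cat for cat, terms in CRIME_THESAURUS.items() if tokens & terms)


def f_measure(precision, recall):
    """Balanced F-measure: harmonic mean of precision and recall."""
    return 2 * precision * recall / (precision + recall)


if __name__ == "__main__":
    print(crime_categories("Reportan un asalto en Quito ayer"))
    # Precision and recall as reported in the abstract, expressed as fractions.
    print(round(100 * f_measure(0.8095, 0.5965), 2))
```

With the reported 80.95% precision and 59.65% recall, the harmonic mean comes to about 68.69%, which matches the F-measure quoted in the abstract. A real system would additionally need the NER stage to attach place and date entities to each matched tweet.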
