Conference: IEEE International Conference on Semantic Computing

Extraction of Semantic Relations in Noisy User-Generated Law Enforcement Data

Abstract

Relation extraction from text is a well-known and extensively studied topic in Natural Language Processing research. However, implementing relation extraction approaches in real-world application scenarios raises various methodological considerations that are often left implicit in existing research. This paper explores these considerations using a real-world dataset of user-generated police reports in Dutch. The use of linguistic features based on dependency trees is investigated, including an ablation analysis of the importance of individual features. The construction of negative examples for machine learning models is discussed, as well as the construction of a baseline model. The methodological implications of using a small dataset are discussed in terms of the design and performance of a Long Short-Term Memory network as well as a Support Vector Machine. In general, the models perform well. However, the definition of the classification task, and in particular the construction of negative examples, is shown to have a large impact on classification accuracy and subsequently on the interpretation of the evaluation results.
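
To make the point about negative examples concrete, the following is a minimal sketch of one common way to construct them: every ordered pair of entity mentions in a sentence that has no annotated relation is labeled as a negative. This is an illustrative assumption, not the procedure used in the paper. The function name `build_examples`, the label `NO_RELATION`, and the toy sentence data are all hypothetical.

```python
from itertools import permutations
from typing import Dict, List, NamedTuple, Tuple


class Example(NamedTuple):
    head: str   # first entity mention of the candidate pair
    tail: str   # second entity mention of the candidate pair
    label: str  # annotated relation, or "NO_RELATION" for a negative


def build_examples(
    entities: List[str],
    positives: Dict[Tuple[str, str], str],
) -> List[Example]:
    """Label every ordered entity pair in one sentence.

    Assumption (not from the paper): any ordered pair without an
    annotated relation is treated as a negative example.
    """
    examples = []
    for head, tail in permutations(entities, 2):
        label = positives.get((head, tail), "NO_RELATION")
        examples.append(Example(head, tail, label))
    return examples


# Hypothetical toy data: three entity mentions, one annotated relation.
entities = ["suspect", "car", "Amsterdam"]
positives = {("suspect", "car"): "OWNS"}

examples = build_examples(entities, positives)
n_negative = sum(1 for ex in examples if ex.label == "NO_RELATION")
print(f"{len(examples)} candidate pairs, {n_negative} negatives")
```

Under this scheme the number of negatives grows quadratically with the number of entities per sentence while the number of positives does not, so the class balance, and with it any reported accuracy, depends heavily on how negatives are defined.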