Conference: IEEE International Conference on Semantic Computing

Extraction of Semantic Relations in Noisy User-Generated Law Enforcement Data

Abstract

Relation extraction from text is a well-known and extensively studied topic in Natural Language Processing research. However, implementing relation extraction approaches in real-world application scenarios raises methodological considerations that are often left implicit in existing research. This paper explores these considerations using a real-world dataset of user-generated police reports in Dutch. The use of linguistic features based on dependency trees is investigated, including an ablation analysis of the importance of individual features. The construction of negative examples for machine learning models is discussed, as well as the construction of a baseline model. The methodological implications of using a small dataset are discussed in terms of the design and performance of a Long Short-Term Memory network and a Support Vector Machine. In general, the models perform well; however, the definition of the classification task, and in particular the construction of negative examples, is shown to have a large impact on classification accuracy and, subsequently, on the interpretation of the evaluation results.
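
The abstract does not say how the negative examples were built. As an illustration only, one common strategy is to treat every ordered entity pair in a sentence that has no annotated relation as a negative. The sketch below assumes that strategy; the function name `build_examples`, the `NO_RELATION` label, and the toy entities are hypothetical and are not taken from the paper.

```python
from itertools import permutations

def build_examples(entities, positive_relations):
    """Label every ordered entity pair found in one sentence.

    entities: list of entity identifiers in the sentence (hypothetical input).
    positive_relations: dict mapping (head, tail) -> annotated relation label.
    Pairs without an annotation are labeled "NO_RELATION" (assumed negative class).
    """
    examples = []
    for head, tail in permutations(entities, 2):
        label = positive_relations.get((head, tail), "NO_RELATION")
        examples.append((head, tail, label))
    return examples

# Toy data, invented for illustration.
entities = ["suspect", "vehicle", "location"]
gold = {("suspect", "vehicle"): "drives"}

examples = build_examples(entities, gold)
negatives = [e for e in examples if e[2] == "NO_RELATION"]
```

With three entities and one annotated relation, this scheme yields six candidate pairs, five of them negative. That imbalance is one way the choice of negative examples can change what a reported accuracy figure means.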