Adapting Coreference Resolution for Processing Violent Death Narratives

Abstract

Coreference resolution is an important component in analyzing narrative text from administrative data (e.g., clinical or police sources). However, existing coreference models trained on general language corpora transfer poorly because of domain gaps, especially when they are applied to gender-inclusive data involving lesbian, gay, bisexual, and transgender (LGBT) individuals. In this paper, we analyzed the challenges of coreference resolution in an exemplary form of administrative text written in English: violent death narratives from the National Violent Death Reporting System of the U.S. Centers for Disease Control and Prevention (CDC). We developed a set of data augmentation rules to improve model performance using a probabilistic data programming framework. Experiments on narratives from an administrative database, as well as on existing gender-inclusive coreference datasets, demonstrate the effectiveness of data augmentation in training coreference models that can better handle text data about LGBT individuals.