Foreign-language conference: 9th International Conference on Language Resources and Evaluation

Computational Narratology: Extracting Tense Clusters from Narrative Texts

Abstract

Computational Narratology is an emerging field within the Digital Humanities. In this paper, we tackle the problem of extracting temporal information as a basis for event extraction and ordering, as well as further investigations of complex phenomena in narrative texts. While most existing systems focus on news texts and extract explicit temporal information exclusively, we show that this approach is not feasible for narratives. Based on tense information of verbs, we define temporal clusters as an annotation task and validate the annotation schema by showing that the task can be performed with high inter-annotator agreement. To alleviate and reduce the manual annotation effort, we propose a rule-based approach to robustly extract temporal clusters using a multi-layered and dynamic NLP pipeline that combines off-the-shelf components in a heuristic setting. Comparing our results against human judgements, our system is capable of predicting the tense of verbs and sentences with very high reliability: for the most prevalent tense in our corpus, more than 95% of all verbs are annotated correctly.
