Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies

SOCCER: An Information-Sparse Discourse State Tracking Collection in the Sports Commentary Domain



Abstract

In the pursuit of natural language understanding, there has been a long-standing interest in tracking state changes throughout narratives. Impressive progress has been made in modeling the state of transaction-centric dialogues and procedural texts. However, this problem has been less intensively studied in the realm of general discourse, where ground-truth descriptions of states may be loosely defined and state changes are less densely distributed over utterances. This paper proposes to turn to simplified, fully observable systems that show some of these properties: sports events. We curated 2,263 soccer matches, including time-stamped natural language commentary accompanied by discrete events such as a team scoring goals, switching players, or being penalized with cards. We propose a new task formulation where, given paragraphs of commentary of a game at different timestamps, the system is asked to recognize the occurrence of in-game events. This domain allows for rich descriptions of state while avoiding the complexities of many other real-world settings. As an initial point of performance measurement, we include two baseline methods, one from the perspective of sentence classification with temporal dependence and one from that of a current state-of-the-art generative model, and demonstrate that even sophisticated existing methods struggle on the state tracking task when the definition of state broadens or non-event chatter becomes prevalent.
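As a rough illustration of the task formulation described in the abstract, the sketch below pairs time-stamped commentary with discrete event labels and accumulates them into a running match state. The class name `CommentaryEntry`, the function `track_state`, the event vocabulary, and the sample commentary are all assumptions made for this example; they are not taken from the dataset or from the paper's baselines.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Tuple

# Hypothetical event vocabulary, based only on the event kinds the abstract names
# (scoring goals, switching players, being penalized with cards).
EVENT_TYPES = ("goal", "substitution", "card")


@dataclass
class CommentaryEntry:
    """One time-stamped paragraph of commentary with its labeled events."""
    minute: int  # timestamp of the commentary paragraph
    text: str    # natural language commentary
    # Each event is a (team, event_type) pair; an empty list is non-event chatter.
    events: List[Tuple[str, str]] = field(default_factory=list)


def track_state(entries: List[CommentaryEntry]) -> List[Dict[Tuple[str, str], int]]:
    """Return the cumulative event counts after each entry, in time order."""
    counts: Dict[Tuple[str, str], int] = {}
    states = []
    for entry in sorted(entries, key=lambda e: e.minute):
        for team, event_type in entry.events:
            if event_type not in EVENT_TYPES:
                raise ValueError(f"unknown event type: {event_type}")
            key = (team, event_type)
            counts[key] = counts.get(key, 0) + 1
        states.append(dict(counts))  # snapshot so later updates do not leak back
    return states


# Invented sample commentary, for illustration only.
match = [
    CommentaryEntry(12, "A patient spell of possession in midfield."),
    CommentaryEntry(27, "The home side take the lead from close range.", [("home", "goal")]),
    CommentaryEntry(61, "A late challenge earns the away defender a booking.", [("away", "card")]),
]
states = track_state(match)
```

In this sketch the gold event labels drive the state directly; a model for the task would instead have to predict the `events` field from `text`, which is where sparse events and non-event chatter make the problem hard.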
