
Conversation dialog corpora from television and movie scripts



Abstract

Example-based dialogue systems often require natural conversation templates as examples for response generation. However, in previous work most conversation corpora were created by hand and do not portray actual conversations between two people well. One way to overcome this problem is to record and transcribe real human-to-human conversations, but this work is tedious and time-consuming. In this work, we instead utilize conversation scripts from television and movies. We extract conversations from television and movie scripts on the web and perform various types of filtering. To ensure that each conversation is performed by two speakers, we introduce a unit of conversation called a tri-turn (a trigram conversation turn), which allows us to filter out conversations with more than two speakers. In the end, our conversation corpus contains 86,719 query-response pairs that represent conversation turns performed by two speakers talking to each other.
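The abstract does not spell out how a tri-turn is detected, so the following is only a minimal sketch under stated assumptions: a tri-turn is taken to be three consecutive turns in which the first and third speaker are the same and the second speaker is different (an A-B-A pattern), and each tri-turn is assumed to yield two query-response pairs. The function names, the `(speaker, utterance)` tuple layout, and the toy scene are hypothetical and are not taken from the paper.

```python
from typing import List, Tuple

Turn = Tuple[str, str]  # (speaker, utterance); layout is an assumption


def extract_tri_turns(turns: List[Turn]) -> List[Tuple[int, int, int]]:
    """Return index triples (i, i+1, i+2) that follow an A-B-A speaker pattern.

    Assumption: a tri-turn is three consecutive turns where the first and
    third speaker match and the middle speaker differs.
    """
    tri_turns = []
    for i in range(len(turns) - 2):
        first, second, third = turns[i][0], turns[i + 1][0], turns[i + 2][0]
        if first == third and first != second:
            tri_turns.append((i, i + 1, i + 2))
    return tri_turns


def to_query_response_pairs(turns: List[Turn]) -> List[Tuple[str, str]]:
    """Turn each tri-turn into (query, response) pairs.

    Overlapping tri-turns share a middle pair, so pairs are deduplicated
    by turn position rather than by text.
    """
    seen = set()
    pairs = []
    for i, j, k in extract_tri_turns(turns):
        for q, r in ((i, j), (j, k)):
            if (q, r) not in seen:
                seen.add((q, r))
                pairs.append((turns[q][1], turns[r][1]))
    return pairs


# Hypothetical toy scene: speaker C interrupts, so only the first
# three turns form a tri-turn.
scene = [
    ("A", "Hey."),
    ("B", "Hi."),
    ("A", "You okay?"),
    ("C", "What's going on?"),
    ("B", "Nothing."),
]
pairs = to_query_response_pairs(scene)
```

In this toy scene only the window A-B-A qualifies, so the turns involving the third speaker C contribute no pairs, which is the filtering effect the abstract attributes to the tri-turn unit.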
