
Assessing user simulation for dialog systems using human judges and automatic evaluation measures

Abstract

While different user simulations are built to assist dialog system development, there is an increasing need to assess their quality quickly and reliably. Previous studies have proposed several automatic evaluation measures for this purpose, but the validity of these measures has not been fully established. We present an assessment study in which human judgments of user simulation quality are collected as the gold standard for validating automatic evaluation measures. We show that a ranking model can be built from the automatic measures to rank the simulations in the same order as the human judgments. We further show that the ranking model can be improved with a simple feature based on time-series analysis.
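To make the abstract's approach concrete, here is a minimal sketch of the pairwise ranking idea: automatic evaluation measures serve as features for each candidate user simulation, a RankSVM-style linear model is trained against a human-judged gold ranking, and a lag-1 autocorrelation feature stands in for the paper's time-series feature. The measure names, the toy data, and the autocorrelation choice are all illustrative assumptions, not the paper's actual feature set or data.

```python
# Sketch of ranking user simulations from automatic measures so that the
# predicted order matches a human-judged gold ranking (pairwise transform).
# All feature names and numbers below are hypothetical.
import numpy as np
from itertools import combinations
from sklearn.svm import LinearSVC

def lag1_autocorrelation(series):
    """Toy time-series feature: lag-1 autocorrelation of a per-turn
    statistic (e.g., user utterance length across dialog turns)."""
    s = np.asarray(series, dtype=float)
    s = s - s.mean()
    denom = (s * s).sum()
    return float((s[:-1] * s[1:]).sum() / denom) if denom else 0.0

# One feature vector of automatic measures per simulation (hypothetical:
# dialog-act precision, recall, perplexity, lag-1 autocorrelation).
sims = ["sim_A", "sim_B", "sim_C"]
features = np.array([
    [0.81, 0.76, 2.1, lag1_autocorrelation([5, 6, 5, 7, 6])],
    [0.62, 0.58, 3.4, lag1_autocorrelation([2, 9, 1, 8, 2])],
    [0.74, 0.70, 2.8, lag1_autocorrelation([4, 5, 6, 5, 4])],
])
human_rank = [0, 2, 1]  # gold ranking from human judges: sim_A > sim_C > sim_B

# Pairwise transform: each ordered pair becomes a feature difference with a
# +1/-1 label, turning the ranking problem into binary classification.
X, y = [], []
for i, j in combinations(range(len(sims)), 2):
    sign = 1 if human_rank[i] < human_rank[j] else -1
    X.append(features[i] - features[j]); y.append(sign)
    X.append(features[j] - features[i]); y.append(-sign)

model = LinearSVC(C=1.0).fit(np.array(X), np.array(y))

# Score each simulation with the learned weights and sort descending;
# with these toy numbers the order should match the human ranking.
scores = features @ model.coef_.ravel()
print([sims[k] for k in np.argsort(-scores)])  # e.g. ['sim_A', 'sim_C', 'sim_B']
```

The pairwise transform is a standard way to learn a ranking with an off-the-shelf classifier: a linear model on feature differences is equivalent to scoring each item with a single weight vector and sorting, so the learned scores induce a total order over the simulations.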

Bibliographic details

  • Source
    Natural Language Engineering | 2011, Issue 4 | pp. 511-540 | 30 pages
  • Authors

    Hua Ai; Diane Litman;

  • Author affiliations

    Intelligent Systems Program, University of Pittsburgh, Pittsburgh, PA 15260, USA;

    Intelligent Systems Program, University of Pittsburgh, Pittsburgh, PA 15260, USA;

  • Original format: PDF
  • Language: eng
