Conference: Annual Meeting of the Special Interest Group on Discourse and Dialogue

How Would You Say It? Eliciting Lexically Diverse Data for Supervised Semantic Parsing



Abstract

Building dialogue interfaces for real-world scenarios often entails training semantic parsers starting from zero examples. How can we build datasets that better capture the variety of ways users might phrase their queries, and which queries are actually realistic? Wang et al. (2015) proposed a method to build semantic parsing datasets by generating canonical utterances using a grammar and having crowdworkers paraphrase them into natural wording. A limitation of this approach is that it biases crowdworkers toward language similar to the canonical utterances. In this work, we present a methodology that elicits meaningful and lexically diverse queries from users for semantic parsing tasks. Starting from a seed lexicon and a generative grammar, we pair logical forms with mixed text-image representations and ask crowdworkers to paraphrase and confirm the plausibility of the queries that they generated. We use this method to build a semantic parsing dataset from scratch for a dialogue agent in a smart-home simulation. We find evidence that this dataset, which we have named SmartHome, is demonstrably more lexically diverse and difficult to parse than existing domain-specific semantic parsing datasets.
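To make the first step of the pipeline concrete, the following is a minimal sketch of how a seed lexicon and a generative grammar could enumerate logical forms. It is not the authors' implementation: the lexicon entries, the single grammar rule, the logical-form syntax, and the name `generate_logical_forms` are all assumptions chosen for illustration.

```python
from itertools import product

# Hypothetical seed lexicon. These actions, devices, and rooms are
# illustrative assumptions, not entries from the SmartHome dataset.
SEED_LEXICON = {
    "action": ["turn_on", "turn_off"],
    "device": ["light", "fan"],
    "room": ["kitchen", "bedroom"],
}


def generate_logical_forms(lexicon):
    """Enumerate every logical form licensed by one toy grammar rule.

    The assumed rule is: Command -> action(device, room).
    """
    forms = []
    for action, device, room in product(
        lexicon["action"], lexicon["device"], lexicon["room"]
    ):
        forms.append(f"{action}(device={device}, room={room})")
    return forms


if __name__ == "__main__":
    for form in generate_logical_forms(SEED_LEXICON):
        print(form)
```

In the methodology the abstract describes, each generated logical form would then be shown to crowdworkers as a mixed text-image representation rather than as a canonical sentence, which is the step intended to reduce lexical bias. That rendering step is not sketched here.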
