Conference on Empirical Methods in Natural Language Processing

Embedding Individual Table Columns for Resilient SQL Chatbots

Abstract

Most of the world's data is stored in relational databases. Accessing these requires specialized knowledge of the Structured Query Language (SQL), putting them out of the reach of many people. A recent research thread in Natural Language Processing (NLP) aims to alleviate this problem by automatically translating natural language questions into SQL queries. While the proposed solutions are a great start, they lack robustness and do not easily generalize: the methods require high-quality descriptions of the database table columns, and the most widely used training dataset, WikiSQL, is heavily biased towards using those descriptions as part of the questions. In this work, we propose solutions to both problems: we entirely eliminate the need for column descriptions by relying solely on column contents, and we augment the WikiSQL dataset by paraphrasing column names to reduce bias. We show that the accuracy of existing methods drops when trained on our augmented, column-agnostic dataset, and that our own method reaches state-of-the-art accuracy while relying on column contents only.