
Optimized Transformer Models for FAQ Answering

Abstract

Informational chatbots provide a highly effective medium for improving operational efficiency in answering customer queries for any enterprise. Chatbots are also preferred by users and customers since, unlike alternatives such as calling customer care or browsing FAQ pages, they provide instant responses, are easy to use, are less invasive, and are always available. In this paper, we discuss the problem of FAQ answering, which is central to designing a retrieval-based informational chatbot. Given a set S of FAQ pages for an enterprise and a user query, we need to find the best matching question-answer pairs from S. Building such a semantic ranking system that works well across domains for large QA databases, with low runtime and small model size, is challenging. Previous work based on feature engineering or recurrent neural models either provides low accuracy or incurs high runtime costs. We experiment with multiple transformer-based deep learning models and also propose a novel MT-DNN (Multi-Task Deep Neural Network)-based architecture, which we call Masked MT-DNN (MMT-DNN). MMT-DNN significantly outperforms other state-of-the-art transformer models on the FAQ answering task. Further, we propose an improved knowledge distillation component that achieves a ~2.4x reduction in model size and a ~7x reduction in runtime while maintaining similar accuracy. On a small benchmark dataset from SemEval 2017 CQA Task 3, our approach achieves an NDCG@1 of 83.1. On another large dataset of ~281K instances corresponding to ~30K queries from diverse domains, our distilled 174 MB model achieves an NDCG@1 of 75.08 with a CPU runtime of just 31 ms, establishing a new state of the art for FAQ answering.
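The abstract reports results as NDCG@1, the normalized discounted cumulative gain of the top-ranked QA pair. As a point of reference, here is a minimal sketch of how NDCG@k is computed, assuming binary relevance labels for the ranked candidates; the function names are illustrative, not from the paper:

```python
import math

def dcg_at_k(relevances, k):
    """Discounted cumulative gain over the top-k ranked items."""
    return sum(rel / math.log2(i + 2) for i, rel in enumerate(relevances[:k]))

def ndcg_at_k(ranked_relevances, k=1):
    """NDCG@k: DCG of the system's ranking divided by the ideal DCG."""
    ideal = sorted(ranked_relevances, reverse=True)
    ideal_dcg = dcg_at_k(ideal, k)
    if ideal_dcg == 0:
        return 0.0  # no relevant candidates for this query
    return dcg_at_k(ranked_relevances, k) / ideal_dcg

# At k=1, a query scores 1.0 only if the top-ranked QA pair is relevant.
print(ndcg_at_k([1, 0, 0], k=1))  # 1.0
print(ndcg_at_k([0, 1, 0], k=1))  # 0.0
```

The corpus-level NDCG@1 figures quoted above (e.g. 75.08, on a 0-100 scale) are averages of this per-query score over all test queries.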
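The abstract does not detail the improved knowledge distillation component, but the standard recipe it builds on trains the small student model to match the temperature-softened output distribution of the large teacher. A minimal sketch of that generic (Hinton-style) distillation loss, assuming raw classification logits from both models; all names here are illustrative:

```python
import math

def softmax(logits, temperature=1.0):
    """Temperature-scaled softmax; a higher T softens the distribution."""
    scaled = [z / temperature for z in logits]
    m = max(scaled)  # subtract the max for numerical stability
    exps = [math.exp(z - m) for z in scaled]
    total = sum(exps)
    return [e / total for e in exps]

def distillation_loss(student_logits, teacher_logits, temperature=2.0):
    """KL(teacher || student) on temperature-softened distributions,
    scaled by T^2 so gradients keep a consistent magnitude across T."""
    p = softmax(teacher_logits, temperature)
    q = softmax(student_logits, temperature)
    kl = sum(pi * math.log(pi / qi) for pi, qi in zip(p, q) if pi > 0)
    return temperature ** 2 * kl

# A student that exactly matches the teacher incurs zero loss.
print(distillation_loss([2.0, 0.5], [2.0, 0.5]))  # 0.0
```

In practice this term is combined with the ordinary cross-entropy loss on the gold labels; the paper's ~2.4x size and ~7x runtime reductions come from its improved variant of this component, not shown here.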

