International Joint Conference on Natural Language Processing

Bingo at IJCNLP-2017 Task 4: Augmenting Data using Machine Translation for Cross-linguistic Customer Feedback Classification

Abstract

The ability to process customer feedback automatically and accurately is a necessity in the private sector. Unfortunately, customer feedback can be one of the most difficult types of data to work with, owing to the sheer volume and variety of services, products, languages, and cultures that make up the customer experience. To address this issue, our team built a suite of classifiers trained on a four-language, multi-label corpus released as part of the shared task on "Customer Feedback Analysis" at IJCNLP 2017. In addition to standard text preprocessing, we translated each dataset into each of the other languages to increase the size of the training datasets. We also used word embeddings in our feature engineering step. Ultimately, we trained classifiers using Logistic Regression, Random Forest, and Long Short-Term Memory (LSTM) Recurrent Neural Networks. Overall, we achieved a macro-averaged F1 score (F_β with β = 1) between 48.7% and 56.0% across the four languages and ranked 3/12 for English, 3/7 for Spanish, 1/8 for French, and 2/7 for Japanese.
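The translation-based augmentation and the macro-averaged F1 metric mentioned in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation: the function names, the `translate` callable (a stand-in for whatever machine translation system is used), the data layout, and the example labels are all hypothetical, and the per-label averaging shown here is one common definition of macro F1 for multi-label data that may differ in detail from the shared task's official scorer.

```python
from typing import Callable, Dict, List, Set, Tuple

# Hypothetical data layout: each example is (text, set of labels).
Example = Tuple[str, Set[str]]


def augment_with_translation(
    corpora: Dict[str, List[Example]],
    translate: Callable[[str, str, str], str],
) -> Dict[str, List[Example]]:
    """Add to each language's training set the other languages' examples,
    translated into that language. Labels are carried over unchanged.

    `translate(text, src, tgt)` is an assumed interface, not a real API.
    """
    # Start from a copy of the original data for every language.
    augmented = {lang: list(examples) for lang, examples in corpora.items()}
    for src, examples in corpora.items():
        for tgt in corpora:
            if tgt == src:
                continue
            for text, labels in examples:
                augmented[tgt].append((translate(text, src, tgt), set(labels)))
    return augmented


def macro_f1(gold: List[Set[str]], pred: List[Set[str]]) -> float:
    """Macro-averaged F1: compute F1 per label, then take the unweighted mean."""
    labels = sorted(set().union(*gold, *pred))
    scores = []
    for label in labels:
        tp = sum(1 for g, p in zip(gold, pred) if label in g and label in p)
        fp = sum(1 for g, p in zip(gold, pred) if label not in g and label in p)
        fn = sum(1 for g, p in zip(gold, pred) if label in g and label not in p)
        denom = 2 * tp + fp + fn
        # F1 = 2TP / (2TP + FP + FN); defined as 0 when the label never occurs.
        scores.append(2 * tp / denom if denom else 0.0)
    return sum(scores) / len(scores) if scores else 0.0


if __name__ == "__main__":
    # Placeholder translator that only tags the text; a real system would
    # call a machine translation service here.
    def fake_translate(text: str, src: str, tgt: str) -> str:
        return f"[{src}->{tgt}] {text}"

    corpora = {
        "en": [("great service", {"comment"})],
        "es": [("muy lento", {"complaint"})],
    }
    augmented = augment_with_translation(corpora, fake_translate)
    print(len(augmented["en"]), len(augmented["es"]))
    print(macro_f1([{"a"}, {"b"}], [{"a"}, {"a"}]))
```

With four languages, this scheme gives each language's training set its own examples plus the translated examples from the other three, which is how the abstract describes the increase in training data.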
