Conference on Empirical Methods in Natural Language Processing

A Multi-lingual Annotated Dataset for Aspect-Oriented Opinion Mining



Abstract

We present the Trip-MAML dataset, a Multi-Lingual dataset of hotel reviews that have been manually annotated at the sentence level with Multi-Aspect sentiment labels. The dataset extends an existing English-only dataset with documents written in Italian and Spanish. We detail the dataset construction process, covering data gathering, selection, and annotation. We report inter-annotator agreement figures and baseline experimental results, comparing the three languages. Trip-MAML is a multi-lingual dataset for aspect-oriented opinion mining that enables researchers (ⅰ) to tackle the problem in languages other than English and (ⅱ) to experiment with applying cross-lingual learning methods to the task.
