Venue: International Workshop on Computational Processing of the Portuguese Language

Development of a Brazilian Portuguese Hotel's Reviews Corpus

Abstract

The voluntary provision of textual information over the Internet, particularly through Web 2.0, has created an opportunity to build large linguistic corpora. These corpora can serve as a fundamental resource for developing natural language applications, especially those that use deep learning techniques requiring large datasets. Sentiment analysis is one type of application that benefits from these resources. This article describes the creation of a corpus intended to support sentiment analysis applications. It consists of reviews of hotels located in the Brazilian capitals and the Federal District, written in Brazilian Portuguese. The reviews that make up the corpus were taken from TripAdvisor and have undergone normalization and POS tagging. The primary goal is to make the corpus available to the community for use in machine learning tasks geared toward natural language.
