Conference: International Conference on Computational Processing of Portuguese

Annotation of a Corpus of Tweets for Sentiment Analysis

Abstract

This article describes the creation and annotation of a corpus of tweets for sentence-level Sentiment Analysis. The tweets were collected under the #masterchefbr hashtag with a tool that captures the public stream of tweets in real time, and were then annotated with the six basic emotions commonly used in the literature (joy, surprise, fear, sadness, disgust, anger). A neutral tag was used for sentences that express no emotion. At the end of the process, inter-annotator agreement reached a Kappa value of 0.42. To better understand this Kappa value, the annotated corpus was submitted to classification experiments with the SVM (Support Vector Machine) algorithm. The classifier reached an accuracy of 52.9% when both the concordant and the discordant texts in the corpus were used.
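As an illustration of how a Kappa value of this kind can be computed, the sketch below implements Cohen's kappa for two annotators. The abstract does not say which Kappa variant was used or how many annotators took part, so the two-annotator setup, the function name `cohen_kappa`, and the toy labels are all assumptions made for this example, not data from the corpus.

```python
from collections import Counter


def cohen_kappa(labels_a, labels_b):
    """Cohen's kappa for two annotators who labelled the same items.

    Assumption: exactly two annotators; the paper may have used a
    different Kappa variant (e.g. one for more than two annotators).
    """
    if len(labels_a) != len(labels_b) or not labels_a:
        raise ValueError("both annotators must label the same non-empty item list")

    n = len(labels_a)

    # Observed agreement: fraction of items that received identical labels.
    observed = sum(a == b for a, b in zip(labels_a, labels_b)) / n

    # Chance agreement: for each label, the product of the two annotators'
    # marginal frequencies, summed over all labels.
    freq_a = Counter(labels_a)
    freq_b = Counter(labels_b)
    expected = sum(freq_a[label] * freq_b[label] for label in freq_a) / (n * n)

    if expected == 1.0:
        # Both annotators used a single identical label for every item.
        return 1.0
    return (observed - expected) / (1.0 - expected)


# Hypothetical toy annotations (not taken from the corpus).
annotator_1 = ["joy", "joy", "anger", "neutral", "sadness", "neutral", "joy", "disgust"]
annotator_2 = ["joy", "neutral", "anger", "neutral", "sadness", "joy", "joy", "anger"]

kappa = cohen_kappa(annotator_1, annotator_2)
print(f"kappa = {kappa:.2f}")
```

Kappa corrects raw agreement for the agreement expected by chance, which is why a value such as 0.42 is lower than the plain fraction of matching labels would be.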
