Conference: International Conference on Computational Linguistics

Exploring Amharic Sentiment Analysis from Social Media Texts: Building Annotation Tools and Classification Models

Abstract

This paper presents a study of sentiment analysis for Amharic social media texts. As the number of social media users is ever-increasing, social media platforms would like to understand the latent meaning and sentiments of a text to enhance decision-making procedures. However, low-resource languages such as Amharic have received less attention for several reasons, such as the lack of well-annotated datasets, the unavailability of computing resources, and few or no expert researchers in the area. This research addresses three main research questions. We first explore the suitability of existing tools for the sentiment analysis task. Annotation tools that support large-scale annotation tasks in Amharic are scarce. Also, the existing crowdsourcing platforms do not support Amharic text annotation. Hence, we build a social-network-friendly annotation tool called 'ASAB' using a Telegram bot. We collect 9.4k tweets, where each tweet is annotated by three Telegram users. Moreover, we explore the suitability of machine learning approaches for Amharic sentiment analysis. The FLAIR deep learning text classifier, based on network embeddings that are computed from a distributional thesaurus, outperforms other supervised classifiers. We further investigate the challenges in building a sentiment analysis system for Amharic and find that the widespread use of sarcasm and figurative speech is the main issue in dealing with the problem. To advance sentiment analysis research in Amharic and other related low-resource languages, we release the dataset, the annotation tool, source code, and models publicly under a permissive license.
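
The abstract states that each tweet receives three annotations but does not say how they are combined into one label. As a minimal sketch only, assuming a strict majority vote (an assumption, not the authors' documented procedure), the aggregation could look like the following. The function name `aggregate_label` and the label strings are hypothetical.

```python
from collections import Counter
from typing import Optional, Sequence


def aggregate_label(annotations: Sequence[str]) -> Optional[str]:
    """Return the label chosen by a strict majority of annotators.

    Assumption: majority voting. The source abstract does not specify
    the aggregation rule, so this is illustrative only.
    Returns None when there are no annotations or no strict majority.
    """
    if not annotations:
        return None
    # most_common(1) yields the single most frequent label and its count.
    label, count = Counter(annotations).most_common(1)[0]
    # Require more than half of the votes to accept the label.
    return label if count > len(annotations) / 2 else None
```

With three annotators, a tweet labeled `positive`, `positive`, `negative` would resolve to `positive`, while three different labels would yield `None` and could be routed to further review.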