SentiALG: Automated Corpus Annotation for Algerian Sentiment Analysis

Abstract

Data annotation is an important but time-consuming and costly procedure. To sort texts into two classes, the first requirement is a good annotation guideline that establishes what qualifies a text for each class. In the literature, the difficulties associated with appropriate data annotation have been underestimated. In this paper, we present a novel approach to automatically construct an annotated sentiment corpus for the Algerian dialect (a Maghrebi Arabic dialect). The construction of this corpus is based on an Algerian sentiment lexicon that is also constructed automatically. The presented work deals with the two scripts widely used on Arabic social media: Arabic and Arabizi. The proposed approach automatically constructs a sentiment corpus containing 8000 messages (4000 in Arabic and 4000 in Arabizi). The achieved F1-score reaches up to 72% and 78% on the Arabic and Arabizi test sets, respectively. Ongoing work aims to integrate a transliteration process for Arabizi messages to further improve the obtained results.
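
As a rough illustration of the general idea of lexicon-based automatic annotation, the sketch below labels a message by summing the polarity scores of its words. It is not the authors' implementation: the abstract does not describe the scoring rule, so the toy lexicon, the whitespace tokenization, the zero threshold, and the names `TOY_LEXICON`, `annotate`, and `build_corpus` are all assumptions made for this example. A real lexicon would hold Algerian words in Arabic script or Arabizi rather than English placeholders.

```python
from typing import Dict, List, Optional, Tuple

# Hypothetical toy lexicon (word -> polarity score). The entries are English
# placeholders chosen for illustration only; they are not taken from SentiALG.
TOY_LEXICON: Dict[str, float] = {
    "good": 1.0,
    "great": 1.0,
    "bad": -1.0,
    "awful": -1.0,
}


def annotate(message: str, lexicon: Dict[str, float] = TOY_LEXICON) -> Optional[str]:
    """Label a message by the sum of its word scores (assumed scoring rule).

    Returns "positive", "negative", or None when the score is exactly zero,
    i.e. when the lexicon gives no usable signal for this message.
    """
    # Naive tokenization: lowercase and split on whitespace (an assumption).
    score = sum(lexicon.get(token, 0.0) for token in message.lower().split())
    if score > 0:
        return "positive"
    if score < 0:
        return "negative"
    return None


def build_corpus(
    messages: List[str], lexicon: Dict[str, float] = TOY_LEXICON
) -> List[Tuple[str, str]]:
    """Keep only the messages that receive a label, paired with that label."""
    corpus = []
    for message in messages:
        label = annotate(message, lexicon)
        if label is not None:
            corpus.append((message, label))
    return corpus
```

Calling `build_corpus` on a list of raw messages returns only the messages that the lexicon could label. Discarding zero-score messages is one possible design choice for keeping an automatically built two-class corpus cleaner; the source does not say how such messages are handled.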

Bibliographic details

Source
International Conference on Brain-Inspired Cognitive Systems | 2018 | pp. 557-567 | 11 pages
Authors
Imane Guellil; Ahsan Adeel; Faical Azouaou; Amir Hussain
Original format
PDF
Keywords
Arabic sentiment analysis; Algerian dialect; Sentiment lexicon; Sentiment corpus; Sentiment classification

Similar literature

1. Sentiment and Behaviour Annotation in a Corpus of Dialogue Summaries [J]. Norton Trevisan Roman, Paul Piwek, Ariadne Maria Brito Rizzoni Carvalho. Journal of Universal Computer Science. 2015, No. 4
2. Evaluating and automating the annotation of a learner corpus [J]. Alexandr Rosen, Jirka Hana, Barbora Stindlova. Language Resources and Evaluation. 2014, No. 1
3. Corpus Analysis and Annotation for Helpful Sentences in Product Reviews [J]. Hana Almagrabi, Areej Malibari, John McNaught. Computer and Information Science. 2018, No. 2
4. An Automated Corpus Annotation Experiment in Brazilian Portuguese for Sentiment Analysis in Public Security [C]. Victor Diogho Heuer de Carvalho, Thyago Celso Cavalcante Nepomuceno, Ana Paula Cabral Seixas Costa. International Conference on Decision Support Systems Technology. 2020
5. Sentiments, Networks, Literary Biography: Towards a Mesoanalysis of Cicero's Corpus [D]. Marley, Caitlin A. 2018
6. A versatile framework for resource-limited sentiment articulation annotation and analysis of short texts [O]. Vuk Batanović, Miloš Cvetanović, Boško Nikolić. 2020
7. BCSAT: A Benchmark Corpus for Sentiment Analysis in Telugu Using Word-level Annotations [O]. Sreekavitha Parupalli, Vijjini Anvesh Rao, Radhika Mamidi. 2018
