Some methods to address the problem of unbalanced sentiment classification in an Arabic context

Abstract

The rise of social media (such as online web forums and social networking sites) has attracted interest in mining and analyzing the opinions available on the web. Online opinion has become an object of study in many research areas, especially the one called “Opinion Mining and Sentiment Analysis”. Several interesting and advanced works have been carried out on a few languages (in particular English). However, there have been very few studies on other languages such as Arabic. This paper presents the study we carried out to address the problem of unbalanced data sets in supervised sentiment classification in an Arabic context. We propose three different methods to under-sample the majority-class documents. Our goal is to compare the effectiveness of the proposed methods with that of common random under-sampling. We also aim to evaluate the behavior of the classifier under different under-sampling rates. We use two common classifiers, namely Naïve Bayes and Support Vector Machines. The experiments are carried out on an Arabic data set that we built from Aljazeera's web site and labeled manually. The results show that Naïve Bayes is sensitive to data set size: the more we reduce the data, the more the results degrade. However, it is not sensitive to unbalanced data sets, unlike Support Vector Machines, which are highly sensitive to them. The results also show that the proposed techniques are reliable and typically competitive with random under-sampling.
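As an illustration of the random under-sampling baseline that the abstract compares against, here is a minimal Python sketch. It is not the authors' implementation: the function name `random_undersample`, the `rate` parameter (assumed here to mean the fraction of majority-class documents that is removed), and the toy data are all hypothetical, and the three proposed methods are not reproduced because the abstract does not describe them.

```python
import random
from collections import Counter


def random_undersample(docs, labels, rate, seed=0):
    """Randomly drop a fraction `rate` of the majority-class documents.

    Assumption: `rate` is the share of majority-class documents removed;
    the paper may define its under-sampling rate differently.
    """
    if not 0.0 <= rate <= 1.0:
        raise ValueError("rate must be between 0 and 1")

    # The majority class is the label with the most documents.
    majority = Counter(labels).most_common(1)[0][0]
    majority_idx = [i for i, y in enumerate(labels) if y == majority]

    # Pick which majority-class documents to discard, reproducibly.
    n_drop = int(round(rate * len(majority_idx)))
    rng = random.Random(seed)
    dropped = set(rng.sample(majority_idx, n_drop))

    # Keep every minority-class document and the surviving majority ones.
    kept = [i for i in range(len(labels)) if i not in dropped]
    return [docs[i] for i in kept], [labels[i] for i in kept]


# Toy unbalanced set: 8 negative documents, 2 positive ones.
docs = [f"doc{i}" for i in range(10)]
labels = ["neg"] * 8 + ["pos"] * 2

sampled_docs, sampled_labels = random_undersample(docs, labels, rate=0.75)
print(Counter(sampled_labels))
```

With `rate=0.75`, six of the eight majority-class documents are removed, which leaves two documents per class. Sweeping `rate` from 0 to a fully balanced set is one way to observe how a classifier responds to different under-sampling rates, as the abstract describes.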

Bibliographic record

Source: 2012 Colloquium in Information Science and Technology, 2012, pp. 43-48 (6 pages)
Conference location: Fez (MA)
Authors: Mountassir Asmaa; Benbrahim Houda; Berrada Ilham
Author affiliation: ALBIRONI Research Team, ENSIAS, Mohamed 5 University, Souissi, Rabat, Morocco
Original format: PDF
Language: English
Chinese Library Classification: Computer applications

