Procedia Computer Science

Sentiment lexicon for sentiment analysis of Saudi dialect tweets



Abstract

Twitter is one of the most widely used social media platforms in Saudi Arabia and is a rich source for mining the public’s attitude towards political, social, and economic matters. Sentiment analysis is a technique for identifying the polarity (positive, negative, or neutral) of a given tweet, using either machine learning approaches or sentiment lexicons. This paper presents two resources. The first is the Saudi dialect sentiment lexicon (SauDiSenti), a sentiment lexicon for sentiment analysis of Saudi dialect tweets. SauDiSenti comprises 4431 words and phrases from modern standard Arabic (MSA) and Saudi dialects, manually extracted from a previously labelled dataset of tweets obtained from trending hashtags in Saudi Arabia. The second is a testing dataset comprising 1500 tweets evenly distributed over three classes: positive, negative, and neutral. To evaluate the performance of SauDiSenti, we used precision, recall, and F-measure and compared it to AraSenTi, a larger Arabic sentiment dictionary. The data suggest that AraSenTi outperforms SauDiSenti only when the evaluation is restricted to positive and negative tweets, whereas SauDiSenti outperforms AraSenTi when positive, negative, and neutral tweets are all considered. Despite its small size, SauDiSenti shows promising results for sentiment analysis of Saudi dialect tweets in comparison to the larger, automatically constructed dictionary AraSenTi. SauDiSenti and the testing dataset are available for download at http://corpus.kacst.edu.sa/more_info.jsp.
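The abstract does not say how lexicon matches are turned into a polarity label, so the following is only a minimal sketch of one common approach: count the positive and negative lexicon entries found in a tweet, label the tweet by the majority, and fall back to neutral when there is no match or a tie. The lexicon entries, the `classify` and `evaluate` helpers, and the whitespace tokenization are all illustrative assumptions and are not taken from SauDiSenti or from the paper. The per-class precision, recall, and F-measure use the standard definitions.

```python
from collections import Counter

# Hypothetical toy lexicon for illustration only; these entries are
# placeholders and are NOT taken from SauDiSenti.
LEXICON = {
    "جميل": "positive",      # "beautiful"
    "ممتاز": "positive",     # "excellent"
    "سيء": "negative",       # "bad"
    "خيبة أمل": "negative",  # "disappointment" (two-word phrase)
}


def classify(tweet, lexicon=LEXICON, max_len=2):
    """Label a tweet by majority vote over matched lexicon entries.

    Assumption: tokens are whitespace-separated and entries are matched
    as exact word n-grams of up to `max_len` tokens.
    """
    tokens = tweet.split()
    counts = Counter()
    for n in range(1, max_len + 1):
        for i in range(len(tokens) - n + 1):
            label = lexicon.get(" ".join(tokens[i:i + n]))
            if label:
                counts[label] += 1
    if counts["positive"] > counts["negative"]:
        return "positive"
    if counts["negative"] > counts["positive"]:
        return "negative"
    return "neutral"  # no match, or a tie


def evaluate(gold, predicted, labels=("positive", "negative", "neutral")):
    """Return {label: (precision, recall, f_measure)} for each class."""
    pairs = list(zip(gold, predicted))
    scores = {}
    for label in labels:
        tp = sum(g == label and p == label for g, p in pairs)
        fp = sum(g != label and p == label for g, p in pairs)
        fn = sum(g == label and p != label for g, p in pairs)
        precision = tp / (tp + fp) if tp + fp else 0.0
        recall = tp / (tp + fn) if tp + fn else 0.0
        total = precision + recall
        f_measure = 2 * precision * recall / total if total else 0.0
        scores[label] = (precision, recall, f_measure)
    return scores


if __name__ == "__main__":
    tweets = ["الجو جميل اليوم", "هذا خيبة أمل", "اجتماع الساعة خمسة"]
    gold = ["positive", "negative", "neutral"]
    predicted = [classify(t) for t in tweets]
    for label, (p, r, f) in evaluate(gold, predicted).items():
        print(f"{label}: precision={p:.2f} recall={r:.2f} F={f:.2f}")
```

Restricting `labels` to `("positive", "negative")` and dropping the neutral tweets from the evaluation data would correspond to the two-class comparison mentioned in the abstract, while the default three labels correspond to the three-class comparison.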
