Conference: International Conference on Web Information Systems and Technologies
SATALex: Telecom Domain-specific Sentiment Lexicons for Egyptian and Gulf Arabic Dialects

Abstract

Given the scarcity of Arabic sentiment lexicons, especially for the Egyptian and Gulf dialects, and the fact that a word's sentiment depends largely on the domain in which it is used, we present SATALex, a two-part sentiment lexicon covering the telecom domain for the Egyptian and Gulf Arabic dialects. The Egyptian sentiment lexicon contains close to 1.5 thousand Egyptian words and compound phrases, while the Gulf sentiment lexicon contains close to 3.5 thousand Gulf words and compound phrases. The lexicons were developed iteratively: in each iteration, manual annotators analyzed tweets in the corresponding dialect to extract as many domain-specific words as possible and measured their effect on classification performance. The result is a pair of lexicons that are more focused on and relevant to the telecom domain than any translated or general-purpose sentiment lexicon. To demonstrate the effectiveness of these lexicons and how directly they can affect sentiment analysis, we compared their performance to that of one of the largest publicly available sentiment lexicons (WeightedNileULex) using the Semantic Orientation (SO) approach on telecom test datasets, one for each dialect. The experiments show that using the SATALex lexicons improved the results over the publicly available lexicon.
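As a rough illustration of how a lexicon-based Semantic Orientation classifier of this kind might work, the sketch below sums the polarity weights of lexicon entries found in a text and labels the text by the sign of the total. It is a minimal sketch under stated assumptions, not the authors' implementation: the toy English entries, their weights, the longest-match phrase lookup, and the names `TOY_LEXICON`, `so_score`, and `classify` are all hypothetical and do not come from SATALex or WeightedNileULex.

```python
from typing import Dict

# Toy lexicon: entries and weights are invented for illustration only;
# they are NOT taken from SATALex or any published lexicon.
TOY_LEXICON: Dict[str, float] = {
    "slow": -1.0,
    "fast": 1.0,
    "no signal": -1.0,  # compound phrase
    "thank you": 1.0,   # compound phrase
}


def so_score(text: str, lexicon: Dict[str, float], max_len: int = 3) -> float:
    """Sum the weights of lexicon entries found in the text.

    At each position the longest matching phrase (up to max_len tokens)
    is preferred, so compound phrases take priority over single words.
    """
    tokens = text.lower().split()
    score = 0.0
    i = 0
    while i < len(tokens):
        matched = False
        for n in range(min(max_len, len(tokens) - i), 0, -1):
            phrase = " ".join(tokens[i:i + n])
            if phrase in lexicon:
                score += lexicon[phrase]
                i += n  # skip past the matched phrase
                matched = True
                break
        if not matched:
            i += 1
    return score


def classify(text: str, lexicon: Dict[str, float]) -> str:
    """Label a text by the sign of its semantic orientation score."""
    score = so_score(text, lexicon)
    if score > 0:
        return "positive"
    if score < 0:
        return "negative"
    return "neutral"
```

With this toy lexicon, `classify("the network is slow and there is no signal", TOY_LEXICON)` sums two negative entries and returns `"negative"`. A real system for dialectal Arabic would presumably also need tokenization and normalization suited to the dialect, which this sketch omits.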