Journal: Computers and Electrical Engineering

A novel sentiment aware dictionary for multi-domain sentiment classification

Abstract

Sentiment analysis is a subarea of Natural Language Processing (NLP) that extracts users' opinions and classifies them according to their polarity. The task has many applications, but it is domain dependent, and annotating corpora in every possible domain of interest before training a classifier is costly. We attempt to solve this problem by creating a sentiment-aware dictionary from multi-domain data. The dictionary is created using labeled data from the source domain and unlabeled data from both the source and target domains. It is then used to classify the unlabeled reviews of the target domain. The work is carried out in Hindi, an official language of India. The number of web pages in Hindi has grown rapidly since the introduction of UTF-8 encoding. Compared with the labeling done by Hindi SentiWordNet (HSWN), a general lexicon for word polarity, the proposed method labels 23-24% more words of the target domain. For the words available in HSWN, the labels assigned by our method match those given by HSWN with 76% accuracy. (C) 2017 Elsevier Ltd. All rights reserved.
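The two comparison figures in the abstract (extra coverage over HSWN and label agreement on shared words) can be illustrated with a minimal sketch. This is not the authors' code: the function names, the transliterated toy words, and their labels are all hypothetical, and reading "23-24% more words" as a difference in vocabulary coverage is an assumption, since the abstract does not define the measure precisely.

```python
def coverage(lexicon, vocabulary):
    """Fraction of vocabulary words to which the lexicon assigns a polarity."""
    if not vocabulary:
        return 0.0
    return sum(1 for word in vocabulary if word in lexicon) / len(vocabulary)


def agreement(lexicon_a, lexicon_b):
    """Share of words present in both lexicons that carry the same label."""
    shared = lexicon_a.keys() & lexicon_b.keys()
    if not shared:
        return 0.0
    matches = sum(1 for word in shared if lexicon_a[word] == lexicon_b[word])
    return matches / len(shared)


# Hypothetical toy data: transliterated placeholder words, made-up labels.
target_vocabulary = {"accha", "bura", "mehnga", "sasta", "tez"}
general_lexicon = {"accha": "pos", "bura": "neg", "tez": "neg"}  # stands in for HSWN
induced_dictionary = {
    "accha": "pos", "bura": "neg", "mehnga": "neg", "sasta": "pos", "tez": "pos",
}  # stands in for the sentiment-aware dictionary

general_cov = coverage(general_lexicon, target_vocabulary)
induced_cov = coverage(induced_dictionary, target_vocabulary)
coverage_gain = induced_cov - general_cov  # assumed reading of "more words labeled"
label_agreement = agreement(induced_dictionary, general_lexicon)

print(f"coverage gain: {coverage_gain:.2f}")
print(f"agreement on shared words: {label_agreement:.2f}")
```

In this toy setup the agreement is computed only over words that both lexicons contain, which mirrors the abstract's phrase "for the available words".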
