Journal: Applied Intelligence: The International Journal of Artificial Intelligence, Neural Networks, and Complex Problem-Solving Technologies

EnSWF: effective features extraction and selection in conjunction with ensemble learning methods for document sentiment classification

Abstract

With the rise of Web 2.0, a huge amount of unstructured data is generated on a regular basis in the form of comments, opinions, and similar user content. This unstructured data contains useful information and can play a significant role in business decision making. In this context, sentiment analysis (SA) is an active research area that has recently attracted the attention of the research community. The aim of SA is to classify user-generated content into positive and negative classes. State-of-the-art techniques for sentiment classification rely on traditional bag-of-words approaches. Such approaches are advantageous in terms of simplicity, but they completely ignore semantic aspects and the order between words, and they also lead to the curse of dimensionality. Researchers have also proposed semantic-based SA techniques that account for word order by employing high-order n-grams, part-of-speech (POS) patterns, and dependency relation features. But can every word or phrase among the high-order n-grams, POS patterns, or dependency relation features represent a sentiment clue? And if all of them are incorporated, what happens to the dimensionality? To investigate and tackle these issues, in this paper we propose EnSWF, a novel POS and n-gram based ensemble method for SA that considers semantics, sentiment clues, and the order between words. EnSWF is a four-phase process. Our main contributions are four-fold: (a) Appropriate Feature Extraction: we investigate and validate the extraction of various appropriate features for sentiment classification. (b) Dimensionality Reduction: we decrease the dimensionality of the feature space by selecting the subset of the most meaningful and effective features. (c) Ensemble Model: we propose an ensemble learning method for both filter-based feature selection and classification, using a simple majority voting technique. (d) Practicality: we validate our claims by applying our model to benchmark datasets. We also show that EnSWF outperforms existing techniques in terms of class
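
The two ideas the abstract names, word-order-preserving n-gram features and majority voting over several filter-based feature selectors, can be illustrated with a minimal sketch. It is not the authors' implementation. The three filter scores (`chi_square`, `df_gap`, `rate_gap`), the toy corpus, and all function names are assumptions chosen for illustration, and POS patterns are omitted because they would require a tagger.

```python
from collections import Counter

def ngrams(tokens, n_max=2):
    """Word n-grams of length 1..n_max, so word order is partly preserved."""
    return [" ".join(tokens[i:i + n])
            for n in range(1, n_max + 1)
            for i in range(len(tokens) - n + 1)]

def contingency(feature, docs, labels):
    """Counts (a, b, c, d): feature present in pos/neg docs, absent in pos/neg docs."""
    a = sum(1 for doc, y in zip(docs, labels) if feature in doc and y == 1)
    b = sum(1 for doc, y in zip(docs, labels) if feature in doc and y == 0)
    c = sum(1 for y in labels if y == 1) - a
    d = sum(1 for y in labels if y == 0) - b
    return a, b, c, d

def chi_square(feature, docs, labels):
    """Chi-square statistic of the 2x2 feature/class table."""
    a, b, c, d = contingency(feature, docs, labels)
    denom = (a + b) * (c + d) * (a + c) * (b + d)
    return (a + b + c + d) * (a * d - b * c) ** 2 / denom if denom else 0.0

def df_gap(feature, docs, labels):
    """Absolute difference in document frequency between the two classes."""
    a, b, _, _ = contingency(feature, docs, labels)
    return abs(a - b)

def rate_gap(feature, docs, labels):
    """Absolute difference of smoothed per-class presence rates."""
    a, b, c, d = contingency(feature, docs, labels)
    return abs((a + 1) / (a + c + 2) - (b + 1) / (b + d + 2))

def top_k(score, vocab, docs, labels, k):
    """The k best features under one filter; ties broken alphabetically."""
    ranked = sorted(vocab, key=lambda f: (-score(f, docs, labels), f))
    return set(ranked[:k])

def ensemble_select(docs, labels, k, filters=(chi_square, df_gap, rate_gap)):
    """Keep only features that a strict majority of the filters rank in their top k."""
    vocab = sorted(set().union(*docs))
    votes = Counter(f for flt in filters
                    for f in top_k(flt, vocab, docs, labels, k))
    return {f for f, v in votes.items() if v > len(filters) / 2}

def majority_vote(predictions):
    """Simple majority voting over the labels predicted by several classifiers."""
    return Counter(predictions).most_common(1)[0][0]

# Hypothetical toy corpus: 1 = positive, 0 = negative.
texts = ["good movie", "not good movie", "great plot", "not great plot"]
labels = [1, 0, 1, 0]
docs = [set(ngrams(t.split())) for t in texts]
selected = ensemble_select(docs, labels, k=3)
```

On this toy corpus the unigram "good" appears equally often in both classes, while the bigram "not good" appears only in negative documents. That is the kind of sentiment clue a plain bag-of-words representation misses. The same `majority_vote` helper could combine the labels predicted by several classifiers trained on the selected features.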
