
Combine HowNet lexicon to train phrase recursive autoencoder for sentence-level sentiment analysis



Abstract

Detecting the sentiment of sentences in online reviews is still a challenging task. Traditional machine learning methods often use bag-of-words representations, which cannot properly capture complex linguistic phenomena in sentiment analysis. Recently, recursive autoencoder (RAE) methods have been proposed for sentence-level sentiment analysis. They use word embeddings to represent each word, and learn compositional vector representations of phrases and sentences with recursive autoencoders. Although RAE methods outperform other state-of-the-art sentiment prediction approaches on commonly used datasets, they tend to generate very deep parse trees, and need a large amount of labeled data for each node while learning compositional vector representations. Furthermore, RAE methods mainly combine adjacent words in sequence with a greedy strategy, which makes it difficult to capture semantic relations between distant words. To solve these issues, we propose a semi-supervised method which combines the HowNet lexicon to train phrase recursive autoencoders (we call it CHL-PRAE). CHL-PRAE first constructs the phrase recursive autoencoder (PRAE) model. Then, when training the softmax classifier of PRAE, the model calculates the sentiment orientation of each node with the HowNet lexicon, and these orientations act as sentiment labels. Furthermore, our CHL-PRAE model conducts bidirectional training to capture global information. Compared with RAE and some supervised methods such as support vector machine (SVM) and naive Bayes on English and Chinese datasets, the experimental results show that CHL-PRAE can provide the best performance for sentence-level sentiment analysis. (C) 2017 Elsevier B.V. All rights reserved.
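
The abstract says that a sentiment lexicon supplies a label for each node of the phrase tree, so that node-level labels need not be annotated by hand. The sketch below illustrates that idea only. The abstract does not give the paper's scoring rule, so the rule here (sum the polarities of the words a node covers, label by sign) is an assumption. `TOY_LEXICON`, `Node`, `orientation` and `assign_pseudo_labels` are hypothetical names, and the four-word lexicon merely stands in for HowNet.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Optional

# Hypothetical toy lexicon standing in for HowNet sentiment word lists.
TOY_LEXICON: Dict[str, int] = {"good": 1, "great": 1, "bad": -1, "boring": -1}


@dataclass
class Node:
    """A node of a phrase structure tree; leaves carry a word."""
    word: Optional[str] = None
    children: List["Node"] = field(default_factory=list)
    # Pseudo-label: 1 = positive, 0 = negative, None = no lexicon evidence.
    label: Optional[int] = None

    def words(self) -> List[str]:
        """Return the words covered by this node, left to right."""
        if self.word is not None:
            return [self.word]
        return [w for child in self.children for w in child.words()]


def orientation(words: List[str], lexicon: Dict[str, int]) -> int:
    """Sum the lexicon polarities of the given words (assumed scoring rule)."""
    return sum(lexicon.get(w.lower(), 0) for w in words)


def assign_pseudo_labels(node: Node, lexicon: Dict[str, int]) -> None:
    """Label every node by the sign of its summed polarity, recursively."""
    score = orientation(node.words(), lexicon)
    node.label = 1 if score > 0 else 0 if score < 0 else None
    for child in node.children:
        assign_pseudo_labels(child, lexicon)


# Phrase tree for "the plot is boring": ((the plot) (is boring))
tree = Node(children=[
    Node(children=[Node(word="the"), Node(word="plot")]),
    Node(children=[Node(word="is"), Node(word="boring")]),
])
assign_pseudo_labels(tree, TOY_LEXICON)
```

In a semi-supervised setup along these lines, nodes with a non-`None` label could contribute to the classifier loss, while unlabeled nodes would contribute only to the reconstruction loss of the autoencoder. That split is likewise an assumption, not something the abstract states.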

Bibliographic record

  • Source
    Neurocomputing | 2017, Issue 7 | pp. 18-27 | 10 pages
  • Author affiliations

    Shenzhen Univ, Coll Comp Sci & Software Engn, Shenzhen 518060, Guangdong, Peoples R China;

    Shenzhen Univ, Coll Comp Sci & Software Engn, Shenzhen 518060, Guangdong, Peoples R China;

    Shenzhen Univ, Coll Comp Sci & Software Engn, Shenzhen 518060, Guangdong, Peoples R China;

    Shenzhen Univ, Coll Comp Sci & Software Engn, Shenzhen 518060, Guangdong, Peoples R China;

  • Indexed in: Science Citation Index (SCI, USA); Engineering Index (EI, USA)
  • Original format: PDF
  • Text language: English
  • Chinese Library Classification
  • Keywords

    Sentiment analysis; Recursive autoencoder; HowNet lexicon; Phrase structure tree;

