Journal: Expert Systems with Applications

Sentiment analysis based on improved pre-trained word embeddings

Abstract

Sentiment analysis is a fast-growing area of research in natural language processing (NLP) and text classification. It has become an essential part of a wide range of applications, including politics, business, advertising and marketing. There are various techniques for sentiment analysis, but recently word embedding methods have been widely used in sentiment classification tasks. Word2Vec and GloVe are currently among the most accurate and usable word embedding methods, which convert words into meaningful vectors. However, these methods ignore the sentiment information of texts and need a large text corpus for training and generating accurate vectors. As a result, because some corpora are small, researchers often have to use pre-trained word embeddings that were trained on other large text corpora, such as Google News with about 100 billion words. Increasing the accuracy of pre-trained word embeddings therefore has a great impact on sentiment analysis research. In this paper, we propose a novel method, Improved Word Vectors (IWV), which increases the accuracy of pre-trained word embeddings in sentiment analysis. Our method is based on Part-of-Speech (POS) tagging techniques, lexicon-based approaches, a word position algorithm and Word2Vec/GloVe methods. We tested the accuracy of our method with different deep learning models and benchmark sentiment datasets. Our experimental results show that Improved Word Vectors (IWV) are very effective for sentiment analysis. (C) 2018 Published by Elsevier Ltd.
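
The abstract names the ingredients of IWV (a pre-trained vector, POS tags, a sentiment lexicon and word position) but does not say how they are combined. The sketch below is only an illustration under one simple assumption: the features are concatenated onto the pre-trained vector. The embedding table, tag set, lexicon scores and the `improved_vector` helper are all hypothetical toy stand-ins, not taken from the paper.

```python
import numpy as np

# Hypothetical toy stand-ins for real resources (all values are made up):
# a pre-trained embedding table, a POS tag set, and a sentiment lexicon.
PRETRAINED = {
    "good": np.array([0.2, 0.7, -0.1]),
    "movie": np.array([0.5, -0.3, 0.4]),
}
POS_TAGS = ["NOUN", "VERB", "ADJ", "ADV", "OTHER"]
LEXICON = {"good": 0.8, "bad": -0.8}  # polarity scores in [-1, 1]


def improved_vector(word, pos_tag, position, sentence_length):
    """Concatenate a pre-trained vector with POS, lexicon and position features.

    Concatenation is an assumption made for this sketch; the abstract
    does not specify the combination rule.
    """
    dim = len(next(iter(PRETRAINED.values())))
    # Out-of-vocabulary words get a zero vector.
    base = PRETRAINED.get(word, np.zeros(dim))

    # One-hot POS encoding; unknown tags fall back to "OTHER".
    pos = np.zeros(len(POS_TAGS))
    tag = pos_tag if pos_tag in POS_TAGS else "OTHER"
    pos[POS_TAGS.index(tag)] = 1.0

    # Lexicon polarity; 0.0 for words missing from the lexicon.
    polarity = np.array([LEXICON.get(word, 0.0)])

    # Relative position of the word in its sentence, in [0, 1].
    rel_pos = np.array([position / max(sentence_length - 1, 1)])

    return np.concatenate([base, pos, polarity, rel_pos])


tokens = [("good", "ADJ"), ("movie", "NOUN")]
vectors = [improved_vector(w, t, i, len(tokens)) for i, (w, t) in enumerate(tokens)]
print(vectors[0].shape)  # 3 + 5 + 1 + 1 = 10 dimensions
```

In practice the toy table would be replaced by real Word2Vec or GloVe vectors, and the tags and polarity scores by the output of a POS tagger and a published sentiment lexicon; the enriched vectors would then be fed to a downstream classifier.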
