An important application of sentiment analysis is determining the sentiment orientation that users express in product reviews, which are generally short texts. Traditional methods typically use a bag-of-words model to extract shallow word-level features for sentiment analysis. Models trained on these simple features, however, perform poorly on short texts, especially those with complex syntax. By using a deep recursive neural network to capture sentence-level semantic information, and by introducing a Chinese sentiment treebank as training data to learn word-level sentiment information, a relatively high accuracy is achieved on five-class sentiment classification of short texts. To address the training-time cost of such a complex model on large-scale data, the model is parallelized on Spark, which improves its scalability and time efficiency.
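As a rough illustration of the recursive composition described in the abstract, the following minimal sketch runs a forward pass of a single-layer recursive neural network over a binary parse tree and produces a five-class distribution at every node. It is not the authors' model: the embedding size `DIM`, the random initialization, the tuple-based tree encoding, and the example tokens are all assumptions made for illustration, and training and the Spark parallelization are omitted.

```python
import math
import random

DIM = 4          # hypothetical embedding size (assumption)
NUM_CLASSES = 5  # five sentiment classes, as stated in the abstract

rng = random.Random(0)

def rand_matrix(rows, cols):
    """Small random weights; a trained model would learn these instead."""
    return [[rng.uniform(-0.1, 0.1) for _ in range(cols)] for _ in range(rows)]

W = rand_matrix(DIM, 2 * DIM)        # composition weights for [left; right]
b = [0.0] * DIM                      # composition bias
Ws = rand_matrix(NUM_CLASSES, DIM)   # per-node sentiment classifier weights
bs = [0.0] * NUM_CLASSES             # classifier bias
embeddings = {}                      # word -> vector, created lazily

def matvec(m, v):
    return [sum(a * x for a, x in zip(row, v)) for row in m]

def softmax(z):
    mx = max(z)  # subtract the max for numerical stability
    exps = [math.exp(x - mx) for x in z]
    total = sum(exps)
    return [e / total for e in exps]

def embed(word):
    if word not in embeddings:
        embeddings[word] = [rng.uniform(-0.1, 0.1) for _ in range(DIM)]
    return embeddings[word]

def forward(tree):
    """Return (node vector, per-node class distributions in post-order).

    A tree is either a word (str) or a pair (left, right) of subtrees.
    """
    if isinstance(tree, str):
        vec = embed(tree)
        preds = []
    else:
        left, right = tree
        lv, lp = forward(left)
        rv, rp = forward(right)
        # Parent vector: tanh(W [left; right] + b)
        hidden = matvec(W, lv + rv)
        vec = [math.tanh(h + bi) for h, bi in zip(hidden, b)]
        preds = lp + rp
    scores = matvec(Ws, vec)
    probs = softmax([s + bi for s, bi in zip(scores, bs)])
    return vec, preds + [probs]

# Hypothetical parsed review: ((not good) phone)
tree = (("not", "good"), "phone")
root_vec, node_probs = forward(tree)
root_label = max(range(NUM_CLASSES), key=lambda k: node_probs[-1][k])
```

Because every node receives its own prediction, a treebank with phrase-level sentiment labels can supervise all nodes and not only the root. Each tree's forward pass is also independent of the others, which is the property that makes data-parallel training feasible in principle.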