Conference: International Conference on Advances in Information Mining and Management

Applying of Sentiment Analysis for Texts in Russian Based on Machine Learning Approach



Abstract

This paper considers the problem of sentiment classification of Russian-language text messages using machine learning methods: the Naive Bayes classifier and the Support Vector Machine. One feature of the Russian language is its wide variety of inflectional endings, which depend on declension, tense, and grammatical gender. Another problem common to sentiment classification across languages is that different words can have the same meaning (synonyms) and thus carry equal emotional value. Our task was therefore to evaluate how lemmatization affects sentiment classification accuracy (that is, classification with word endings versus without them) and to compare the results for Russian and English. To evaluate the impact of synonymy, we grouped words with the same meaning into a single term. To solve these problems, we used lemmatization and synonym libraries. The results showed that lemmatization improves the accuracy of sentiment classification for texts in Russian. In contrast, sentiment classification of texts in English yields better results without lemmatization. The results also showed that accounting for synonymy in the model has a positive influence on accuracy. In the "Introduction", we describe the place of Sentiment Analysis in Data Mining. In "Approaches to the Sentiment Analysis", we discuss the main approaches to Sentiment Analysis: the linguistic approach, the approach based on Machine Learning, and their combination. In "Description of algorithms for Sentiment Analysis", we state the problem of sentiment classification and describe methods for solving it using a Naive Bayesian classifier, Bagging, and the Support Vector Machine. In "Results of experiments", we describe the aims of the experiment and the features of the algorithm's implementation, and report the results of the experiment. In the "Conclusion", we present the conclusions drawn from the results.
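
The preprocessing idea in the abstract (reduce each word to its lemma, then collapse synonyms into a single term before classification) can be illustrated with a minimal sketch. This is not the authors' implementation: the abstract does not name the lemmatization or synonym libraries used, so the `LEMMAS` and `SYNONYMS` tables, the `normalize` helper, the `NaiveBayes` class, and the two-sentence English training set below are all hypothetical stand-ins. The classifier is a plain multinomial Naive Bayes with add-one smoothing, chosen here only for brevity.

```python
import math
from collections import Counter, defaultdict

# Hypothetical toy lookup tables. A real system would call a lemmatizer
# and a synonym library; the abstract does not say which ones were used.
LEMMAS = {"loved": "love", "loves": "love", "hated": "hate", "hates": "hate"}
SYNONYMS = {"adore": "love", "detest": "hate"}


def normalize(tokens, use_lemmas=True, use_synonyms=True):
    """Lowercase each token, map it to its lemma, then to a synonym-group term."""
    out = []
    for tok in tokens:
        tok = tok.lower()
        if use_lemmas:
            tok = LEMMAS.get(tok, tok)
        if use_synonyms:
            tok = SYNONYMS.get(tok, tok)
        out.append(tok)
    return out


class NaiveBayes:
    """Multinomial Naive Bayes with add-one (Laplace) smoothing."""

    def fit(self, docs, labels):
        self.class_counts = Counter(labels)       # documents per class
        self.word_counts = defaultdict(Counter)   # term counts per class
        for tokens, label in zip(docs, labels):
            self.word_counts[label].update(tokens)
        self.vocab = {w for c in self.word_counts.values() for w in c}
        return self

    def predict(self, tokens):
        total_docs = sum(self.class_counts.values())
        best_label, best_score = None, -math.inf
        for label, n_docs in self.class_counts.items():
            score = math.log(n_docs / total_docs)  # log prior
            total_words = sum(self.word_counts[label].values())
            for tok in tokens:
                if tok not in self.vocab:
                    continue  # ignore terms never seen in training
                count = self.word_counts[label][tok]
                score += math.log((count + 1) / (total_words + len(self.vocab)))
            if score > best_score:
                best_label, best_score = label, score
        return best_label


# Illustrative training data only; not taken from the paper.
train = [("I loved this film", "pos"), ("She hates the plot", "neg")]
docs = [normalize(text.split()) for text, _ in train]
labels = [label for _, label in train]
model = NaiveBayes().fit(docs, labels)

print(model.predict(normalize("They detest it".split())))
```

Toggling `use_lemmas` and `use_synonyms` in `normalize` gives the kind of with/without comparison the abstract describes, although a toy corpus of this size says nothing about the accuracy differences the paper reports.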
