Conference: International Conference on Electrical Information and Communication Technology

Opinion Mining from Bangla and Phonetic Bangla Reviews Using Vectorization Methods



Abstract

Opinion mining, the computational study of people's opinions, emotions, and attitudes, is one of the key research fields in Natural Language Processing (NLP). To stay competitive, business owners need to extract people's exact opinions about their businesses. Recently, people in Bangladesh have become more interested in expressing their opinions in Bangla and, most importantly, in Phonetic Bangla rather than in English. Since no prior opinion mining work has addressed this setting, in this paper we develop a review analysis system for Bangla and Phonetic Bangla, using restaurant reviews as a case study; we created the dataset manually, without using a translator. Our approach starts by preprocessing the raw data, followed by feature extraction with different N-gram techniques. Vectorization is then applied to the data with HashingVectorizer, CountVectorizer, and the TF-IDF vectorizer. Next, machine learning approaches, namely Support Vector Machine (SVM), Decision Tree (DT), and Logistic Regression (LR), are applied to classify the reviews into three classes: bad, good, and excellent. Finally, we compare the vectorizers across the different classifiers; SVM provides the highest accuracy, 75.58%.
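The feature extraction and vectorization steps described in the abstract can be illustrated with a minimal standard-library sketch. It is not the authors' implementation: the abstract does not say which library or settings were used, the two sample reviews are made-up Phonetic Bangla phrases, and the function names (`word_ngrams`, `count_vectorize`, `tfidf_transform`, `hashing_vectorize`) are hypothetical. The smoothed IDF formula and L2 normalisation are assumed choices, and the classification step (SVM, DT, LR) is left out.

```python
import math
import zlib
from collections import Counter


def word_ngrams(text, n_min=1, n_max=2):
    """Return word n-grams of length n_min..n_max from whitespace tokens."""
    tokens = text.lower().split()
    grams = []
    for n in range(n_min, n_max + 1):
        for i in range(len(tokens) - n + 1):
            grams.append(" ".join(tokens[i:i + n]))
    return grams


def count_vectorize(docs, n_min=1, n_max=2):
    """Build a sorted n-gram vocabulary and a dense count matrix."""
    grams_per_doc = [word_ngrams(d, n_min, n_max) for d in docs]
    vocab = sorted({g for grams in grams_per_doc for g in grams})
    index = {g: j for j, g in enumerate(vocab)}
    matrix = []
    for grams in grams_per_doc:
        row = [0] * len(vocab)
        for gram, count in Counter(grams).items():
            row[index[gram]] = count
        matrix.append(row)
    return vocab, matrix


def tfidf_transform(counts):
    """Reweight counts by a smoothed IDF, then L2-normalise each row."""
    n_docs = len(counts)
    n_terms = len(counts[0]) if counts else 0
    # Document frequency: number of documents containing each term.
    df = [sum(1 for row in counts if row[j] > 0) for j in range(n_terms)]
    # Assumed smoothing: idf = ln((1 + N) / (1 + df)) + 1.
    idf = [math.log((1 + n_docs) / (1 + d)) + 1 for d in df]
    out = []
    for row in counts:
        weighted = [c * w for c, w in zip(row, idf)]
        norm = math.sqrt(sum(v * v for v in weighted)) or 1.0
        out.append([v / norm for v in weighted])
    return out


def hashing_vectorize(docs, n_features=16, n_min=1, n_max=2):
    """Map n-grams to a fixed number of buckets without storing a vocabulary."""
    matrix = []
    for doc in docs:
        row = [0] * n_features
        for gram in word_ngrams(doc, n_min, n_max):
            # crc32 is used because it is stable across Python processes.
            bucket = zlib.crc32(gram.encode("utf-8")) % n_features
            row[bucket] += 1
        matrix.append(row)
    return matrix


if __name__ == "__main__":
    # Made-up Phonetic Bangla reviews, for illustration only.
    reviews = ["khabar khub bhalo", "khabar bhalo na"]
    vocab, counts = count_vectorize(reviews)
    print(vocab)
    print(counts)
    print(tfidf_transform(counts))
    print(hashing_vectorize(reviews))
```

The three functions differ mainly in what they store: count vectorization keeps an explicit vocabulary, TF-IDF downweights n-grams that appear in many reviews, and hashing trades possible bucket collisions for a fixed feature size. Any of the resulting matrices could then be passed to a classifier.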
