Journal: American Journal of Engineering and Applied Sciences

English Sentiment Classification using Only the Sentiment Lexicons with a JOHNSON Coefficient in a Parallel Network Environment

Abstract

Sentiment classification matters in everyday life, for example in political activities, commodity production and commercial activities. In this study, we propose a new model for Big Data sentiment classification. We use the sentiment lexicons of our basis English Sentiment Dictionary (bESD) to classify the 5,000,000 English documents of our testing data set, of which 2,500,000 are positive and 2,500,000 are negative. We do not use any English training data set. We use neither one-dimensional vectors nor multi-dimensional vectors, in either the sequential environment or the distributed network system. We use a JOHNSON Coefficient (JC), computed through the Google search engine with the AND and OR operators, to identify the sentiment values of the sentiment lexicons of the bESD. A term (an English word or phrase) is clustered into either the positive polarity or the negative polarity according to which of the two it is closer to under the similarity measures of the JC; that is, the term is judged to be very similar to either the positive or the negative. We tested the proposed model in both a sequential environment and a distributed network system and achieved 87.56% accuracy on the testing data set. The model runs faster in the parallel network environment than in the sequential system. It can classify the sentiment of millions of English documents based on the sentiment lexicons of the bESD in a parallel network environment, and it depends on neither a particular domain nor a training stage. This study uses many similarity coefficients from the data mining field. The results of this work can be widely used in applications and research on English sentiment classification.
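
The abstract does not give the formulas, so the following is only a minimal sketch of how a lexicon-only classifier of this kind might work. It assumes the form of the Johnson coefficient commonly listed among binary similarity measures, J = a/(a+b) + a/(a+c), where a is the number of pages containing both terms and a+b and a+c are the page counts of each term alone. The hit-count table `HITS`, the seed words "good" and "bad", the valence rule (positive similarity minus negative similarity), the sum-of-valences document rule, and every function name are hypothetical choices for illustration, not the authors' implementation. No real search engine is queried, and a thread pool merely stands in for the parallel network environment.

```python
from concurrent.futures import ThreadPoolExecutor
from typing import Dict, FrozenSet, List

# Hypothetical page counts standing in for search-engine results.
# A key with two terms plays the role of an AND query. All numbers are made up.
HITS: Dict[FrozenSet[str], int] = {
    frozenset({"good"}): 1000,
    frozenset({"bad"}): 800,
    frozenset({"excellent"}): 400,
    frozenset({"awful"}): 300,
    frozenset({"excellent", "good"}): 200,
    frozenset({"excellent", "bad"}): 40,
    frozenset({"awful", "good"}): 30,
    frozenset({"awful", "bad"}): 150,
}


def hits(*terms: str) -> int:
    """Return the (hypothetical) number of pages containing all given terms."""
    return HITS.get(frozenset(terms), 0)


def johnson(term: str, seed: str) -> float:
    """Johnson coefficient a/(a+b) + a/(a+c), estimated from page counts."""
    both = hits(term, seed)   # a: pages containing both terms
    n_term = hits(term)       # a + b: pages containing the term
    n_seed = hits(seed)       # a + c: pages containing the seed word
    if n_term == 0 or n_seed == 0:
        return 0.0            # unknown term: no evidence either way
    return both / n_term + both / n_seed


def valence(term: str, pos_seed: str = "good", neg_seed: str = "bad") -> float:
    """Assumed sentiment value: similarity to positive minus similarity to negative."""
    return johnson(term, pos_seed) - johnson(term, neg_seed)


def polarity(term: str) -> str:
    """Cluster a term into the polarity it is closer to."""
    v = valence(term)
    if v > 0:
        return "positive"
    if v < 0:
        return "negative"
    return "neutral"


def classify_document(text: str) -> str:
    """Assumed document rule: sum the valences of all whitespace-separated tokens."""
    score = sum(valence(token) for token in text.lower().split())
    if score > 0:
        return "positive"
    if score < 0:
        return "negative"
    return "neutral"


def classify_many(documents: List[str], workers: int = 4) -> List[str]:
    """Classify documents concurrently; results keep the input order."""
    with ThreadPoolExecutor(max_workers=workers) as pool:
        return list(pool.map(classify_document, documents))


if __name__ == "__main__":
    print(polarity("excellent"), polarity("awful"))
    print(classify_many(["an excellent film", "awful service", "the table"]))
```

Because every document is scored independently, with no training stage and no shared vector representation, the work splits naturally across workers, which is consistent with the speed-up the abstract reports for the parallel environment.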