Journal: International Journal of Computer Systems Science & Engineering

Sentiment Analysis System in Big Data Environment

Abstract

Big Data, large volumes of both structured and unstructured data, is now generated continuously by social media. Social media are powerful marketing tools, and social big data can offer business insights. The major challenge is finding efficient techniques to collect a large volume of social data and to extract insights from it. Sentiment analysis of social big data can provide business insights by extracting public opinion, but traditional analytic platforms need to be scaled up to analyze data at this volume. Social data are by nature short and generally not written with proper grammar, which makes highly reliable sentiment analysis results difficult to achieve. Although learning-based approaches are good for sentiment classification, acquiring effective training data is a challenge, because manual labeling of training data is time-consuming and labor-intensive. In this paper, a sentiment analysis system on a Big Data analytics platform is proposed to provide valuable information by analyzing large-scale social data in an efficient and timely manner; it is implemented using the MapReduce framework and the Hadoop Distributed File System (HDFS). The proposed system consists of four modules: data collection, data cleaning and preprocessing, class labeling, and sentiment classification. It achieves high sentiment classification performance by combining the effortless setup of a lexicon-based classifier with a learning-based classifier. Twitter stream data is used for evaluation, because Twitter is a widespread social media platform and a good source of information, offering snapshots of moods and feelings as well as up-to-date events. The evaluation results show that the system achieves a promising accuracy of 84.2%. Moreover, the system scales to large data: processing time decreases as more nodes are added to the cluster.
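
The abstract does not spell out how the lexicon-based and learning-based classifiers are combined. One plausible reading is that the lexicon-based classifier supplies class labels automatically, and those labels then train the learning-based classifier, avoiding manual labeling. The following is a minimal single-machine sketch of that reading, not the authors' implementation. The tiny word lists, the choice of multinomial Naive Bayes, and all function names are assumptions made for illustration; the MapReduce and HDFS layers are not shown.

```python
import math
import re
from collections import Counter, defaultdict

# Hypothetical toy lexicon; the lexicon used in the paper is not given in the abstract.
POSITIVE = {"good", "great", "love", "happy", "excellent"}
NEGATIVE = {"bad", "awful", "hate", "sad", "terrible"}


def clean(tweet):
    """Cleaning/preprocessing: lowercase, drop URLs and @mentions, tokenize."""
    text = tweet.lower()
    text = re.sub(r"https?://\S+|@\w+", " ", text)
    text = text.replace("#", " ")  # keep the hashtag word, drop the sign
    return re.findall(r"[a-z']+", text)


def lexicon_label(tokens):
    """Class labeling: 'pos', 'neg', or None when the lexicon gives no clear signal."""
    score = sum(t in POSITIVE for t in tokens) - sum(t in NEGATIVE for t in tokens)
    if score > 0:
        return "pos"
    if score < 0:
        return "neg"
    return None


def train_naive_bayes(labeled):
    """Collect per-class document and word counts from (tokens, label) pairs."""
    class_counts = Counter()
    word_counts = defaultdict(Counter)
    vocab = set()
    for tokens, label in labeled:
        class_counts[label] += 1
        word_counts[label].update(tokens)
        vocab.update(tokens)
    return class_counts, word_counts, vocab


def classify(model, tokens):
    """Sentiment classification: multinomial Naive Bayes with add-one smoothing."""
    class_counts, word_counts, vocab = model
    total_docs = sum(class_counts.values())
    best_label, best_logp = None, -math.inf
    for label, n_docs in class_counts.items():
        logp = math.log(n_docs / total_docs)
        denom = sum(word_counts[label].values()) + len(vocab)
        for t in tokens:
            if t in vocab:  # words never seen in training are ignored
                logp += math.log((word_counts[label][t] + 1) / denom)
        if logp > best_logp:
            best_label, best_logp = label, logp
    return best_label


def build_model(tweets):
    """Auto-label tweets with the lexicon, then train on the labeled subset."""
    labeled = []
    for tweet in tweets:
        tokens = clean(tweet)
        label = lexicon_label(tokens)
        if label is not None:
            labeled.append((tokens, label))
    return train_naive_bayes(labeled)


if __name__ == "__main__":
    sample = [
        "I love the camera on this phone http://t.co/x",
        "@bob the battery is awful and slow",
        "Great camera, happy customer #win",
        "terrible battery, slow and sad",
        "just landed in Paris",  # no lexicon hit, so it is skipped for training
    ]
    model = build_model(sample)
    print(classify(model, clean("slow battery")))
```

The point of the design is that the trained classifier can assign a label to a tweet such as "slow battery", which contains no lexicon word at all, because it has learned which ordinary words co-occur with lexicon-labeled sentiment. In a MapReduce setting, the cleaning and labeling steps would map naturally onto a per-tweet map phase and the counting onto a reduce phase, though the abstract does not describe the actual job layout.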
