Journal: Indian Journal of Computer Science and Engineering

WEKA FOR REDUCING HIGH-DIMENSIONAL BIG TEXT DATA


Abstract

Data today typically exhibits high volume, variety, velocity, and veracity, known as the four V's of Big Data. Social media is considered one of the main sources of Big Data: it exhibits the four V's and, in addition, has high dimensionality. To manipulate Big Data efficiently, its dimensionality should be reduced. Dimensionality reduction converts high-dimensional data into an expressive representation with fewer dimensions. This research work deals with efficient dimension reduction processes that reduce the original dimension in order to improve the speed of data mining. The experiments use the Spam-WEKA dataset, which contains Twitter user information. A modified J48 classifier is applied to reduce the dimension of the data, thereby increasing the accuracy rate of data mining. The data mining tool WEKA is used as an API from MATLAB to generate the J48 classifiers. Experimental results indicated a significant improvement over the existing J48 algorithm.
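The abstract does not say how J48 is modified, so the following is only a minimal sketch of one generic way a decision-tree criterion can reduce dimensionality: rank nominal attributes by gain ratio (the split criterion of C4.5, which J48 implements) and keep the top k. It is not the authors' method and does not call WEKA or MATLAB. The attribute names (`has_url`, `default_avatar`, `lang`), the toy rows, and all function names are hypothetical.

```python
import math
from collections import Counter, defaultdict


def entropy(values):
    """Shannon entropy (in bits) of a list of nominal values."""
    total = len(values)
    return -sum((n / total) * math.log2(n / total) for n in Counter(values).values())


def gain_ratio(rows, labels, attr):
    """Gain ratio of one nominal attribute: information gain / split information."""
    groups = defaultdict(list)
    for row, label in zip(rows, labels):
        groups[row[attr]].append(label)
    total = len(labels)
    # Expected class entropy after splitting on the attribute.
    remainder = sum(len(g) / total * entropy(g) for g in groups.values())
    # Entropy of the attribute's own value distribution.
    split_info = entropy([row[attr] for row in rows])
    if split_info == 0:
        return 0.0  # attribute takes a single value and cannot split the data
    return (entropy(labels) - remainder) / split_info


def select_attributes(rows, labels, k):
    """Return the k attribute names with the highest gain ratio."""
    ranked = sorted(rows[0].keys(),
                    key=lambda a: gain_ratio(rows, labels, a),
                    reverse=True)
    return ranked[:k]


def project(rows, kept):
    """Drop every attribute that is not in `kept`."""
    return [{a: row[a] for a in kept} for row in rows]


# Hypothetical toy data standing in for Twitter user records.
ROWS = [
    {"has_url": "yes", "default_avatar": "yes", "lang": "en"},
    {"has_url": "yes", "default_avatar": "no",  "lang": "en"},
    {"has_url": "no",  "default_avatar": "no",  "lang": "en"},
    {"has_url": "no",  "default_avatar": "no",  "lang": "fr"},
    {"has_url": "yes", "default_avatar": "yes", "lang": "fr"},
    {"has_url": "no",  "default_avatar": "yes", "lang": "en"},
]
LABELS = ["spam", "spam", "ham", "ham", "spam", "ham"]

if __name__ == "__main__":
    kept = select_attributes(ROWS, LABELS, k=2)
    print(kept)  # ['has_url', 'default_avatar']
    print(project(ROWS, kept)[0])
```

In this toy set `has_url` separates the two classes perfectly, so it ranks first, while `lang` carries no class information and is dropped. A classifier would then be trained on the projected rows only.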
