Conference: International Conference on Information and Communication Technology

Automatic Tweet Classification Based on News Category in Indonesian Language


Abstract

Tweets can be as informative as news articles, so an automatic tweet classifier based on news categories could make it easier to search for tweets in a category of interest. We identified 11 categories: religion, business, entertainment, law and crime, health, motivation, sport, government, education, politics, and technology. For learning, we used the ZeroR, Naive Bayes Multinomial (NBM), Support Vector Machine (SVM), Random Forest (RF), and Sequential Minimal Optimization (SMO) algorithms, chosen from previous work on topics similar to this paper. In the experiments, we evaluated the classifiers using all tweets and using various maximum numbers of tweets and terms per category. To evaluate system performance, we used 10-fold cross-validation with accuracy (the proportion of correctly classified instances) as the performance measure. In the experimental results, NBM achieved the highest accuracy, 77.47%, with a maximum of 500 tweets and 1000 terms per category. Finally, because NBM performed best in these experiments, we built a web-based automatic tweet classifier using NBM.
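The pipeline summarized in the abstract (cap the tweets and terms per category, train a multinomial Naive Bayes model, score it with k-fold cross-validation accuracy) can be illustrated with a minimal sketch. This is not the authors' implementation: it is a plain-Python approximation, the function names and the toy tweets are hypothetical, and treating the term limit as "keep the most frequent terms of each category" is an assumption, since the abstract does not say how terms were selected.

```python
import math
import random
from collections import Counter, defaultdict


def cap_per_category(docs, labels, max_docs=500):
    """Keep at most max_docs tokenized tweets for each category."""
    seen = Counter()
    kept_docs, kept_labels = [], []
    for tokens, label in zip(docs, labels):
        if seen[label] < max_docs:
            seen[label] += 1
            kept_docs.append(tokens)
            kept_labels.append(label)
    return kept_docs, kept_labels


def train_nbm(docs, labels, max_terms=1000):
    """Fit multinomial Naive Bayes with add-one (Laplace) smoothing."""
    term_counts = defaultdict(Counter)
    class_counts = Counter(labels)
    for tokens, label in zip(docs, labels):
        term_counts[label].update(tokens)
    # Assumption: the term cap keeps the most frequent terms per category.
    kept = {c: dict(term_counts[c].most_common(max_terms)) for c in class_counts}
    vocab = {t for counts in kept.values() for t in counts}
    total_docs = sum(class_counts.values())
    model = {"vocab": vocab, "log_prior": {}, "log_like": {}, "log_unseen": {}}
    for c, counts in kept.items():
        denom = sum(counts.values()) + len(vocab)
        model["log_prior"][c] = math.log(class_counts[c] / total_docs)
        model["log_like"][c] = {t: math.log((n + 1) / denom) for t, n in counts.items()}
        model["log_unseen"][c] = math.log(1 / denom)
    return model


def predict(model, tokens):
    """Return the category with the highest log posterior score."""
    best, best_score = None, -math.inf
    for c, prior in model["log_prior"].items():
        score = prior
        for t in tokens:
            if t in model["vocab"]:  # terms outside the vocabulary are ignored
                score += model["log_like"][c].get(t, model["log_unseen"][c])
        if score > best_score:
            best, best_score = c, score
    return best


def cross_val_accuracy(docs, labels, k=10, max_terms=1000, seed=0):
    """Accuracy (fraction correctly classified) over k shuffled folds."""
    idx = list(range(len(docs)))
    random.Random(seed).shuffle(idx)
    folds = [idx[i::k] for i in range(k)]
    correct = 0
    for fold in folds:
        held_out = set(fold)
        train = [i for i in idx if i not in held_out]
        model = train_nbm([docs[i] for i in train],
                          [labels[i] for i in train], max_terms)
        correct += sum(predict(model, docs[i]) == labels[i] for i in fold)
    return correct / len(docs)


# Made-up toy data (not from the paper): pre-tokenized tweets in two categories.
TOY_DOCS = [
    ["gol", "tim", "menang"], ["tim", "liga", "gol"], ["pemain", "gol", "liga"],
    ["aplikasi", "ponsel", "baru"], ["ponsel", "android", "aplikasi"],
    ["internet", "aplikasi", "android"],
] * 5
TOY_LABELS = (["sport"] * 3 + ["technology"] * 3) * 5

docs, labels = cap_per_category(TOY_DOCS, TOY_LABELS, max_docs=500)
model = train_nbm(docs, labels, max_terms=1000)
print(predict(model, ["gol", "liga"]))
print(cross_val_accuracy(docs, labels, k=10))
```

The sketch uses only the standard library so the arithmetic stays visible; real tweets would first need tokenization and normalization, which the abstract does not describe and which are omitted here.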
