Conference: International Conference on Cyber and IT Service Management

The comparation of text mining with Naive Bayes classifier, nearest neighbor, and decision tree to detect Indonesian swear words on Twitter

Abstract

Twitter is one of the world's most popular social media platforms. Users express many kinds of statements on Twitter, such as happiness, sadness, and public information. Unfortunately, people may get angry at each other and express that anger in tweets, some of which contain Indonesian swear words. This is a serious problem because many Indonesians do not tolerate swear words. Some Indonesian swear words have multiple meanings, so a swear word is not always used as an insult. Twitter provides tweet data by account, trending topic, and advanced keyword search. This work analyzes tweets about political news, political events, and some famous Indonesian figures, because such tweets are assumed to contain many Indonesian swear words. The collected tweets are processed with text mining and then classified using a Naive Bayes classifier, Nearest Neighbor, and Decision Tree to detect Indonesian swear words. The work aims to identify a highly accurate classification model, that is, a model that can distinguish the actual meaning of an Indonesian swear word in a tweet.
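
The pipeline described in the abstract (text mining followed by classification) can be illustrated with a minimal sketch. It is not the authors' implementation: the tokenizer, the toy tweets, the placeholder token `xword` (standing in for an ambiguous swear word), and the class names are all hypothetical, and only two of the three classifiers are shown (a multinomial Naive Bayes with add-one smoothing and a 1-nearest-neighbor using Jaccard similarity). The decision tree is left out for brevity.

```python
import math
from collections import Counter

# Hypothetical toy data, not taken from the paper. "xword" is a placeholder
# for an ambiguous swear word. Label 1 = insulting use, 0 = non-insulting use.
TRAIN = [
    ("you are such a xword liar", 1),
    ("xword politician go away", 1),
    ("shut up you xword", 1),
    ("that concert was xword good", 0),
    ("xword this food is delicious", 0),
    ("so happy today xword amazing", 0),
]
TEST = [("you xword liar", 1), ("xword delicious food", 0)]


def tokenize(text):
    """Minimal text-mining step (an assumption): lowercase, split on whitespace."""
    return text.lower().split()


class NaiveBayes:
    """Multinomial Naive Bayes with add-one (Laplace) smoothing."""

    def fit(self, docs, labels):
        self.classes = sorted(set(labels))
        self.prior = {c: math.log(labels.count(c) / len(labels)) for c in self.classes}
        self.counts = {c: Counter() for c in self.classes}
        for doc, y in zip(docs, labels):
            self.counts[y].update(doc)
        self.vocab = {w for c in self.classes for w in self.counts[c]}
        return self

    def predict(self, doc):
        def score(c):
            total = sum(self.counts[c].values()) + len(self.vocab)
            # Words never seen in training are ignored.
            return self.prior[c] + sum(
                math.log((self.counts[c][w] + 1) / total)
                for w in doc if w in self.vocab
            )
        return max(self.classes, key=score)


class NearestNeighbor:
    """1-nearest neighbor using Jaccard similarity over token sets."""

    def fit(self, docs, labels):
        self.data = [(set(d), y) for d, y in zip(docs, labels)]
        return self

    def predict(self, doc):
        s = set(doc)

        def sim(item):
            other = item[0]
            union = s | other
            return len(s & other) / len(union) if union else 0.0

        return max(self.data, key=sim)[1]


def accuracy(model, samples):
    """Fraction of (text, label) samples the model labels correctly."""
    return sum(model.predict(tokenize(t)) == y for t, y in samples) / len(samples)


docs = [tokenize(t) for t, _ in TRAIN]
labels = [y for _, y in TRAIN]
nb = NaiveBayes().fit(docs, labels)
nn = NearestNeighbor().fit(docs, labels)

for name, model in [("Naive Bayes", nb), ("Nearest Neighbor", nn)]:
    print(name, accuracy(model, TEST))
```

Both classifiers see the same placeholder word in every tweet, so any separation between insulting and non-insulting use has to come from the surrounding words, which is the distinction the abstract says the model should capture. A comparison like the one in the paper would need a real labeled corpus and a proper held-out evaluation.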