2017 International Conference on Sustainable Information Engineering and Technology

Improving classification performance of public complaints with TF-IGM weighting: Case study: Media Center e-Wadul Surabaya

Abstract

Media Center e-Wadul currently labels submitted complaints manually. As a result, the Media Center administration takes a long time to coordinate with the regional work units (SKPD) that must respond to registered complaints. Complaints therefore need to be classified by SKPD to speed up complaint submission. Text classification is challenging because the large number of features makes the data high-dimensional. A further challenge in this research is that some features appear in almost all classes, or even in all of them, and so do not characterize any single class. The proposed term weighting scheme is Term Frequency-Inverse Gravity Moment (TF-IGM), which can precisely measure a term's class-distinguishing power, especially for multiclass problems such as the one in this study. The well-known Term Frequency-Inverse Document Frequency (TF-IDF) and TF-Binary weighting methods are used for comparison. Classification is performed with the Support Vector Machine (SVM), Naive Bayes, and K-Nearest Neighbor (KNN) algorithms. Incoming public complaints pass through a preprocessing stage, a term weighting stage, and a classification stage. TF-IGM weighting combined with SVM yielded the best performance of all combinations, with accuracy, precision, recall, and F-measure of 80.11%, 80.70%, 80.10%, and 80.20%, respectively.
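
The abstract does not state the TF-IGM formula. As a rough illustration of the term weighting stage, the sketch below assumes the commonly cited definition: igm(t) = f_1 / sum_r (f_r * r), where f_r is the frequency of term t in the class ranked r-th by that frequency, and the final weight is tf * (1 + lambda * igm). The function names `igm_scores` and `tf_igm`, the use of class-wise document counts for f_r, and the value lambda = 7 are assumptions made for this example, not details taken from the paper.

```python
from collections import Counter, defaultdict


def igm_scores(docs, labels):
    """Return {term: igm}, with igm = f_1 / sum_r (f_r * r).

    Assumption: f_r is the number of documents of a class that contain
    the term, with classes ranked by that count in descending order
    (r = 1 is the class where the term is most common).
    """
    classes = sorted(set(labels))
    class_df = defaultdict(Counter)  # term -> {class: document count}
    for tokens, label in zip(docs, labels):
        for term in set(tokens):  # count each term once per document
            class_df[term][label] += 1

    scores = {}
    for term, counts in class_df.items():
        ranked = sorted((counts.get(c, 0) for c in classes), reverse=True)
        # "Gravity moment": each class frequency weighted by its rank.
        moment = sum(f * r for r, f in enumerate(ranked, start=1))
        scores[term] = ranked[0] / moment
    return scores


def tf_igm(tokens, scores, lam=7.0):
    """Weight one tokenized document: tf * (1 + lam * igm).

    lam = 7.0 is an assumed default; terms unseen in training get igm = 0.
    """
    tf = Counter(tokens)
    return {t: n * (1.0 + lam * scores.get(t, 0.0)) for t, n in tf.items()}


# Hypothetical toy data: three tokenized complaints, one per class.
train_docs = [["road", "damaged"], ["road", "flood"], ["road", "lamp"]]
train_labels = ["A", "B", "C"]
scores = igm_scores(train_docs, train_labels)
weights = tf_igm(["road", "road", "lamp"], scores)
```

Under this definition, a term confined to a single class gets an IGM of 1, while a term spread evenly over three classes gets 1/6, so class-specific terms receive larger weights than terms that appear everywhere.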