Experiments on Malay short text classification

Abstract

In this study, experiments are conducted on Malay short text using three different types of classifiers: KNN, SVM, and NB. The classifiers were used to evaluate bag-of-words (BOW) features and variants of TF-IDF weighting: TF-IDF, smoothed TF-IDF, and ITC. A Malay short text dataset was built from Twitter tweets and labeled with two classes. The experiments were run with test sets of 50% and 20% of the data. The most consistent results were achieved by the SVM classifier with ITC features, for which precision, recall, and F1-score all reached 95%.
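
The abstract names the three weighting schemes but does not give their formulas. The sketch below illustrates how they are commonly defined, using only the Python standard library. The exact formulas are assumptions: plain TF-IDF as tf * log(N / df), the smoothed variant as tf * (log((1 + N) / (1 + df)) + 1), and ITC as log(tf + 1) * log(N / df) scaled to unit length. The toy corpus and all function names are hypothetical and are not taken from the paper.

```python
import math
from collections import Counter


def document_frequencies(docs):
    """Count, for each term, the number of documents that contain it."""
    df = Counter()
    for tokens in docs:
        df.update(set(tokens))
    return df


def tfidf(tokens, df, n_docs):
    """Plain TF-IDF (assumed form): raw term count times log(N / df)."""
    tf = Counter(tokens)
    return {t: c * math.log(n_docs / df[t]) for t, c in tf.items()}


def smoothed_tfidf(tokens, df, n_docs):
    """Smoothed TF-IDF (assumed form): idf = log((1 + N) / (1 + df)) + 1."""
    tf = Counter(tokens)
    return {
        t: c * (math.log((1 + n_docs) / (1 + df[t])) + 1)
        for t, c in tf.items()
    }


def itc(tokens, df, n_docs):
    """ITC (assumed form): log(tf + 1) * log(N / df), cosine-normalized."""
    tf = Counter(tokens)
    raw = {t: math.log(c + 1) * math.log(n_docs / df[t]) for t, c in tf.items()}
    norm = math.sqrt(sum(w * w for w in raw.values()))
    return {t: (w / norm if norm else 0.0) for t, w in raw.items()}


# Hypothetical toy corpus of tokenized short texts (not the paper's dataset).
docs = [
    ["saya", "suka", "makan", "nasi"],
    ["saya", "tidak", "suka", "hujan"],
    ["hujan", "lebat", "hari", "ini"],
]
df = document_frequencies(docs)
n_docs = len(docs)

weights_tfidf = tfidf(docs[0], df, n_docs)
weights_smooth = smoothed_tfidf(docs[0], df, n_docs)
weights_itc = itc(docs[0], df, n_docs)
```

Under these assumed definitions, ITC dampens repeated terms through the logarithm and normalizes each vector to unit length, which may matter for short texts where raw counts are small and document lengths vary. Each resulting dictionary could be turned into a feature vector for a KNN, SVM, or NB classifier.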