Conference: World Congress on Engineering and Computer Science

VSMs with K-Nearest Neighbour to Categorise Arabic Text Data

Abstract

Text categorisation is a widely studied problem that has received extensive attention over the last four decades. This paper investigates different variations of vector space models (VSMs) and term weighting approaches using the K-nearest neighbour (KNN) algorithm. The F1 evaluation measure is the basis of comparison in our experiments. Experimental results on different Arabic text categorisation data sets provide evidence that the Dice and Jaccard coefficients outperform the Cosine coefficient with regard to F1, and that Dice-based TF.IDF achieves the highest average scores.
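
To make the compared components concrete, the following is a minimal Python sketch of TF.IDF term weighting combined with KNN under the Cosine, Dice, and Jaccard coefficients. It is an illustration under stated assumptions, not the authors' implementation: the abstract does not give the exact TF.IDF variant, the value of K, the voting rule, or any preprocessing, so the smoothed IDF formula, `k=3`, similarity-weighted voting, the helper names, and the toy tokens are all hypothetical. The coefficients use the usual extended forms for weighted vectors.

```python
import math
from collections import Counter


def fit_idf(train_docs):
    """Compute IDF weights from tokenised training documents.

    Assumption: idf(t) = ln(N / df(t)) + 1; the paper's variant may differ.
    """
    n = len(train_docs)
    df = Counter()
    for tokens in train_docs:
        df.update(set(tokens))
    return {t: math.log(n / d) + 1.0 for t, d in df.items()}


def vectorise(tokens, idf):
    """Map a token list to a sparse TF.IDF vector; unseen terms are dropped."""
    tf = Counter(tokens)
    return {t: c * idf[t] for t, c in tf.items() if t in idf}


def dot(a, b):
    """Inner product of two sparse vectors stored as dicts."""
    if len(a) > len(b):
        a, b = b, a
    return sum(w * b.get(t, 0.0) for t, w in a.items())


def cosine(a, b):
    denom = math.sqrt(dot(a, a) * dot(b, b))
    return dot(a, b) / denom if denom else 0.0


def dice(a, b):
    denom = dot(a, a) + dot(b, b)
    return 2.0 * dot(a, b) / denom if denom else 0.0


def jaccard(a, b):
    ab = dot(a, b)
    denom = dot(a, a) + dot(b, b) - ab
    return ab / denom if denom else 0.0


def knn_predict(query, train_vectors, train_labels, similarity, k=3):
    """Label a query by similarity-weighted vote among its k nearest neighbours."""
    scored = sorted(
        ((similarity(query, v), y) for v, y in zip(train_vectors, train_labels)),
        key=lambda pair: pair[0],
        reverse=True,
    )
    votes = Counter()
    for score, label in scored[:k]:
        votes[label] += score
    return votes.most_common(1)[0][0]


# Hypothetical toy corpus (English tokens stand in for preprocessed Arabic terms).
train_docs = [
    ["football", "match", "goal"],
    ["goal", "team", "league"],
    ["market", "bank", "stock"],
    ["stock", "price", "bank"],
]
train_labels = ["sport", "sport", "economy", "economy"]

idf = fit_idf(train_docs)
train_vectors = [vectorise(d, idf) for d in train_docs]
query = vectorise(["team", "goal", "match"], idf)

for name, sim in [("cosine", cosine), ("dice", dice), ("jaccard", jaccard)]:
    print(name, knn_predict(query, train_vectors, train_labels, sim))
```

Only the `similarity` argument changes between runs, which mirrors the comparison described above: the same KNN procedure is evaluated with each coefficient, and the resulting predictions would then be scored with F1.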
