Conference: International Conference on Knowledge Discovery and Information Retrieval

Combining Clustering and Classification Approaches for Reducing the Effort of Automatic Tweets Classification



Abstract

The classification problem has gained new importance with the growing value attributed to social media such as Twitter. The huge number of short documents to be organized into subjects challenges the resources and techniques that have been used so far. Furthermore, today more than ever, personalization is the most important feature that a system needs to exhibit. The goal of many online systems, which are available in many areas, is to address the needs or desires of each individual user. To achieve this goal, these systems need to be more flexible and faster in order to adapt to the user's needs. In this work, we explore a variety of techniques with the aim of better classifying a large Twitter data set according to a user goal. We propose a methodology in which we cascade an unsupervised technique followed by a supervised one. For the unsupervised technique we use standard clustering algorithms, and for the supervised technique we propose the use of a kNN algorithm and a Centroid Based Classifier to perform the experiments. The results are promising because we reduced the amount of work to be done by the specialists and, in addition, we were able to mimic the human assessment decisions with an F1-measure of 0.7907.
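The cascade described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the toy tweets, the bag-of-words vectors, cosine similarity, the farthest-first seeding, the choice of two clusters, and every function name below are assumptions made for illustration. It shows one plausible reading of the idea, in which tweets are clustered first, a specialist labels each cluster once instead of each tweet, and a centroid-based classifier then labels new tweets. The kNN variant is not shown.

```python
import math
from collections import Counter

def build_vocab(texts):
    # Vocabulary = sorted set of all lowercase tokens in the training tweets.
    return sorted({w for t in texts for w in t.lower().split()})

def vectorize(text, vocab):
    # Plain term-count vector; words outside the vocabulary are ignored.
    counts = Counter(text.lower().split())
    return [float(counts[w]) for w in vocab]

def cosine(a, b):
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(y * y for y in b))
    return dot / (na * nb) if na and nb else 0.0

def mean(vectors):
    n = len(vectors)
    return [sum(col) / n for col in zip(*vectors)]

def kmeans(vectors, k, iterations=10):
    # Deterministic farthest-first seeding (an assumption, chosen so the
    # sketch is reproducible without random numbers).
    centroids = [vectors[0]]
    while len(centroids) < k:
        centroids.append(
            min(vectors, key=lambda v: max(cosine(v, c) for c in centroids))
        )
    assign = []
    for _ in range(iterations):
        # Assign each vector to its most similar centroid, then recompute.
        assign = [max(range(k), key=lambda j: cosine(v, centroids[j]))
                  for v in vectors]
        for j in range(k):
            members = [v for v, a in zip(vectors, assign) if a == j]
            if members:
                centroids[j] = mean(members)
    return assign

def train_centroids(vectors, labels):
    # Centroid-based classifier: one mean vector per class label.
    groups = {}
    for v, y in zip(vectors, labels):
        groups.setdefault(y, []).append(v)
    return {y: mean(vs) for y, vs in groups.items()}

def classify(vector, centroids):
    return max(centroids, key=lambda y: cosine(vector, centroids[y]))

# Hypothetical toy data standing in for a real Twitter data set.
tweets = [
    "great goal football match",
    "football match tonight goal",
    "team won football match",
    "rain wind weather forecast",
    "weather forecast rain tomorrow",
    "cold wind rain weather",
]
vocab = build_vocab(tweets)
vectors = [vectorize(t, vocab) for t in tweets]

# Step 1 (unsupervised): group the tweets into k clusters.
assign = kmeans(vectors, k=2)

# Step 2: the specialist names each cluster once (simulated here by
# looking at one representative tweet per cluster).
cluster_names = {assign[0]: "sports", assign[3]: "weather"}
labels = [cluster_names[a] for a in assign]

# Step 3 (supervised): train on the cluster-derived labels, then classify.
model = train_centroids(vectors, labels)
prediction = classify(vectorize("late goal decides football match", vocab), model)
print(assign, prediction)
```

In this sketch the specialist makes two labeling decisions instead of six, which is the kind of effort reduction the abstract refers to; how the paper actually derives labels from clusters is not stated in the abstract.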
