
Daily clustering for the electronic newspaper based on the analysis of trends


Abstract

To classify newspaper articles automatically, the tf*idf method has been used to weight the words in an article. Such methods are suitable for fixed databases, but they cannot pick up the topic words of articles, because the IDF component assigns low values to frequently occurring words. We propose a daily clustering method for electronic daily newspapers. Our method is based on the characteristics of articles and the change of content over time. First, we define a weight function for words based on their position in the article and the rate at which content changes as time passes. Then we calculate the relation between articles, the clustering value, and the relation between clusters from different days. In experiments, recall and precision improved by several percent compared with previous methods.
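The abstract does not give the actual formulas, so the following is only a minimal sketch of the general idea, under stated assumptions: a word's weight is taken as the product of a position factor (earlier words count more) and a day-over-day change rate (words that surge today count more), and the relation between two articles is taken as cosine similarity. The function names, the linear position decay, the add-one smoothing, and the use of cosine similarity are all hypothetical choices, not the authors' definitions.

```python
import math
from collections import Counter


def position_weight(index, length):
    """Hypothetical position factor: decays linearly from 1.0 (first word) to 0.5 (last word)."""
    return 1.0 - 0.5 * index / max(length - 1, 1)


def change_rate(word, today_counts, yesterday_counts):
    """Hypothetical change rate: today's corpus frequency over yesterday's, add-one smoothed."""
    return (today_counts[word] + 1) / (yesterday_counts[word] + 1)


def weight_article(words, today_counts, yesterday_counts):
    """Sum position factor times change rate over every occurrence of each word."""
    weights = Counter()
    n = len(words)
    for i, w in enumerate(words):
        weights[w] += position_weight(i, n) * change_rate(w, today_counts, yesterday_counts)
    return weights


def cosine(a, b):
    """Cosine similarity between two weight vectors (assumed stand-in for the article relation)."""
    dot = sum(a[w] * b[w] for w in a.keys() & b.keys())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


# Toy corpus counts: "election" surges today, "weather" is steady.
today = Counter({"election": 5, "weather": 2})
yesterday = Counter({"election": 1, "weather": 2})
article = weight_article(["election", "weather"], today, yesterday)
```

Unlike IDF, which lowers the weight of a word that appears in many documents, this kind of weighting raises the weight of a word whose frequency jumps from one day to the next, which is the behavior the abstract says tf*idf lacks for topic words.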
