Journal of Software

An Improved Approach to Term Weighting in Hierarchical Web Page Classification



Abstract

In web page classification, the Absolute Weighting Method is commonly used to weight the main HTML structural features. Its drawback is that the weighting coefficient is a fixed value, so it affects long and short texts differently: the influence of structural features on the local text weakens as the local text grows longer. To address this problem, we propose an improved weighting method, the Relative Weighting Method. In hierarchical web page classification experiments, we compare the classification performance of the two methods on single labels and on combinations of several labels. The results show that the Relative Weighting Method effectively improves classification accuracy and outperforms the Absolute Weighting Method.
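
The abstract does not give the weighting formulas, so the following Python sketch is only one plausible reading of the contrast it describes, not the authors' method. It assumes that absolute weighting adds a fixed bonus `k` to each term found in a structural element (for example the page title), and that relative weighting scales the bonus by the length of the local text. The function names and the parameters `k` and `ratio` are hypothetical.

```python
from collections import Counter


def absolute_weight(body_tokens, tag_tokens, k=3.0):
    """Assumed absolute scheme: add a fixed bonus k per structural-tag term."""
    weights = Counter(body_tokens)  # raw term frequencies of the local text
    for term in tag_tokens:
        weights[term] += k
    return dict(weights)


def relative_weight(body_tokens, tag_tokens, ratio=0.1):
    """Assumed relative scheme: the bonus grows with the local text length."""
    bonus = ratio * len(body_tokens)
    weights = Counter(body_tokens)
    for term in tag_tokens:
        weights[term] += bonus
    return dict(weights)


def share(weights, term):
    """Fraction of the total weight mass carried by one term."""
    return weights[term] / sum(weights.values())


# Toy documents: same title term, local text of different lengths.
short_body = ["filler"] * 20
long_body = ["filler"] * 200
title = ["python"]

abs_short = share(absolute_weight(short_body, title), "python")
abs_long = share(absolute_weight(long_body, title), "python")
rel_short = share(relative_weight(short_body, title), "python")
rel_long = share(relative_weight(long_body, title), "python")
```

Under these assumptions, the title term's share of the total weight shrinks as the body grows with the fixed bonus, while it stays constant with the length-scaled bonus. That is the effect the abstract attributes to the two methods.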
