Conference: International Conference on Electronics, Information and Communication Engineering

AN IMPROVED APPROACH TO TERM WEIGHTING IN HIERARCHICAL WEB PAGE CLASSIFICATION



Abstract

Currently, in web page classification, the absolute weighting method is commonly used to weight the main HTML structural features. Its disadvantage is that the weighting coefficient is a fixed value, which affects long and short texts differently: the influence of structural features on the local text weakens as the local text grows longer. To solve this problem, we propose an improved weighting method, the relative weighting method. In hierarchical web page classification experiments, we compare the classification performance of the two methods on a single label and on combinations of several labels. The results show that the relative weighting method effectively improves classification accuracy and outperforms the absolute weighting method.
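The abstract does not give the weighting formulas, so the following is only a minimal sketch of the contrast it describes, under stated assumptions. The function names, the fixed coefficient `k`, the `ratio` parameter, and the rule that scales the coefficient with the body length are all hypothetical illustrations, not the authors' method. The sketch treats the page title as the structural feature and measures what share of the total term weight the title terms carry.

```python
from collections import Counter


def absolute_weights(title, body, k=3.0):
    """Fixed coefficient: each title term gets the same boost k,
    no matter how long the body text is (assumed form)."""
    weights = Counter(body)
    for term in title:
        weights[term] += k
    return weights


def relative_weights(title, body, ratio=0.3):
    """Hypothetical relative rule: the per-term boost grows with the
    body length, so the title's total boost is ratio * len(body)."""
    k = ratio * len(body) / max(len(title), 1)
    weights = Counter(body)
    for term in title:
        weights[term] += k
    return weights


def title_share(weights, title):
    """Fraction of the total term weight carried by the title terms."""
    total = sum(weights.values())
    return sum(weights[term] for term in set(title)) / total


if __name__ == "__main__":
    # Toy data: title terms do not occur in the body text.
    title = ["python", "tutorial"]
    short_body = ["learn", "basics"] * 5    # 10 tokens
    long_body = ["learn", "basics"] * 50    # 100 tokens

    for name, fn in [("absolute", absolute_weights),
                     ("relative", relative_weights)]:
        short_share = title_share(fn(title, short_body), title)
        long_share = title_share(fn(title, long_body), title)
        print(f"{name}: short={short_share:.3f} long={long_share:.3f}")
```

With the fixed coefficient, the title's share of the weight drops sharply as the body grows, which is the dilution the abstract attributes to absolute weighting. Under the assumed relative rule, the share is the same for the short and the long body.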
