
Link-Local Features for Hypertext Classification


Abstract

Previous work in hypertext classification has resulted in two principal approaches for incorporating information about the graph properties of the Web into the training of a classifier. The first approach uses the complete text of the neighboring pages, whereas the second approach uses only their class labels. In this paper, we argue that both approaches are unsatisfactory: the first brings in too much irrelevant information, while the second is too coarse because it abstracts the entire page into a single class label. We argue that one needs to focus on the relevant parts of predecessor pages, namely the region in the neighborhood of the origin of an incoming link. To this end, we investigate different ways of extracting such features and compare several techniques for using them in a text classifier.
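The abstract does not spell out how link-local features are defined, so the following is only a minimal sketch of one plausible reading: for each link on a predecessor page that points to the target page, keep the anchor text and a fixed window of words on either side. The names `LinkContextParser` and `link_local_features`, the `window` parameter, and the exact-match comparison of `href` values are all assumptions made for illustration, not the authors' method.

```python
from html.parser import HTMLParser


class LinkContextParser(HTMLParser):
    """Collect a page's words and the word spans covered by links to target_url.

    Hypothetical helper: it treats all text nodes as visible text, which is a
    simplification (script and style contents are not filtered out).
    """

    def __init__(self, target_url):
        super().__init__()
        self.target_url = target_url
        self.tokens = []          # all words of the page, in document order
        self.anchor_spans = []    # (start, end) token indices of matching anchors
        self._open_start = None   # start index of the matching anchor now open

    def handle_starttag(self, tag, attrs):
        # Assumption: a link matches only if its href equals target_url exactly.
        if tag == "a" and dict(attrs).get("href") == self.target_url:
            self._open_start = len(self.tokens)

    def handle_endtag(self, tag):
        if tag == "a" and self._open_start is not None:
            self.anchor_spans.append((self._open_start, len(self.tokens)))
            self._open_start = None

    def handle_data(self, data):
        # Lowercase and split on whitespace; punctuation stays attached.
        self.tokens.extend(data.lower().split())


def link_local_features(html, target_url, window=3):
    """Return anchor text and surrounding words for each link to target_url."""
    parser = LinkContextParser(target_url)
    parser.feed(html)
    parser.close()
    features = []
    for start, end in parser.anchor_spans:
        features.append({
            "anchor": parser.tokens[start:end],
            "before": parser.tokens[max(0, start - window):start],
            "after": parser.tokens[end:end + window],
        })
    return features


if __name__ == "__main__":
    page = ('<p>See the course page of '
            '<a href="http://example.org/cs101">Intro to Programming</a> '
            'taught by Prof. Smith this fall.</p>')
    for feature in link_local_features(page, "http://example.org/cs101"):
        print(feature)
```

The word lists returned for all predecessor pages could then be merged into the target page's feature vector or kept as a separate feature set; which of these works better is the kind of comparison the abstract announces, and the sketch takes no position on it.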
