Conference: International Conference on the Applications of Digital Information and Web Technologies
Combining content-based and context-based methods for Persian web page classification


Abstract

Because the Internet returns millions of web pages for almost any search query, quickly retrieving the desired, relevant information from the Web is very challenging. Automatic classification of web pages into relevant categories is an important and effective way to ease this retrieval problem. Many automatic classification methods and algorithms have been proposed that use content-based or context-based features of web pages. In this paper we analyze these features and exploit a combination of them to improve the accuracy of Persian web page classification. We conduct various experiments on a dataset of 352 pages from Persian Wikipedia, using content-based and context-based web page features. Our experiments demonstrate the usefulness of combining these features.
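
The abstract does not say which features or classifier the authors use, so the following is only a minimal sketch of the general idea of combining the two feature families. It assumes, purely for illustration, that content-based features are tokens of the page body and context-based features are tokens of the anchor text on inbound links. The function names, the 1-nearest-neighbour classifier, and the toy English pages are all hypothetical and are not taken from the paper.

```python
from collections import Counter
import math


def bag_of_words(text, prefix):
    """Count whitespace-separated tokens, tagging each with its feature source."""
    return Counter(f"{prefix}:{tok}" for tok in text.lower().split())


def combined_features(body_text, anchor_texts):
    """Merge content features (body text) with context features (inbound anchor text).

    Assumption: anchor text stands in for "context"; the paper may use other signals.
    """
    features = bag_of_words(body_text, "content")
    for anchor in anchor_texts:
        features.update(bag_of_words(anchor, "context"))
    return features


def cosine(a, b):
    """Cosine similarity between two sparse count vectors."""
    dot = sum(a[k] * b[k] for k in a.keys() & b.keys())
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(sum(v * v for v in b.values()))
    return dot / norm if norm else 0.0


def classify(page, labelled_pages):
    """Return the label of the most similar labelled page (1-nearest-neighbour)."""
    return max(labelled_pages, key=lambda item: cosine(page, item[0]))[1]


# Hypothetical toy data; a real experiment would use tokenized Persian pages.
train = [
    (combined_features("football match goal team", ["sports news"]), "sports"),
    (combined_features("election parliament vote", ["politics news"]), "politics"),
]
query = combined_features("team vote", ["sports results"])
label = classify(query, train)
```

Prefixing each token with its source keeps the two feature families in separate dimensions, so a word appearing in the body and the same word appearing in anchor text contribute independently to the similarity score.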
