Conference: International Conference on Knowledge-Based Engineering and Innovation

Web pages classification: An effective approach based on text mining techniques

Abstract

Some web pages on the Internet contain important content that remains useful for a long time, or even indefinitely. Other web pages are valuable only for a short period. Automatically distinguishing between these two types of pages from their content is difficult, yet it is an important task for improving the performance of search engines and web page recommender engines. In this project, web pages were classified into two categories using machine learning algorithms. Natural language processing and text mining techniques were first used for text pre-processing. Appropriate information was then extracted from the texts, and finally the web pages were classified using machine learning algorithms. Compared with other approaches, this project focuses mainly on the text pre-processing stage, and new strategies are presented to fill the gap. The results indicate that the proposed approach performed better than the other approaches.
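The three stages described in the abstract (text pre-processing, information extraction, and two-class classification) can be illustrated with a minimal sketch. The abstract does not name the specific pre-processing strategies, features, or learning algorithms used, so everything below is an assumption made for illustration: the stop-word list, the bag-of-words features, the multinomial Naive Bayes classifier, the `long_term`/`short_term` label names, and the toy training pages are all hypothetical and are not the authors' method.

```python
import math
import re
from collections import Counter

# Hypothetical stop-word list; the abstract does not say which words were filtered.
STOP_WORDS = {"a", "an", "and", "the", "of", "to", "in", "for", "is", "on", "this", "how"}


def preprocess(text):
    """Lowercase, strip HTML-like tags, keep alphabetic tokens, drop stop words."""
    text = re.sub(r"<[^>]+>", " ", text.lower())
    return [t for t in re.findall(r"[a-z]+", text) if t not in STOP_WORDS]


class NaiveBayes:
    """Multinomial Naive Bayes with add-one smoothing (a stand-in classifier)."""

    def fit(self, docs, labels):
        self.class_counts = Counter(labels)
        self.word_counts = {c: Counter() for c in self.class_counts}
        for doc, label in zip(docs, labels):
            # Feature extraction: bag-of-words counts per class.
            self.word_counts[label].update(preprocess(doc))
        self.vocab = {w for counts in self.word_counts.values() for w in counts}
        return self

    def predict(self, doc):
        # Words never seen in training are ignored.
        tokens = [t for t in preprocess(doc) if t in self.vocab]
        total_docs = sum(self.class_counts.values())
        best_label, best_score = None, -math.inf
        for label, n_docs in self.class_counts.items():
            counts = self.word_counts[label]
            denom = sum(counts.values()) + len(self.vocab)
            # Log prior plus smoothed log likelihood of each token.
            score = math.log(n_docs / total_docs)
            score += sum(math.log((counts[t] + 1) / denom) for t in tokens)
            if score > best_score:
                best_label, best_score = label, score
        return best_label


# Toy, made-up training pages (not data from the paper).
train_docs = [
    "<h1>How to bake sourdough bread</h1> A classic recipe and baking guide.",
    "Guide to tying a tie: step by step tutorial.",
    "Breaking news: election results announced today.",
    "Today's weather forecast and traffic update.",
]
train_labels = ["long_term", "long_term", "short_term", "short_term"]

model = NaiveBayes().fit(train_docs, train_labels)
print(model.predict("A simple recipe tutorial for bread"))
print(model.predict("Latest news and weather update today"))
```

In this sketch the pre-processing step is deliberately simple. A pipeline that puts most of its effort into pre-processing, as the abstract describes, would likely replace `preprocess` with more elaborate steps, while the classifier interface could stay the same.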
