International Conference on Knowledge-Based Engineering and Innovation

Web pages classification: An effective approach based on text mining techniques

Abstract

Some web pages on the Internet contain important content that remains useful for a long time, or even indefinitely. Other web pages are valuable only for a short period. It is difficult to classify these types of web pages automatically because of their content, yet doing so is an important task for improving the performance of search engines and web page recommender engines. In this project, web pages were classified into two categories with machine learning algorithms. Natural language processing and text mining techniques were used for text pre-processing; appropriate information was then extracted from the texts, and finally the web pages were classified using machine learning algorithms. Compared to other approaches, this project focuses mostly on the text pre-processing stage, and new strategies are presented to fill that gap. The results indicate that the proposed approach performed better than other approaches.
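The three stages named in the abstract (text pre-processing, information extraction, classification) can be illustrated with a minimal sketch. This is not the authors' method: the abstract names neither the pre-processing strategies nor the learning algorithm, so the stop-word list, the bag-of-words features, the multinomial Naive Bayes classifier, the `long`/`short` labels, and the toy training pages below are all assumptions chosen for illustration.

```python
import math
import re
from collections import Counter, defaultdict

# Hypothetical stop-word list; a real system would use a fuller one.
STOP_WORDS = {"a", "an", "and", "the", "of", "to", "in", "for",
              "is", "on", "this", "how"}


def preprocess(text):
    """Lowercase, strip HTML-like tags, keep alphabetic tokens, drop stop words."""
    text = re.sub(r"<[^>]+>", " ", text.lower())
    tokens = re.findall(r"[a-z]+", text)
    return [t for t in tokens if t not in STOP_WORDS]


class NaiveBayes:
    """Multinomial Naive Bayes over word counts with add-one smoothing."""

    def fit(self, docs, labels):
        self.class_counts = Counter(labels)       # documents per class
        self.word_counts = defaultdict(Counter)   # word frequencies per class
        self.vocab = set()
        for doc, label in zip(docs, labels):
            tokens = preprocess(doc)
            self.word_counts[label].update(tokens)
            self.vocab.update(tokens)
        return self

    def predict(self, doc):
        # Words never seen in training carry no evidence and are skipped.
        tokens = [t for t in preprocess(doc) if t in self.vocab]
        total_docs = sum(self.class_counts.values())
        best_label, best_score = None, -math.inf
        for label, n_docs in self.class_counts.items():
            total_words = sum(self.word_counts[label].values())
            score = math.log(n_docs / total_docs)  # log prior
            for t in tokens:
                count = self.word_counts[label][t]
                score += math.log((count + 1) / (total_words + len(self.vocab)))
            if score > best_score:
                best_label, best_score = label, score
        return best_label


# Toy training pages (invented for illustration, not data from the paper).
train_docs = [
    "<p>How to bake sourdough bread: a classic recipe</p>",
    "A guide to growing tomatoes in your garden",
    "Classic recipe for homemade pasta",
    "Breaking news: election results announced today",
    "Weekend sale ends today, discount on shoes",
    "Live score update from today match",
]
train_labels = ["long", "long", "long", "short", "short", "short"]

model = NaiveBayes().fit(train_docs, train_labels)
print(model.predict("classic bread recipe guide"))
print(model.predict("news today: sale ends"))
```

Because the abstract places most of its emphasis on pre-processing, `preprocess` is the natural place to try alternative strategies (for example stemming or different token filters) while leaving the classifier unchanged.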
