Conference: IEEE International Conference on Systems, Man, and Cybernetics

Centralized content-based Web filtering and blocking: how far can it go?

Abstract

Centralized Internet filtering and blocking is very important to an organisation. Educators and parents would like to keep offensive material away from children. Companies also want to reduce the amount of work time that employees spend on non-productive Web surfing. Current blocking and filtering mechanisms can roughly be classified into two approaches: URL-based filtering and content filtering. In the URL-based approach, a requested URL is blocked if a match is found in the blocked list. However, keeping the list up to date is very difficult. In the content filtering approach, keyword matching is often used. Its main problem is mis-blocking: many desirable Web sites are blocked because some predefined keywords appear in their Web pages, though with a different meaning or in a different context. There are suggestions to apply image, audio and video understanding to real-time content filtering. The delay time is also of great concern. In this paper, we investigate how far multimedia content analysis should go for Internet filtering and blocking. A set of guidelines for defining the heuristics used in real-time Web content analysis is also given. These heuristics not only have higher filtering accuracy than most multimedia retrieval techniques, but also have a runtime overhead comparable to that of keyword matching. Our experience of deploying a pornography filtering system in high schools is also described. Experience from the system's implementation and deployment is found to give very good direction for the centralized filtering and blocking of Web content.
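
The two approaches named in the abstract can be illustrated with a minimal Python sketch. This is not the paper's system: the host list, the keyword list, and the function names `url_blocked` and `keyword_blocked` are hypothetical and chosen only to show how a blocked list goes stale and how plain keyword matching mis-blocks a desirable page.

```python
from urllib.parse import urlsplit

# Hypothetical data for illustration only; not taken from the paper.
BLOCKED_HOSTS = {"blocked.example"}
BLOCKED_KEYWORDS = {"breast"}


def url_blocked(url: str) -> bool:
    """URL-based approach: block when the requested host is on the blocked list."""
    host = (urlsplit(url).hostname or "").lower()
    return host in BLOCKED_HOSTS


def keyword_blocked(page_text: str) -> bool:
    """Content filtering approach: block when any predefined keyword appears."""
    words = set(page_text.lower().split())
    return not BLOCKED_KEYWORDS.isdisjoint(words)


if __name__ == "__main__":
    # A listed host is blocked.
    print(url_blocked("http://blocked.example/index.html"))
    # A host that is not yet on the list passes: the list is out of date.
    print(url_blocked("http://new-mirror.example/"))
    # A health page is mis-blocked: the keyword appears in a different context.
    print(keyword_blocked("Early screening for breast cancer saves lives"))
```

Both checks are cheap set lookups, which is why the abstract treats keyword matching as the runtime baseline; neither check looks at the context in which a word or address appears.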
