2-Way Text Classification for Harmful Web Documents



Abstract

The openness of the Web allows any user to access almost any type of information. However, some information, such as adult content, is not appropriate for all users, notably children. Even for adults, some content found on abnormal pornographic sites can harm ordinary people's mental health. In this paper, we propose an efficient 2-way text filter for blocking harmful web documents and also present a new criterion for clear classification. It filters out 0-grade web texts, which contain no harmful words, using pattern matching against harmful-word dictionaries, and classifies 1-grade, 2-grade, and 3-grade web texts using a machine learning algorithm.
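The two paths described in the abstract can be illustrated with a minimal sketch. Everything concrete here is an assumption made for illustration: the abstract does not name the machine learning algorithm, so a small multinomial naive Bayes classifier stands in for it; the dictionary entries (`badword1` and so on) are placeholders for the paper's harmful-word dictionaries; the toy training texts are invented; and running the dictionary check before the classifier is one plausible reading of how the two paths combine.

```python
import math
import re
from collections import Counter, defaultdict

# Placeholder dictionary (assumption): the paper's real dictionaries are not given.
HARMFUL_WORDS = {"badword1", "badword2", "badword3"}

TOKEN_RE = re.compile(r"[a-z0-9]+")


def tokenize(text):
    """Lowercase the text and split it into alphanumeric tokens."""
    return TOKEN_RE.findall(text.lower())


def contains_harmful_word(tokens):
    """Path 1: dictionary pattern matching."""
    return any(tok in HARMFUL_WORDS for tok in tokens)


class NaiveBayes:
    """Multinomial naive Bayes with add-one smoothing (stand-in classifier)."""

    def __init__(self):
        self.word_counts = defaultdict(Counter)  # grade -> token counts
        self.doc_counts = Counter()              # grade -> number of documents
        self.vocab = set()

    def fit(self, docs, grades):
        for tokens, grade in zip(docs, grades):
            self.doc_counts[grade] += 1
            self.word_counts[grade].update(tokens)
            self.vocab.update(tokens)

    def predict(self, tokens):
        total_docs = sum(self.doc_counts.values())
        best_grade, best_score = None, -math.inf
        for grade in sorted(self.doc_counts):
            counts = self.word_counts[grade]
            denom = sum(counts.values()) + len(self.vocab)
            # Log prior plus smoothed log likelihood of each token.
            score = math.log(self.doc_counts[grade] / total_docs)
            for tok in tokens:
                score += math.log((counts[tok] + 1) / denom)
            if score > best_score:
                best_grade, best_score = grade, score
        return best_grade


def build_toy_model():
    """Train on a tiny invented corpus with grades 1 to 3 (illustration only)."""
    train = [
        ("badword1 mild story", 1),
        ("badword1 mild joke", 1),
        ("badword2 strong scene", 2),
        ("badword2 strong clip", 2),
        ("badword3 extreme scene", 3),
        ("badword3 extreme clip", 3),
    ]
    model = NaiveBayes()
    model.fit([tokenize(text) for text, _ in train], [g for _, g in train])
    return model


def classify(text, model):
    """Return grade 0 if no dictionary word matches, else the predicted grade."""
    tokens = tokenize(text)
    if not contains_harmful_word(tokens):
        return 0  # filtered off by pattern matching alone
    return model.predict(tokens)  # Path 2: learned classifier for grades 1-3
```

With the toy model from `build_toy_model()`, a text that contains no dictionary word never reaches the classifier, which is where the efficiency of such a design would come from: the comparatively expensive learned model runs only on texts that matched the dictionary.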


Bibliographic record

Source: International Conference on Computational Science and Its Applications, 2006, 7 pages
Authors: Youngsoo Kim; Taekyong Nam; Dongho Won
Original format: PDF
Chinese Library Classification: Computing technology, computer technology

Similar literature

1. Text Filtering for Harmful Document Classification Using Three-Word Co-Occurrence and Large-Scale Data Processing [J]. TAKANOBU OTSUKA, DEYUE DENG, TAKAYUKI ITO. Electronics and communications in Japan. 2015, no. 10.
2. A Novel Approach for Ontology-Based Feature Vector Generation for Web Text Document Classification [J]. Mohamed K. Elhadad, Khaled M. Badran, Gouda I. Salama. International journal of software innovation. 2018, no. 1.
3. A novel approach for ontology-based dimensionality reduction for web text document classification [J]. Elhadad Mohamed K., Badran Khaled Shafee S., Salama Gouda I. International journal of software innovation. 2017, no. 4.
4. 2-Way Text Classification for Harmful Web Documents [C]. Youngsoo Kim, Taekyong Nam, Dongho Won. International Conference on Computational Science and Its Applications (ICCSA 2006), pt. 2; 8-11 May 2006; Glasgow (GB). 2006.
5. An investigation of several document classification algorithms leading to the design of an autonomous software agent for locating specific, relevant information on the World Wide Web [D]. Lindal, John. 2001.
6. WebMedline: Transforming Medline into a Hypertext Environment with Links to Full-Text Documents [O]. William M. Detmer, Edward H. Shortliffe. 1996.
7. 2-way Text Classification for Harmful Web Documents [O]. Youngsoo Kim, Taekyong Nam, Dongho Won. 2008.
8. Neural net learning issues in classification of free text documents [R]. Dasigi, V. R., Mann, R. C. 1996.
