Text Classification Models for Web Content Filtering and Online Safety

Abstract

In an era of anywhere, anytime connectivity for the general public, safety and security on the web present enormous challenges. There is a great need for better content detection systems that can more accurately identify excessively offensive and harmful websites. Early web classification models were limited by the methods and data available. Today, advances in computing methodologies and technology have brought many new and better means of text content analysis, for example new methods for topic extraction, topic modeling and sentiment analysis. Our recent studies suggested the promising potential of combining topic analysis and sentiment analysis in web content classification. This paper further explores new classification models for better classification performance, especially to enhance precision and reduce false positives, by incorporating semantics into the classification models and by examining and handling issues of dataset reliability, class imbalance and covariate shift.

Bibliographic details

Source: IEEE International Conference on Data Mining Workshops, 2015, pp. 961-968 (8 pages)
Authors: Shuhua Liu; Thomas Forss
Original format: PDF
Keywords: Web content classification; imbalanced classes; online safety solutions; sentiment analysis; topic extraction; topic similarity
