Offensive Sentence Classification Using Character-Level CNN and Transfer Learning with Fake Sentences

Abstract

Classifying offensive sentences poses two difficulties: offensive terms are easily modified, and general offensive-language corpora are class-imbalanced. To address both problems, we propose pre-training the convolution layers on fake sentences generated at the character level, which prevents under-fitting caused by data shortage and deals with the data imbalance. We insert offensive words into half of the randomly generated sentences and train a convolutional neural network (CNN) on these sentences, with labels indicating whether an offensive word is included. The trained CNN filters are then used to train a new CNN on the original data, which effectively increases the amount of training data. The proposed method achieves a higher F1-score than training without pre-training on three datasets: Insult from Kaggle, Bullying Trace, and Formspring.
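
The fake-sentence step described in the abstract can be sketched as follows. This is a minimal illustration and not the authors' implementation: the function name `make_fake_dataset`, the alphabet, the fixed sentence length, the placeholder word list, and the choice to overwrite a span of characters (so that every sentence keeps the same length) are all assumptions made for the example.

```python
import random
import string


def make_fake_dataset(offensive_words, n_sentences, sentence_len=40, seed=0):
    """Return shuffled (sentence, label) pairs.

    label 1: an offensive word was written into the random sentence.
    label 0: the sentence is random characters with no listed word in it.
    All parameter names and defaults here are illustrative assumptions.
    """
    rng = random.Random(seed)
    alphabet = string.ascii_lowercase + " "  # assumed character set

    def random_sentence():
        # Redraw until no listed word occurs by chance, so labels stay correct.
        while True:
            s = "".join(rng.choice(alphabet) for _ in range(sentence_len))
            if not any(w in s for w in offensive_words):
                return s

    data = []
    for i in range(n_sentences):
        sentence = random_sentence()
        label = 0
        if i % 2 == 0:  # every other sentence, i.e. half of them
            word = rng.choice(offensive_words)
            pos = rng.randrange(sentence_len - len(word) + 1)
            # Assumption: overwrite characters instead of lengthening the text.
            sentence = sentence[:pos] + word + sentence[pos + len(word):]
            label = 1
        data.append((sentence, label))
    rng.shuffle(data)
    return data


# Hypothetical placeholder words; a real list would come from a lexicon.
pairs = make_fake_dataset(["badword", "insultx"], n_sentences=100, seed=1)
```

Pairs produced this way would serve as the pre-training set for the character-level CNN, whose convolution filters are then reused when training on the real, imbalanced corpus.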

Bibliographic record

Source
International Conference on Neural Information Processing | 2017 | 926p | 8 pages
Authors
Suin Seo; Sung-Bea Cho
Original format: PDF
Chinese Library Classification: TP183-53
Keywords
Text classification; Convolution neural networks; Character-level model; Transfer learning

Similar literature

1. Sentimental Short Sentences Classification by Using CNN Deep Learning Model with Fine Tuned Word2Vec [J]. Amit Kumar Sharma, Sandeep Chaurasia, Devesh Kumar Srivastava. Procedia Computer Science. 2020, Issue 5
2. Improving sentiment analysis via sentence type classification using BiLSTM-CRF and CNN [J]. Chen Tao, Xu Ruifeng, He Yulan. Expert Systems with Applications. 2017, Issue APRa
3. A CNN-LSTM network with attention approach for learning universal sentence representation in embedded system [J]. Fu Qunchao, Wang Cong, Han Xu. Microprocessors and Microsystems. 2020, Issue Apra
4. Offensive Sentence Classification Using Character-Level CNN and Transfer Learning with Fake Sentences [C]. Suin Seo, Sung-Bea Cho. International Conference on Neural Information Processing. 2017
5. Race, Ethnicity, Threat, and the Sentencing of Transferred Juveniles in Florida Criminal Courts [D]. Lehmann, Peter S. 2019
6. A Sentence-Level Joint Relation Classification Model Based on Reinforcement Learning [O]. Zhen Liu, XiaoQiang Di, Wei Song. 2021
7. Do Sentence Interactions Matter? Leveraging Sentence Level Representations for Fake News Classification [O]. Vaibhav Vaibhav, Raghuram Mandyam, Eduard Hovy. 2019
