Incorporating Lexical Knowledge via WordNet to Latent Dirichlet Allocation in Offensive Message Detection

Njagi Dennis Gitari; Zhang Zuping; Damien Hanyurwimfura; Jun Long

Abstract

We propose a model for offensive message detection in political discourse that combines topic modeling and lexicon-based approaches for knowledge extraction. We develop an extension of Latent Dirichlet Allocation (LDA) suited to offensive message detection that leverages lexical and semantic word features. Our model employs an externally supplied lexicon and WordNet, a lexical database, to incorporate prior knowledge into LDA. At the document level, we model the semantic relationship between a limited list of politically oriented concepts and corpus-determined themes. At the topic level, we incorporate a lexical word prior based on the WordNet lexical relationships between an externally supplied list of offensive words and the topics generated from the corpus. Our model presumes a set of preselected labels that document themes should fit. We test our model on several datasets and compare its performance against several baselines. The experiments confirm the effectiveness of our approach in both prediction and classification tasks.
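
The topic-level idea in the abstract (a word prior derived from an offensive-word list and its lexical relations) can be illustrated with a minimal sketch. This is not the authors' model: it only shows one common way to encode a lexicon as asymmetric Dirichlet pseudo-counts over topic-word distributions. The function names, the `base` and `boost` values, the toy vocabulary, and the `related` dictionary (a hand-written stand-in for WordNet lookups) are all assumptions made for illustration.

```python
from typing import Dict, List, Set


def expand_lexicon(seeds: Set[str], related: Dict[str, Set[str]]) -> Set[str]:
    """Add every word listed as related to a seed word (one hop only)."""
    expanded = set(seeds)
    for seed in seeds:
        expanded |= related.get(seed, set())
    return expanded


def build_topic_word_prior(
    vocab: List[str],
    lexicon: Set[str],
    n_topics: int,
    seeded_topic: int = 0,
    base: float = 0.01,
    boost: float = 1.0,
) -> List[List[float]]:
    """Return an n_topics x len(vocab) matrix of Dirichlet pseudo-counts.

    Every entry starts at `base`; lexicon words get `base + boost`
    in the single topic chosen to absorb offensive vocabulary.
    """
    prior = [[base] * len(vocab) for _ in range(n_topics)]
    for j, word in enumerate(vocab):
        if word in lexicon:
            prior[seeded_topic][j] = base + boost
    return prior


# Toy data, invented for illustration only.
vocab = ["policy", "vote", "idiot", "moron", "senate", "fool"]
seeds = {"idiot"}
# Hypothetical stand-in for synonym lookups in WordNet.
related = {"idiot": {"moron", "fool"}}

lexicon = expand_lexicon(seeds, related)
eta = build_topic_word_prior(vocab, lexicon, n_topics=3)
```

A matrix of this shape could be supplied to an LDA implementation that accepts a per-topic, per-word prior, for instance through the `eta` parameter of gensim's `LdaModel`. How the paper itself weights WordNet relations is not stated in the abstract.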

Bibliographic record

Source
Journal of Computational and Theoretical Nanoscience | 2016, No. 5 | 8 pages
Authors
Njagi Dennis Gitari; Zhang Zuping; Damien Hanyurwimfura; Jun Long
Affiliations

School of Information Science and Engineering, Central South University, Changsha 410083, China;

School of Information Science and Engineering, Central South University, Changsha 410083, China;

College of Science and Technology, University of Rwanda, 3900 Kigali, Rwanda;

School of Information Science and Engineering, Central South University, Changsha 410083, China;

Indexing information
Original format: PDF
Language of text: English
Chinese Library Classification: Thin-film technology
Keywords
Offensive Message; Latent Dirichlet Allocation (LDA); WordNet; Lexical Prior

Similar literature

1. Incorporating Lexical Knowledge via WordNet to Latent Dirichlet Allocation in Offensive Message Detection [J]. Njagi Dennis Gitari, Zhang Zuping, Damien Hanyurwimfura. Journal of Computational and Theoretical Nanoscience. 2016, No. 5
2. Unsupervised Scene Change Detection via Latent Dirichlet Allocation and Multivariate Alteration Detection [J]. Du Bo, Wang Yong, Wu Chen. IEEE Journal of Selected Topics in Applied Earth Observations and Remote Sensing. 2018, No. 12
3. Latent Dirichlet Allocation and POS Tags Based Method for External Plagiarism Detection: LDA and POS Tags Based Plagiarism Detection [J]. Naif Radi Aljohani, Jalal S. Alowibdi, Ali Daud. International Journal on Semantic Web and Information Systems. 2018, No. 3
4. Using Latent Dirichlet Allocation to Incorporate Domain Knowledge for Topic Transition Detection [C]. Xiaodan Zhu, Xuming He, Cosmin Munteanu. International Speech Communication Association. 2008
5. Entity Relation Detection with Factorial Hidden Markov Models and Maximum Entropy Discriminant Latent Dirichlet Allocations [D]. Li, Dingcheng. 2011
6. Latent Dirichlet allocation model for world trade analysis [O]. Diego Kozlowski, Viktoriya Semeshenko, Andrea Molinari. 2021
7. A Lexical Approach to Estimating Environmental Goods and Services Output in the Construction Sector via Soft Classification of Enterprise Activity Descriptions Using Latent Dirichlet Allocation [O]. Gerard Keogh. 2019
