Clickbait detection using multiple categorisation techniques

Abinash Pujahari; Dilip Singh Sisodia


Abstract

Clickbaits are online articles with deliberately misleading titles designed to lure readers into opening the intended web page. They tempt visitors to click on a particular link, either to monetise the landing page or to spread false news for sensationalism. The presence of clickbaits on a news aggregator portal may give readers an unpleasant experience. Automatically detecting clickbait among news headlines has been a challenging problem for the machine learning community. Many methods for preventing clickbait articles have been proposed in the recent past; however, the available detection techniques are not very robust. This article proposes a hybrid categorisation technique for separating clickbait and non-clickbait articles by integrating different features, sentence structure and clustering. During preliminary categorisation, the headlines are separated using 11 features. The headlines are then recategorised using sentence formality and syntactic similarity measures. In the last phase, the headlines are recategorised again by clustering on word vector similarity, based on the t-distributed stochastic neighbour embedding (t-SNE) approach. Once the headlines have been categorised, machine learning models are applied to the dataset to evaluate machine learning algorithms. The experimental results indicate that the proposed hybrid model is more robust, reliable and efficient than any individual categorisation technique on the dataset used.
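The preliminary, feature-based stage described in the abstract can be illustrated with a minimal sketch. The abstract does not list the paper's 11 features, so the five surface cues, the word lists, the function names and the threshold below are all hypothetical stand-ins chosen for illustration, not the authors' method.

```python
import re

# Hypothetical cue lists: the abstract does not name the paper's 11 features.
CLICKBAIT_STARTERS = {"this", "these", "here", "why", "how", "what"}
HYPE_WORDS = {"shocking", "amazing", "unbelievable", "secret", "never", "believe"}
SECOND_PERSON = {"you", "your", "you'll"}


def tokenize(headline):
    """Split a headline into lowercase word tokens."""
    return re.findall(r"[a-z0-9']+", headline.lower())


def extract_features(headline):
    """Return a dict of illustrative surface features for one headline."""
    tokens = tokenize(headline)
    return {
        # Listicle-style opening such as "10 things ..."
        "starts_with_number": int(bool(tokens) and tokens[0].isdigit()),
        # Opening with a demonstrative or question word
        "starts_with_cue": int(bool(tokens) and tokens[0] in CLICKBAIT_STARTERS),
        # Number of sensational words in the headline
        "hype_word_count": sum(t in HYPE_WORDS for t in tokens),
        # Question mark or exclamation mark anywhere in the headline
        "has_question_or_bang": int("?" in headline or "!" in headline),
        # Direct address to the reader
        "second_person": int(any(t in SECOND_PERSON for t in tokens)),
    }


def preliminary_label(headline, threshold=2):
    """Label a headline 'clickbait' when the summed cue score reaches the
    (assumed) threshold, otherwise 'non-clickbait'."""
    score = sum(extract_features(headline).values())
    return "clickbait" if score >= threshold else "non-clickbait"


if __name__ == "__main__":
    for h in ["10 Shocking Secrets You Won't Believe!",
              "Parliament passes annual budget bill"]:
        print(h, "->", preliminary_label(h))
```

In a pipeline like the one the abstract outlines, labels of this kind would only be a first pass; the later stages (sentence formality, syntactic similarity and word-vector clustering) would revise them before any classifier is trained.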


Bibliographic details

Source
Journal of Information Science, 2021, Issue 1, pp. 118-128 (11 pages)
Authors
Abinash Pujahari; Dilip Singh Sisodia;
Author affiliations

Computer Science & Engineering, National Institute of Technology Raipur, India;

Computer Science & Engineering, National Institute of Technology Raipur, India;

Indexed in: Science Citation Index (SCI), USA; Engineering Index (EI), USA
Original format: PDF
Language: English
Keywords
Classification; clickbait; clustering; sentence structure; word vector;


