Published in: Journal of Information Science

Clickbait detection using multiple categorisation techniques


Abstract

Clickbaits are online articles whose titles are deliberately designed to mislead, luring more readers into opening the intended web page. They tempt visitors to click on a particular link, either to monetise the landing page or to spread false news for sensationalism. The presence of clickbait on a news aggregator portal may give readers an unpleasant experience. Automatically detecting clickbait headlines among news headlines has been a challenging problem for the machine learning community. Many methods for preventing clickbait articles have been proposed in the recent past; however, the techniques currently available for detecting clickbait are not very robust. This article proposes a hybrid categorisation technique that separates clickbait from non-clickbait articles by integrating different features, sentence structure and clustering. In the preliminary categorisation, the headlines are separated using 11 features. The headlines are then recategorised using sentence formality and syntactic similarity measures. In the last phase, the headlines are recategorised once more by clustering on word vector similarity, based on the t-distributed stochastic neighbour embedding (t-SNE) approach. After the headlines have been categorised, machine learning models are applied to the dataset to evaluate the machine learning algorithms. The experimental results indicate that, for the dataset used, the proposed hybrid model is more robust, reliable and efficient than any individual categorisation technique.
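
The preliminary, feature-based categorisation step can be illustrated with a minimal Python sketch. The abstract does not list the 11 features the authors use, so the five surface cues below, the cue-word sets, the function names `headline_features` and `preliminary_label`, and the threshold of 2 are all assumptions chosen for illustration, not the paper's method.

```python
import re

# Hypothetical cue words: the paper's actual 11 features are not given
# in the abstract, so these sets are illustrative assumptions only.
SECOND_PERSON = {"you", "your", "you're", "yourself"}
HYPERBOLE = {"amazing", "shocking", "unbelievable", "insane", "epic"}
DEMONSTRATIVES = {"this", "these", "that", "those"}


def headline_features(headline: str) -> dict:
    """Compute a few illustrative boolean surface features for one headline."""
    tokens = re.findall(r"[a-z0-9']+", headline.lower())
    first = tokens[0] if tokens else ""
    return {
        "starts_with_number": first.isdigit(),
        "has_second_person": any(t in SECOND_PERSON for t in tokens),
        "has_hyperbole": any(t in HYPERBOLE for t in tokens),
        "starts_with_demonstrative": first in DEMONSTRATIVES,
        "has_strong_punctuation": "!" in headline or "?" in headline,
    }


def preliminary_label(headline: str, threshold: int = 2) -> str:
    """Label a headline 'clickbait' when at least `threshold` cues fire.

    The threshold is an assumed value, not one taken from the paper.
    """
    score = sum(headline_features(headline).values())
    return "clickbait" if score >= threshold else "non-clickbait"
```

In the pipeline the abstract describes, labels from a first pass like this would then be revised by the sentence-formality, syntactic-similarity and clustering phases; this sketch covers only the first pass.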
