5 Sources of Clickbaits You Should Know! Using Synthetic Clickbaits to Improve Prediction and Distinguish between Bot-Generated and Human-Written Headlines


Abstract

Clickbait is an attractive yet misleading headline that lures readers into click-conversion. The development of robust clickbait detection models has, however, been hampered by a shortage of high-quality labeled training samples. To overcome this challenge, we investigate how to exploit human-written and machine-generated synthetic clickbaits. First, we ask crowdworkers and journalism students to generate clickbaity news headlines. Second, we use deep generative models to generate clickbaity headlines. Through empirical evaluations, we demonstrate that synthetic clickbaits from human writers and deep generative models are consistently useful in improving the accuracy of various prediction models, by as much as 14.5% in AUC, across two real datasets and different types of algorithms. In particular, we observe an improvement in accuracy, up to 8.5% in AUC, even for top-ranked clickbait detectors from Clickbait Challenge 2017. Our study proposes a novel direction for addressing the shortage of labeled training data, one of the fundamental bottlenecks in supervised learning, by means of synthetic training data with reinforced domain knowledge. It also provides a solution for distinguishing between bot-generated and human-written clickbaits, thus aiding the work of moderators and better alerting news consumers.
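The augmentation idea in the abstract can be illustrated with a minimal sketch: train a classifier once on real labeled headlines only and once on real plus synthetic clickbaits, then compare AUC on held-out headlines. Everything below is an assumption made for illustration: the toy headlines, the function names (`train_nb`, `score`, `auc`), and the bag-of-words Naive Bayes classifier are hypothetical stand-ins, not the paper's data, generators, or detection models.

```python
import math
from collections import Counter

def tokenize(text):
    return text.lower().split()

def train_nb(samples):
    """Fit word counts per class; samples is a list of (headline, label), 1 = clickbait."""
    counts = {0: Counter(), 1: Counter()}
    docs = Counter()
    for text, label in samples:
        counts[label].update(tokenize(text))
        docs[label] += 1
    vocab = set(counts[0]) | set(counts[1])
    return counts, docs, vocab

def score(model, text):
    """Log-odds that a headline is clickbait (higher means more clickbaity)."""
    counts, docs, vocab = model
    total = {c: sum(counts[c].values()) for c in (0, 1)}
    # Add-one smoothing on the class prior and on every word likelihood;
    # the extra +1 in the denominator reserves mass for unseen words.
    s = math.log((docs[1] + 1) / (docs[0] + 1))
    for tok in tokenize(text):
        p1 = (counts[1][tok] + 1) / (total[1] + len(vocab) + 1)
        p0 = (counts[0][tok] + 1) / (total[0] + len(vocab) + 1)
        s += math.log(p1 / p0)
    return s

def auc(scores, labels):
    """AUC as the chance a random positive outscores a random negative (ties count 0.5)."""
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0 for p in pos for n in neg)
    return wins / (len(pos) * len(neg))

# Hypothetical toy data: a few "real" labeled headlines plus synthetic clickbaits.
real = [("you won't believe what happened next", 1),
        ("senate passes budget bill", 0),
        ("city council approves new park", 0)]
synthetic = [("10 secrets doctors won't tell you", 1),
             ("this trick will shock you", 1)]
held_out = [("you will not believe this trick", 1),
            ("council passes bill", 0)]

for name, train in (("real only", real), ("real + synthetic", real + synthetic)):
    model = train_nb(train)
    scores = [score(model, text) for text, _ in held_out]
    print(name, auc(scores, [label for _, label in held_out]))
```

With data this small the printed AUC values carry no meaning; the sketch only shows where synthetic samples enter the training set and how the comparison is set up.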


Bibliographic details

Source
IEEE/ACM International Conference on Advances in Social Networks Analysis and Mining | 2019 | 592p | 8 pages
Authors
Thai Le; Kai Shu; Maria D. Molina; Dongwon Lee; S. Shyam Sundar; Huan Liu
Original format
PDF
Chinese Library Classification
Computer networks
Keywords
Training; Predictive models; Social network services; Robustness; Media; Writing; Task analysis

