Fake News Detection on Reddit Utilising CountVectorizer and Term Frequency-Inverse Document Frequency with Logistic Regression, MultinominalNB and Support Vector Machine

Abstract

The spread of misleading information, or fake news, has become a problem for society in recent times. On social media, where anyone can share opinions and beliefs and present them as fact, fake news threatens the reputation of companies and individuals. The 2016 US presidential election drew increased attention to the generation of fake news articles, leading a large number of researchers and scientists to explore this Natural Language Processing research area with urgency and keen interest. However, investigation into what people consume on social media is at an early stage, and efforts are in progress to explore how people can separate disinformation from truthful content. The primary challenge in fake news detection is determining how to detect it. Supervised learning methods help to detect these stories by using labelled data to determine whether a text is real or fake. This research aims to develop and compare supervised learning models using Logistic Regression, MultinominalNB, and Support Vector Machine with CountVectorizer and Term Frequency-Inverse Document Frequency (TF-IDF) methods on Reddit data. The research concludes that the CountVectorizer and MultinominalNB model achieved the highest accuracy on the Reddit dataset.
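The abstract names the feature extractors and classifiers but gives no implementation details. The names match scikit-learn classes (`CountVectorizer`, `TfidfVectorizer`, and `MultinomialNB`, which the abstract spells "MultinominalNB"). The following is a minimal standard-library sketch of what count features, TF-IDF weighting, and a Multinomial Naive Bayes classifier compute. It is not the authors' pipeline. The toy posts, labels, and all function and class names are hypothetical and chosen only for illustration, and the L2 normalisation that scikit-learn's TF-IDF applies by default is omitted.

```python
import math
from collections import Counter, defaultdict


def count_features(docs):
    """Bag-of-words term counts: one Counter per document."""
    return [Counter(doc.lower().split()) for doc in docs]


def fit_idf(counts):
    """Smoothed inverse document frequency: ln((1 + n) / (1 + df)) + 1."""
    n = len(counts)
    df = Counter(term for c in counts for term in c)
    return {t: math.log((1 + n) / (1 + d)) + 1 for t, d in df.items()}


def apply_idf(counts, idf):
    """Reweight counts by IDF; terms unseen in training are dropped."""
    return [{t: tf * idf[t] for t, tf in c.items() if t in idf} for c in counts]


class MultinomialNaiveBayes:
    """Multinomial Naive Bayes with add-alpha (Laplace) smoothing."""

    def __init__(self, alpha=1.0):
        self.alpha = alpha

    def fit(self, features, labels):
        self.classes = sorted(set(labels))
        self.vocab = {t for f in features for t in f}
        # Sum the feature weight of every term within each class.
        self.weights = {c: defaultdict(float) for c in self.classes}
        for f, y in zip(features, labels):
            for t, v in f.items():
                self.weights[y][t] += v
        self.totals = {c: sum(w.values()) for c, w in self.weights.items()}
        self.log_prior = {
            c: math.log(labels.count(c) / len(labels)) for c in self.classes
        }
        return self

    def predict(self, feature):
        def log_score(c):
            denom = self.totals[c] + self.alpha * len(self.vocab)
            return self.log_prior[c] + sum(
                v * math.log((self.weights[c].get(t, 0.0) + self.alpha) / denom)
                for t, v in feature.items() if t in self.vocab
            )
        return max(self.classes, key=log_score)


# Hypothetical toy data, not taken from the paper's Reddit dataset.
train_docs = [
    "shocking miracle cure doctors hate",
    "aliens secretly control the government",
    "council approves new budget for schools",
    "central bank holds interest rates steady",
]
train_labels = ["fake", "fake", "real", "real"]
new_posts = ["miracle cure aliens hate", "bank approves budget"]

count_train = count_features(train_docs)
idf = fit_idf(count_train)
nb_counts = MultinomialNaiveBayes().fit(count_train, train_labels)
nb_tfidf = MultinomialNaiveBayes().fit(apply_idf(count_train, idf), train_labels)

new_counts = count_features(new_posts)
pred_counts = [nb_counts.predict(f) for f in new_counts]
pred_tfidf = [nb_tfidf.predict(f) for f in apply_idf(new_counts, idf)]
print(pred_counts, pred_tfidf)
```

The same classifier is trained twice, once on raw counts and once on TF-IDF weights, which mirrors the kind of comparison the abstract describes. Logistic Regression and Support Vector Machine models would consume the same two feature representations.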

Bibliographic details

Source: Irish Signals and Systems Conference | 2021 | pp. 1-6 | 6 pages
Authors: Ankitkumar Patel; Kevin Meehan
Original format: PDF
Keywords: Support vector machines; Social networking (online); Voting; Supervised learning; Companies; Natural language processing; Data models

Similar literature

1. Automatic News Blog Classifier Using Improved K-Nearest Neighbor and Term Frequency-Inverse Document Frequency [J]. Irma Yunita, Seng Hansun. Journal of Theoretical and Applied Information Technology, 2019, no. 15.
2. SMS Spam Message Detection using Term Frequency-Inverse Document Frequency and Random Forest Algorithm [J]. Nilam Nur Amir Sjarif, Nurulhuda Firdaus Mohd Azmi, Suriayati Chuprat. Procedia Computer Science, 2019, no. 5.
3. Early Detection of Gastroesophageal Reflux Disease Using Logistic Regression and Support Vector Machine [J]. Srividya B. V., Smitha Sasi. International Journal of Organizational and Collective Intelligence, 2021, no. 2.
4. Term Frequency-Inverse Document Frequency Answer Categorization with Support Vector Machine on Automatic Short Essay Grading System with Latent Semantic Analysis for Japanese Language [C]. Anak Agung Putri Ratna, Aaliyah Kaltsum, Lea Santiar. International Conference on Electrical Engineering and Computer Science, 2019.
5. Machine Learning and Semantic Knowledge Assisted Fake News Detection Models [D]. Sabeeh, Vian Talal. 2020.
6. Shallow Landslide Susceptibility Mapping: A Comparison between Logistic Model Tree, Logistic Regression, Naïve Bayes Tree, Artificial Neural Network, and Support Vector Machine Algorithms [O]. Viet-Ha Nhu, Ataollah Shirzadi, Himan Shahabi. 2020.
7. An Enhanced Hybrid Feature Selection Technique Using Term Frequency-Inverse Document Frequency and Support Vector Machine-Recursive Feature Elimination for Sentiment Classification [O]. Nur Syafiqah Mohd Nafis, Suryanti Awang. 2021.
