Data Mining Technology Application in False Text Information Recognition

Jie Wan; Xue Cao; Kun Yao; Donghui Yang; E. Peng; Yong Cao


Abstract

False information on the Internet is widely regarded as a serious harm to society. To recognize false text information, this paper proposes an effective method for mining text features, applied to false drug advertisements. First, false and real drug advertisements were collected from official websites to build a database of both classes. Second, features were extracted from the advertisement text to build a feature matrix from the effective features, and each feature vector in the matrix was assigned a positive or negative label according to whether the advertisement was false. Third, several different classifiers were trained and tested, the classification model that performed best at identifying false drug advertisements was selected, and the key features that determine the classification were identified. Finally, the best-performing model was used to predict new false drug advertisements collected from Sina Weibo. For identifying false drug advertisements, the support vector machine (SVM) classifier built on the feature set obtained after feature selection performed best. The findings offer the government an effective method for identifying and combating false advertisements, and the study serves as a reference for using text data mining to identify and detect information fraud.
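
The pipeline the abstract describes (feature extraction, a labeled feature matrix, feature selection, an SVM classifier, prediction on new text) can be illustrated with a minimal, standard-library-only sketch. This is not the authors' implementation. Everything specific in it is an assumption made for illustration: the six toy advertisements are invented, +1 is taken to mean "false" and -1 "real", features are plain term counts, feature selection ranks terms by the difference in document frequency between the two classes, and the linear SVM is trained by simple subgradient descent on the hinge loss with arbitrary hyperparameters. The paper also compares several classifiers, which this sketch does not attempt.

```python
from collections import Counter

def tokenize(text):
    """Lowercase whitespace tokenizer (a stand-in for real text preprocessing)."""
    return text.lower().split()

def select_features(texts, labels, k):
    """Keep the k terms whose document frequency differs most between classes."""
    false_df, real_df = Counter(), Counter()
    for text, label in zip(texts, labels):
        (false_df if label == 1 else real_df).update(set(tokenize(text)))
    vocab = sorted(set(false_df) | set(real_df))
    # Stable sort: ties keep alphabetical order, so the result is deterministic.
    vocab.sort(key=lambda term: -abs(false_df[term] - real_df[term]))
    return vocab[:k]

def build_matrix(texts, features):
    """One row per advertisement, one column per selected term (term counts)."""
    rows = []
    for text in texts:
        counts = Counter(tokenize(text))
        rows.append([float(counts[term]) for term in features])
    return rows

def score(w, b, x):
    """Signed distance-like score of x under the linear model (w, b)."""
    return sum(wj * xj for wj, xj in zip(w, x)) + b

def train_linear_svm(X, y, epochs=200, lr=0.1, lam=0.01):
    """Linear SVM via subgradient descent on the L2-regularized hinge loss."""
    w, b = [0.0] * len(X[0]), 0.0
    for _ in range(epochs):
        for xi, yi in zip(X, y):
            violated = yi * score(w, b, xi) < 1.0   # inside the margin?
            w = [wj * (1.0 - lr * lam) for wj in w]  # L2 shrinkage step
            if violated:
                w = [wj + lr * yi * xj for wj, xj in zip(w, xi)]
                b += lr * yi
    return w, b

def predict(w, b, x):
    """Return +1 (assumed: false ad) or -1 (assumed: real ad)."""
    return 1 if score(w, b, x) >= 0.0 else -1

# Hypothetical toy data, invented for this sketch only.
ADS = [
    "miracle cure guaranteed no side effects",
    "guaranteed cure for all diseases instantly",
    "secret formula cure cancer guaranteed",
    "consult your doctor before use",
    "read the label and follow dosage instructions",
    "approved indication see package insert dosage",
]
LABELS = [1, 1, 1, -1, -1, -1]

FEATURES = select_features(ADS, LABELS, k=4)
W, B = train_linear_svm(build_matrix(ADS, FEATURES), LABELS)

NEW_ADS = ["guaranteed miracle cure", "follow dosage instructions from your doctor"]
PREDICTIONS = [predict(W, B, row) for row in build_matrix(NEW_ADS, FEATURES)]
print(FEATURES, PREDICTIONS)
```

In practice a library implementation of the vectorizer, the selection statistic, and the SVM would replace these hand-written pieces; the sketch only shows how the stages named in the abstract fit together.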


Bibliographic record

Source: Mobile Information Systems, 2021, issue a, 13 pages
Authors: Jie Wan; Xue Cao; Kun Yao; Donghui Yang; E. Peng; Yong Cao
Original format: PDF

