Effects of Pre-trained Word Embeddings on Text-based Deception Detection


Abstract

With e-commerce transforming the way individuals and businesses trade, online reviews have become a major source of information for consumers. With 93% of shoppers relying on online reviews to make their purchasing decisions, the credibility of reviews deserves serious consideration. While detecting deceptive text has proven to be a challenge for humans, it has been shown that machines can be better at distinguishing between truthful and deceptive online information by applying pattern analysis to a large amount of data. In this work, we look at the use of several popular pre-trained word embeddings (Word2Vec, GloVe, fastText) with deep neural network models (CNN, BiLSTM, CNN-BiLSTM) to determine the influence of word embeddings on the accuracy of deception detection. Some pre-trained word embeddings were shown to adversely affect classification accuracy when compared to training the model on text embeddings learned from the domain-specific data. Through the combination of CNN and BiLSTM along with the fastText pre-trained word embeddings, we were able to achieve an accuracy of 88.8 percent on the hotel review dataset published by Ott et al. in 2011.
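As an illustration of what "using pre-trained word embeddings" usually involves, the sketch below builds the weight table for an embedding layer from a dictionary of pre-trained vectors. It is a minimal sketch under stated assumptions, not the authors' implementation: the function name `build_embedding_matrix`, the toy three-dimensional vectors, and the choice to fill out-of-vocabulary words with small random values are all hypothetical and are not taken from the paper. Real Word2Vec, GloVe, or fastText vectors would be loaded from their published files instead of the toy dictionary.

```python
import random


def build_embedding_matrix(vocab, pretrained, dim, seed=0):
    """Build an embedding table with one row per vocabulary index.

    vocab:      dict mapping word -> integer index (indices start at 1)
    pretrained: dict mapping word -> list of floats (hypothetical toy vectors)
    dim:        embedding dimensionality

    Row 0 is reserved for padding and stays all zeros. Words found in
    `pretrained` copy their vector; all other words receive small random
    values (an assumed fallback, one of several common choices).
    Returns (matrix, hits), where hits counts words found in `pretrained`.
    """
    rng = random.Random(seed)
    matrix = [[0.0] * dim for _ in range(len(vocab) + 1)]
    hits = 0
    for word, idx in vocab.items():
        vec = pretrained.get(word)
        if vec is not None and len(vec) == dim:
            matrix[idx] = list(vec)
            hits += 1
        else:
            matrix[idx] = [rng.uniform(-0.05, 0.05) for _ in range(dim)]
    return matrix, hits


# Toy data standing in for real pre-trained vectors and a review vocabulary.
pretrained = {"hotel": [0.1, 0.2, 0.3], "room": [0.4, 0.5, 0.6]}
vocab = {"hotel": 1, "room": 2, "amazingg": 3}  # misspelling is out of vocabulary

matrix, hits = build_embedding_matrix(vocab, pretrained, dim=3)
coverage = hits / len(vocab)  # share of the vocabulary covered by pre-trained vectors
```

The resulting table would typically initialize the embedding layer that feeds the CNN, BiLSTM, or CNN-BiLSTM classifier. The `coverage` value is one rough way to check how well a given set of pre-trained vectors matches a domain-specific vocabulary such as hotel reviews.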


Bibliographic details

Source
International Conference on Dependable, Autonomic and Secure Computing; International Conference on Pervasive Intelligence and Computing; International Conference on Cloud and Big Data Computing; Cyber Science and Technology Congress | 2020 | pp. 437-443 | 7 pages
Authors
David Nam; Jerin Yasmin; Farhana Zulkernine
Original format: PDF
Keywords
Artificial Neural Network; Natural Language Processing; Deception Detection; Online Reviews; Deep Learning; Convolutional Neural Network; Long Short Term Memory; Word Embeddings


