International Journal of Web Information Systems

On verifying the authenticity of e-commercial crawling data by a semi-crosschecking method



Abstract

Purpose - Data crawling in e-commerce for market research often comes with the risk of poor authenticity due to modification attacks. The purpose of this paper is to propose a novel data authentication model for such systems.

Design/methodology/approach - The data modification problem requires careful examination, in which data are re-collected and the two datasets are overlapped to verify their reliability. The approach uses different anomaly detection techniques to determine which data are potentially fraudulent and should be re-collected. The paper also proposes a data selection model that combines importance weights with anomaly detection. The goal is to significantly reduce the amount of data in need of verification while still guaranteeing high authenticity. Empirical experiments are conducted on real-world datasets to evaluate the efficiency of the proposed scheme.

Findings - The authors examine several techniques for detecting anomalies in user and product data, achieving accuracy of approximately 80 per cent. The integration with the weight selection model is shown to detect more than 80 per cent of existing fraudulent records while avoiding the accidental inclusion of legitimate ones, especially when the proportion of frauds is high.

Originality/value - With the rapid development of e-commerce, fraud detection on e-commerce data and in Web crawling systems is a new and necessary area of research. This paper contributes a novel approach to the data authentication problem in crawling systems, which has not been studied much.
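The pipeline the abstract describes, scoring crawled records for anomalies, selecting the highest-weight suspicious ones for re-crawling, and crosschecking the overlap, can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: the z-score detector stands in for the paper's unspecified anomaly techniques, and the `price` field, weights, and threshold are hypothetical.

```python
def anomaly_scores(values):
    """Score each value by its absolute z-score (a simple stand-in
    for the paper's anomaly detection techniques)."""
    mean = sum(values) / len(values)
    var = sum((v - mean) ** 2 for v in values) / len(values)
    std = var ** 0.5 or 1.0  # avoid division by zero on constant data
    return [abs(v - mean) / std for v in values]

def select_for_recrawl(records, weights, threshold=1.5):
    """Flag records whose anomaly score exceeds the threshold and
    rank them by importance weight, so only the most valuable
    suspicious records are re-collected."""
    scores = anomaly_scores([r["price"] for r in records])
    flagged = [i for i, s in enumerate(scores) if s > threshold]
    return sorted(flagged, key=lambda i: weights[i], reverse=True)

def crosscheck(original, recrawled):
    """A field passes the semi-crosscheck if the re-crawled value
    matches the originally crawled one."""
    return {k: original[k] == recrawled.get(k) for k in original}
```

Only the flagged subset is crawled a second time, which is how the scheme reduces verification cost while the overlap comparison still catches modified records.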
