A Generative Adversarial Learning Framework for Breaking Text-Based CAPTCHA in the Dark Web

Abstract

Cyber threat intelligence (CTI) necessitates automated monitoring of dark web platforms (e.g., Dark Net Markets and carding shops) on a large scale. While there are existing methods for collecting data from the surface web, large-scale dark web data collection is commonly hindered by anti-crawling measures. Text-based CAPTCHA, which requires the user to recognize a combination of hard-to-read characters, is the most prohibitive of these measures. Dark web CAPTCHA patterns are intentionally designed with additional background noise and variable character length to prevent automated CAPTCHA breaking. Existing CAPTCHA breaking methods cannot remedy these challenges and are therefore not applicable to the dark web. In this study, we propose a novel framework for breaking text-based CAPTCHA in the dark web. The proposed framework utilizes a Generative Adversarial Network (GAN) to counteract dark web-specific background noise and leverages an enhanced character segmentation algorithm. Our proposed method was evaluated on both benchmark and dark web CAPTCHA testbeds. The proposed method significantly outperformed the state-of-the-art baseline methods on all datasets, achieving a success rate of over 92.08% on dark web testbeds. Our research enables the CTI community to develop advanced capabilities for large-scale dark web monitoring.

Bibliographic details

Source: IEEE International Conference on Intelligence and Security Informatics, 2020, pp. 1-6 (6 pages)
Authors: Ning Zhang; Mohammadreza Ebrahimi; Weifeng Li; Hsinchun Chen
Original format: PDF
Keywords: automated CAPTCHA breaking; dark web; generative adversarial networks; cyber threat intelligence

Similar literature

1. End-to-end attack on text-based CAPTCHAs based on cycle-consistent generative adversarial network [J]. Li Chunhui, Chen Xingshu, Wang Haizhou. Neurocomputing, 2021, No. Apra14.
2. A generative vision model that trains with high data efficiency and breaks text-based CAPTCHAs [J]. George Dileep, Lehrach Wolfgang, Kansky Ken. Science, 2017, No. 6368.
3. Using Generative Adversarial Networks to Break and Protect Text Captchas [J]. Ye Guixin, Tang Zhanyong, Fang Dingyi. ACM Transactions on Privacy and Security, 2020, No. 2.
4. A Text-Based CAPTCHA Cracking System with Generative Adversarial Networks [C]. Fan Liu, Zewen Li, Xueyi Li. IEEE International Symposium on Multimedia, 2018.
5. Stacked Generative Adversarial Networks for Learning Additional Features of Image Segmentation Maps [D]. Burke, Matthew. 2020.
6. Gaze in the Dark: Gaze Estimation in a Low-Light Environment with Generative Adversarial Networks [O]. Jung-Hwa Kim, Jin-Woo Jeong. 2020.
7. DIVINE: A Generative Adversarial Imitation Learning Framework for Knowledge Graph Reasoning [O]. Ruiping Li, Xiang Cheng. 2019.
