IEEE International Conference on Intelligence and Security Informatics

A Generative Adversarial Learning Framework for Breaking Text-Based CAPTCHA in the Dark Web

Abstract

Cyber threat intelligence (CTI) requires automated, large-scale monitoring of dark web platforms (e.g., Dark Net Markets and carding shops). While methods exist for collecting data from the surface web, large-scale dark web data collection is commonly hindered by anti-crawling measures. Text-based CAPTCHA is the most prohibitive of these measures: it requires the user to recognize a combination of hard-to-read characters. Dark web CAPTCHA patterns are intentionally designed with additional background noise and variable character length to prevent automated CAPTCHA breaking. Existing CAPTCHA-breaking methods cannot overcome these challenges and are therefore not applicable to the dark web. In this study, we propose a novel framework for breaking text-based CAPTCHA in the dark web. The proposed framework uses a Generative Adversarial Network (GAN) to counteract dark web-specific background noise and leverages an enhanced character segmentation algorithm. The proposed method was evaluated on both benchmark and dark web CAPTCHA testbeds. It significantly outperformed the state-of-the-art baseline methods on all datasets, achieving a success rate of over 92.08% on dark web testbeds. Our research enables the CTI community to develop advanced capabilities for large-scale dark web monitoring.
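The abstract does not describe the enhanced character segmentation algorithm, so the following is only an illustrative sketch of why segmentation matters for variable-length CAPTCHA, not the authors' method. It assumes the background noise has already been removed (the role the framework assigns to the GAN) and that the image has been binarized, and it uses a classic vertical projection profile as a stand-in technique. All names (`column_profile`, `segment_columns`, `crop`, `demo`) are hypothetical.

```python
from typing import List, Tuple

# Assumption: a denoised, binarized image given as rows of 0 (background) / 1 (ink).
Image = List[List[int]]


def column_profile(img: Image) -> List[int]:
    """Count ink pixels in each column (the vertical projection profile)."""
    return [sum(col) for col in zip(*img)]


def segment_columns(img: Image, min_ink: int = 1, min_width: int = 1) -> List[Tuple[int, int]]:
    """Return half-open (start, end) column ranges that contain ink.

    Each range is a candidate character, so the number of ranges gives the
    CAPTCHA length without knowing it in advance.
    """
    spans: List[Tuple[int, int]] = []
    start = None
    for x, ink in enumerate(column_profile(img)):
        if ink >= min_ink and start is None:
            start = x  # entering an inked run
        elif ink < min_ink and start is not None:
            if x - start >= min_width:
                spans.append((start, x))  # leaving an inked run
            start = None
    if start is not None:
        width = len(img[0])
        if width - start >= min_width:
            spans.append((start, width))  # run touches the right edge
    return spans


def crop(img: Image, span: Tuple[int, int]) -> Image:
    """Cut one candidate character out of the image."""
    start, end = span
    return [row[start:end] for row in img]


# Toy 3x10 image containing three separated glyphs.
demo: Image = [
    [0, 1, 1, 0, 0, 1, 0, 0, 1, 1],
    [0, 1, 0, 0, 0, 1, 0, 0, 1, 0],
    [0, 1, 1, 0, 0, 1, 0, 0, 1, 1],
]

spans = segment_columns(demo)
print(len(spans), spans)
```

A plain projection profile fails when characters touch or when residual noise bridges the gaps between them, which is presumably why a denoising stage and a more robust segmentation step are needed for dark web CAPTCHA.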
