Journal: Technological Forecasting and Social Change

Programmers' de-anonymization using a hybrid approach of abstract syntax tree and deep learning



Abstract

Source Code Authorship Attribution (SCAA) directly challenges the privacy and anonymity of developers. It is nevertheless important for identifying malicious authors and tracing the origin of an attack. In this paper, we propose Source Code Authorship Attribution using Abstract Syntax Tree (SCAA-AST) for efficient classification of programmers. First, hierarchical AST features are generated from different source code samples. Second, preprocessing techniques are used to obtain useful features free of noisy data. Third, Term Frequency-Inverse Document Frequency (TF-IDF) weighting is used to emphasize the significance of each feature. Fourth, the Adaptive Synthetic (ADASYN) oversampling method is used to address the class imbalance problem. Finally, a deep learning model is built with the TensorFlow framework through the Keras API to classify programming authors. The model is further configured with a dropout layer, learning rate, loss and activation functions, and dense layers to improve the classification results. The results show that the approach outperforms existing techniques in classification accuracy.
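The first and third steps of the abstract (AST feature generation and TF-IDF weighting) can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration, not the paper's implementation: the abstract does not say which languages or parser were used, so Python's standard `ast` module stands in; the feature definition (node-type counts plus parent>child edges), the helper names `ast_features` and `tfidf`, the idf variant `ln(N / df) + 1`, and the two toy samples are all hypothetical.

```python
import ast
import math
from collections import Counter
from typing import Dict, List


def ast_features(source: str) -> Counter:
    """Count node types and parent>child edges in the AST of `source`.

    Hypothetical feature definition; the paper's exact features are not
    given in the abstract.
    """
    tree = ast.parse(source)
    counts: Counter = Counter()
    for node in ast.walk(tree):
        parent = type(node).__name__
        counts[parent] += 1
        # Parent>child edges keep some of the tree's hierarchy.
        for child in ast.iter_child_nodes(node):
            counts[f"{parent}>{type(child).__name__}"] += 1
    return counts


def tfidf(docs: List[Counter]) -> List[Dict[str, float]]:
    """Weight each feature by tf * idf, with idf = ln(N / df) + 1 (assumed variant)."""
    n_docs = len(docs)
    df: Counter = Counter()
    for doc in docs:
        df.update(doc.keys())  # document frequency: one count per document
    weighted = []
    for doc in docs:
        total = sum(doc.values())
        weighted.append({
            feat: (count / total) * (math.log(n_docs / df[feat]) + 1.0)
            for feat, count in doc.items()
        })
    return weighted


# Two toy samples solving the same task in different styles (illustrative only).
samples = [
    "total = 0\nfor x in range(10):\n    total += x\n",
    "total = sum(x for x in range(10))\n",
]
vectors = tfidf([ast_features(s) for s in samples])
```

In this sketch, features that appear in only one sample (such as `For` or `GeneratorExp`) receive a higher idf factor than features shared by both (such as `Module`), which is the sense in which TF-IDF emphasizes distinguishing features. The later stages named in the abstract, ADASYN oversampling and the Keras classifier, are not covered here.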
