
2024 | Original Paper | Book Chapter

From Deconstruction to Reconstruction: A Plug-In Module for Diffusion-Based Purification of Adversarial Examples

Authors: Erjin Bao, Ching-Chun Chang, Huy H. Nguyen, Isao Echizen

Published in: Digital Forensics and Watermarking

Publisher: Springer Nature Singapore


Abstract

As reliance on AI technologies grows, adversarial example attacks have become a mounting concern, underscoring the need for robust defenses that protect AI systems from malicious input manipulation. In this paper, we introduce a computationally efficient plug-in module that integrates seamlessly with advanced diffusion models for purifying adversarial examples. Inspired by the concept of deconstruction and reconstruction (DR), the module decomposes an input image into foundational visual features that are expected to be robust against adversarial perturbations and then rebuilds the image with an image-to-image transformation neural network. Combined with an advanced diffusion model, the module attains state-of-the-art performance in purifying adversarial examples while preserving high classification accuracy on clean image samples. Performance is evaluated on representative neural network classifiers pre-trained and fine-tuned on large-scale datasets, and an ablation study analyses the module's contribution to the effectiveness of diffusion-based purification. Notably, the module incurs only minimal computational overhead during the purification process.
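As a rough illustration of the pipeline the abstract describes, the sketch below shows how a deconstruction-reconstruction (DR) module might wrap an off-the-shelf diffusion purifier. Everything here is hypothetical: the choice of an edge map as the "foundational visual features", the small image-to-image reconstructor, and the `diffusion_purify` callable are assumptions made for illustration, not the authors' implementation.

```python
# Hypothetical sketch of a deconstruction-reconstruction (DR) plug-in
# feeding a diffusion-based purifier. The feature choice, network shape,
# and purifier interface are assumptions, not the paper's implementation.
import torch
import torch.nn as nn
import torch.nn.functional as F


class Deconstruct(nn.Module):
    """Extract coarse, perturbation-robust features (here: a fixed Sobel edge map)."""

    def __init__(self):
        super().__init__()
        kx = torch.tensor([[-1., 0., 1.], [-2., 0., 2.], [-1., 0., 1.]])
        self.register_buffer("kx", kx.view(1, 1, 3, 3))
        self.register_buffer("ky", kx.t().contiguous().view(1, 1, 3, 3))

    def forward(self, x):                          # x: (B, 3, H, W) in [0, 1]
        gray = x.mean(dim=1, keepdim=True)         # collapse colour; keep structure
        gx = F.conv2d(gray, self.kx, padding=1)
        gy = F.conv2d(gray, self.ky, padding=1)
        return torch.sqrt(gx ** 2 + gy ** 2 + 1e-6)  # edge-magnitude map


class Reconstruct(nn.Module):
    """Small image-to-image network mapping feature maps back to RGB images."""

    def __init__(self, hidden=32):
        super().__init__()
        self.net = nn.Sequential(
            nn.Conv2d(1, hidden, 3, padding=1), nn.ReLU(),
            nn.Conv2d(hidden, hidden, 3, padding=1), nn.ReLU(),
            nn.Conv2d(hidden, 3, 3, padding=1), nn.Sigmoid(),
        )

    def forward(self, feats):
        return self.net(feats)


def purify(x, reconstructor, diffusion_purify):
    """DR module followed by a (hypothetical) diffusion purifier callable."""
    feats = Deconstruct().to(x.device)(x)   # deconstruction: discard fragile detail
    coarse = reconstructor(feats)           # reconstruction: rebuild a clean image
    return diffusion_purify(coarse)         # hand off to the diffusion model


# Example usage (hypothetical): x_adv is a batch of adversarial images,
# dr_net a trained Reconstruct instance, purifier an existing diffusion purifier.
# x_purified = purify(x_adv, dr_net, purifier)
```

In the setting the abstract outlines, the purified output would then be passed to the pre-trained classifier; the sketch only illustrates the data flow of deconstruction, reconstruction, and diffusion-based purification.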


Metadata
Title: From Deconstruction to Reconstruction: A Plug-In Module for Diffusion-Based Purification of Adversarial Examples
Authors: Erjin Bao, Ching-Chun Chang, Huy H. Nguyen, Isao Echizen
Copyright Year: 2024
Publisher: Springer Nature Singapore
DOI: https://doi.org/10.1007/978-981-97-2585-4_4
