
2024 | Original Paper | Book Chapter

SALSA: Salience-Based Switching Attack for Adversarial Perturbations in Fake News Detection Models

Authors: Chahat Raj, Anjishnu Mukherjee, Hemant Purohit, Antonios Anastasopoulos, Ziwei Zhu

Published in: Advances in Information Retrieval

Publisher: Springer Nature Switzerland


Abstract

Despite advances in fake news detection algorithms, recent research reveals that machine learning-based fake news detection models remain vulnerable to carefully crafted adversarial attacks. In this landscape, traditional methods, which often rely on text perturbations or heuristic-based approaches, have proven insufficient, revealing a critical need for more nuanced and context-aware strategies to enhance the robustness of fake news detection. Our research identifies and addresses three critical areas: creating subtle perturbations, preserving core information while modifying sentence structure, and incorporating inherent interpretability. We propose SALSA, an adversarial Salience-based Switching Attack strategy that harnesses salient words, using similarity-based switching to address the shortcomings of traditional adversarial attack methods. Using SALSA, we perform a two-way attack: misclassifying real news as fake and fake news as real. Due to the absence of standardized metrics for evaluating adversarial attacks in fake news detection, we further propose three new evaluation metrics to gauge the attack's success. Finally, we validate the transferability of our proposed attack strategy across attacker and victim models, demonstrating our approach's broad applicability and potency. Code and data are available at https://github.com/iamshnoo/salsa.
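The abstract describes a general pattern: estimate which words are most salient to a detector's decision, then switch those words for similar replacements. The following is a hypothetical minimal sketch of that pattern, not the authors' implementation: a toy trigger-word scorer stands in for a fake-news classifier, salience is estimated by leave-one-out score change, and the substitution table is an invented stand-in for the paper's similarity-based switching.

```python
# Hypothetical sketch of a salience-based switching attack.
# classifier_score is a toy stand-in for a detector's "fake" probability;
# the substitution table stands in for similarity-based word switching.

def classifier_score(words):
    """Toy detector: fraction of words that are sensational triggers."""
    triggers = {"shocking", "miracle", "exposed"}
    return sum(1 for w in words if w in triggers) / max(len(words), 1)

def salience(words):
    """Leave-one-out salience: score drop when each word is removed."""
    base = classifier_score(words)
    return [base - classifier_score(words[:i] + words[i + 1:])
            for i in range(len(words))]

def switch_attack(sentence, substitutions):
    """Replace the most salient word that has a similar substitute."""
    words = sentence.lower().split()
    sal = salience(words)
    # Visit words from most to least salient; switch the first one
    # for which a replacement is available, then stop (a subtle,
    # single-word perturbation).
    for i in sorted(range(len(words)), key=lambda i: sal[i], reverse=True):
        if words[i] in substitutions:
            words[i] = substitutions[words[i]]
            break
    return " ".join(words)

subs = {"shocking": "surprising", "miracle": "remarkable",
        "exposed": "described"}
print(switch_attack("Shocking miracle cure exposed by doctors", subs))
# → surprising miracle cure exposed by doctors
```

In this sketch the single switch lowers the toy detector's score while leaving most of the sentence intact, which mirrors the paper's stated goals of subtle perturbation and core-information preservation; the real attack would use model-derived salience (e.g. attribution scores) and embedding similarity rather than a fixed table.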


Metadata
Copyright year: 2024
DOI: https://doi.org/10.1007/978-3-031-56069-9_3
