nach oben

Erschienen in:

2024 | OriginalPaper | Buchkapitel

Linguistic Steganalysis Based on Clustering and Ensemble Learning in Imbalanced Scenario

verfasst von : Shengnan Guo, Xuekai Chen, Zhuang Wang, Zhongliang Yang, Linna Zhou

Erschienen in: Digital Forensics and Watermarking

Verlag: Springer Nature Singapore

Einloggen

Aktivieren Sie unsere intelligente Suche, um passende Fachinhalte oder Patente zu finden.

search-config

KI-gestützte Suche

Aus

Abstract

With the rapid development of the Internet, more and more methods of text steganography have emerged. However, these methods are easily abused in public networks for malicious purposes, which poses a great threat to cyberspace security. At present, a large number of text steganalysis methods have been proposed to game with text steganography. However, existing methods typically assume a balanced class distribution. In reality, stego texts are far less than cover texts. How to accurately detect stego texts in massive texts becomes a challenge. In this paper, we propose a text steganalysis method based on an under-sample method and ensemble learning in imbalanced scenarios. Specifically, we introduce the thinking of clustering to under-sample the majority class samples (cover texts) based on the detection difficulty of the samples, in order to select samples with rich information. Ensemble learning is then used to ensemble the detection results of multiple base classifiers and guide the sampling process. We designed several experiments to test the detection performance of the proposed model. Experimental results show that the proposed model can effectively compensate for the deficiencies of existing methods, even in highly imbalanced datasets, the model can still detect stego texts effectively.

Sie haben noch keine Lizenz? Dann Informieren Sie sich jetzt über unsere Produkte:

Springer Professional "Wirtschaft+Technik"

Online-Abonnement

Mit Springer Professional "Wirtschaft+Technik" erhalten Sie Zugriff auf:

über 102.000 Bücher
über 537 Zeitschriften

aus folgenden Fachgebieten:

Automobil + Motoren
Bauwesen + Immobilien
Business IT + Informatik
Elektrotechnik + Elektronik
Energie + Nachhaltigkeit
Finance + Banking
Management + Führung
Marketing + Vertrieb
Maschinenbau + Werkstoffe
Versicherung + Risiko

Jetzt Wissensvorsprung sichern!

Jetzt informieren

Springer Professional "Technik"

Online-Abonnement

Mit Springer Professional "Technik" erhalten Sie Zugriff auf:

über 67.000 Bücher
über 390 Zeitschriften

aus folgenden Fachgebieten:

Automobil + Motoren
Bauwesen + Immobilien
Business IT + Informatik
Elektrotechnik + Elektronik
Energie + Nachhaltigkeit
Maschinenbau + Werkstoffe

Jetzt Wissensvorsprung sichern!

Jetzt informieren

Springer Professional "Wirtschaft"

Online-Abonnement

Mit Springer Professional "Wirtschaft" erhalten Sie Zugriff auf:

über 67.000 Bücher
über 340 Zeitschriften

aus folgenden Fachgebieten:

Bauwesen + Immobilien
Business IT + Informatik
Finance + Banking
Management + Führung
Marketing + Vertrieb
Versicherung + Risiko

Jetzt Wissensvorsprung sichern!

Jetzt informieren

Vorheriges Kapitel VStego800K: Large-Scale Steganalysis Dataset for Streaming Voice

Barandela, R., Valdovinos, R.M., Sánchez, J.S.: New applications of ensembles of classifiers. Pattern Anal. Appl. 6, 245–256 (2003)MathSciNetCrossRef

Chen, Z., Huang, L., Miao, H., Yang, W., Meng, P.: Steganalysis against substitution-based linguistic steganography based on context clusters. Comput. Electr. Eng. 37(6), 1071–1081 (2011)CrossRef

Devlin, J., Chang, M.W., Lee, K., Toutanova, K.: BERT: pre-training of deep bidirectional transformers for language understanding. arXiv preprint arXiv:1810.04805 (2018)

Freund, Y.: Boosting a weak learning algorithm by majority. Inf. Comput. 121(2), 256–285 (1995)MathSciNetCrossRef

Galar, M., Fernández, A., Barrenechea, E., Herrera, F.: EUSBoost: enhancing ensembles for highly imbalanced data-sets by evolutionary undersampling. Pattern Recogn. 46(12), 3460–3471 (2013)CrossRef

Gao, L., Zhang, L., Liu, C., Wu, S.: Handling imbalanced medical image data: a deep-learning-based one-class classification approach. Artif. Intell. Med. 108, 101935 (2020)CrossRef

He, H., Bai, Y., Garcia, E.A., Li, S.: ADASYN: adaptive synthetic sampling approach for imbalanced learning. In: 2008 IEEE International Joint Conference on Neural Networks (IEEE World Congress on Computational Intelligence), pp. 1322–1328. IEEE (2008)

Huang, Y.F., Tang, S., Yuan, J.: Steganography in inactive frames of VoIP streams encoded by source codec. IEEE Trans. Inf. Forensics Secur. 6(2), 296–306 (2011)CrossRef

Johnson, N.F., Sallee, P.A.: Detection of hidden information, covert channels and information flows. In: Wiley Handbook of Science and Technology for Homeland Security, pp. 1–37 (2008)

10.

Laurikkala, J.: Improving identification of difficult small classes by balancing class distribution. In: Quaglini, S., Barahona, P., Andreassen, S. (eds.) AIME 2001. LNCS, vol. 2101, pp. 63–66. Springer, Heidelberg (2001). https://doi.org/10.1007/3-540-48229-6_9CrossRef

11.

Li, S., Wang, J., Liu, P.: Detection of generative linguistic steganography based on explicit and latent text word relation mining using deep learning. IEEE Trans. Dependable Secure Comput. 20(2), 1476–1487 (2022)CrossRef

12.

Liu, X.Y., Wu, J., Zhou, Z.H.: Exploratory undersampling for class-imbalance learning. IEEE Trans. Syst. Man Cybern. Part B (Cybern.) 39(2), 539–550 (2008)

13.

Liu, Y., Chawla, N.V., Harper, M.P., Shriberg, E., Stolcke, A.: A study in machine learning from imbalanced data for sentence boundary detection in speech. Comput. Speech Lang. 20(4), 468–494 (2006)CrossRef

14.

Liu, Z., Wei, P., Jiang, J., Cao, W., Bian, J., Chang, Y.: MESA: boost ensemble imbalanced learning with meta-sampler. In: Advances in Neural Information Processing Systems, vol. 33, pp. 14463–14474 (2020)

15.

Niu, Y., Wen, J., Zhong, P., Xue, Y.: A hybrid R-BILSTM-C neural network based text steganalysis. IEEE Sig. Process. Lett. 26(12), 1907–1911 (2019)CrossRef

16.

Samanta, S., Dutta, S., Sanyal, G.: A real time text steganalysis by using statistical method. In: 2016 IEEE International Conference on Engineering and Technology (ICETECH), pp. 264–268. IEEE (2016)

17.

Seiffert, C., Khoshgoftaar, T.M., Van Hulse, J., Napolitano, A.: RUSBoost: a hybrid approach to alleviating class imbalance. IEEE Trans. Syst. Man Cybern.-Part A Syst. Hum. 40(1), 185–197 (2009)CrossRef

18.

Sun, B., Chen, H., Wang, J., Xie, H.: Evolutionary under-sampling based bagging ensemble method for imbalanced data classification. Front. Comput. Sci. 12, 331–350 (2018)CrossRef

19.

Sun, Z., Song, Q., Zhu, X., Sun, H., Xu, B., Zhou, Y.: A novel ensemble method for classifying imbalanced data. Pattern Recogn. 48(5), 1623–1637 (2015)CrossRef

20.

Tang, W., Li, B., Tan, S., Barni, M., Huang, J.: CNN-based adversarial embedding for image steganography. IEEE Trans. Inf. Forensics Secur. 14(8), 2074–2087 (2019)CrossRef

21.

Wang, Y., Zhang, W., Li, W., Yu, X., Yu, N.: Non-additive cost functions for color image steganography based on inter-channel correlations and differences. IEEE Trans. Inf. Forensics Secur. 15, 2081–2095 (2019)CrossRef

22.

Wang, Y., Gan, W., Yang, J., Wu, W., Yan, J.: Dynamic curriculum learning for imbalanced data classification. In: Proceedings of the IEEE/CVF International Conference on Computer Vision, pp. 5017–5026 (2019)

23.

Wu, H., Yi, B., Ding, F., Feng, G., Zhang, X.: Linguistic steganalysis with graph neural networks. IEEE Sig. Process. Lett. 28, 558–562 (2021)CrossRef

24.

Xiang, L., Sun, X., Luo, G., Xia, B.: Linguistic steganalysis using the features derived from synonym frequency. Multimedia Tools Appl. 71, 1893–1911 (2014)CrossRef

25.

Yang, H., Bao, Y., Yang, Z., Liu, S., Huang, Y., Jiao, S.: Linguistic steganalysis via densely connected LSTM with feature pyramid. In: Proceedings of the 2020 ACM Workshop on Information Hiding and Multimedia Security, pp. 5–10 (2020)

26.

Yang, H., Cao, X.: Linguistic steganalysis based on meta features and immune mechanism. Chin. J. Electron. 19(4), 661–666 (2010)

27.

Yang, J., Yang, Z., Zhang, S., Tu, H., Huang, Y.: SeSy: linguistic steganalysis framework integrating semantic and syntactic features. IEEE Sig. Process. Lett. 29, 31–35 (2021)CrossRef

28.

Yang, Z.L., Guo, X.Q., Chen, Z.M., Huang, Y.F., Zhang, Y.J.: RNN-Stega: linguistic steganography based on recurrent neural networks. IEEE Trans. Inf. Forensics Secur. 14(5), 1280–1295 (2018)CrossRef

29.

Yang, Z., Du, X., Tan, Y., Huang, Y., Zhang, Y.J.: AAG-Stega: automatic audio generation-based steganography. arXiv preprint arXiv:1809.03463 (2018)

30.

Yang, Z., Huang, Y., Zhang, Y.J.: A fast and efficient text steganalysis method. IEEE Sig. Process. Lett. 26(4), 627–631 (2019)CrossRef

31.

Yang, Z., Huang, Y., Zhang, Y.J.: TS-CSW: text steganalysis and hidden capacity estimation based on convolutional sliding windows. Multimedia Tools Appl. 79, 18293–18316 (2020)CrossRef

32.

Zhang, S., Yang, Z., Yang, J., Huang, Y.: Provably secure generative linguistic steganography. arXiv preprint arXiv:2106.02011 (2021)

33.

Zhou, F., et al.: Dynamic self-paced sampling ensemble for highly imbalanced and class-overlapped data classification. Data Min. Knowl. Disc. 36(5), 1601–1622 (2022)MathSciNetCrossRef

34.

Ziegler, Z.M., Deng, Y., Rush, A.M.: Neural linguistic steganography. arXiv preprint arXiv:1909.01496 (2019)

35.

Zou, J., Yang, Z., Zhang, S., ur Rehman, S., Huang, Y.: High-performance linguistic steganalysis, capacity estimation and steganographic positioning. In: Zhao, X., Shi, Y.Q., Piva, A., Kim, H.J. (eds.) IWDW 2020. LNSC, vol. 12617, pp. 80–93. Springer, Cham (2021). https://doi.org/10.1007/978-3-030-69449-4_7CrossRef

Titel: Linguistic Steganalysis Based on Clustering and Ensemble Learning in Imbalanced Scenario
verfasst von: Shengnan Guo
Xuekai Chen
Zhuang Wang
Zhongliang Yang
Linna Zhou
Verlag: Springer Nature Singapore
Buch: Digital Forensics and Watermarking
Print ISBN: 978-981-9725-84-7

Electronic ISBN: 978-981-9725-85-4

Copyright-Jahr: 2024
DOI: https://doi.org/10.1007/978-981-97-2585-4_22

Springer Professional

Abstract

Bitte loggen Sie sich ein, um Zugang zu Ihrer Lizenz zu erhalten.

Sie haben noch keine Lizenz? Dann Informieren Sie sich jetzt über unsere Produkte:

Springer Professional "Wirtschaft+Technik"

Springer Professional "Technik"

Springer Professional "Wirtschaft"

Premium Partner