Journal of Hardware and Systems Security

15 March 2024

Guarding Against the Unknown: Deep Transfer Learning for Hardware Image-Based Malware Detection

Authors: Zhangying He, Houman Homayoun, Hossein Sayadi

Published in: Journal of Hardware and Systems Security

Abstract

Malware is an increasingly significant threat to computing systems, and detecting zero-day (unknown) malware is crucial to the security of modern systems. Zero-day attacks exploit software vulnerabilities that are not documented or known in the detection mechanism’s database, which makes them a particularly pressing challenge. Security researchers have recently shifted their focus toward the architecture of the underlying processors and have proposed hardware-based malware detection (HMD) countermeasures to address the shortcomings of software-based detection methods. HMD techniques apply standard machine learning (ML) algorithms to low-level processor events gathered from hardware performance counter (HPC) registers. While these techniques have shown promising results for detecting known malware, accurately recognizing zero-day malware remains an unsolved problem for existing HPC-based detection methods. Our comprehensive analysis reveals that standard ML classifiers are ineffective at identifying zero-day malware traces using HPC events. In response, we propose Deep-HMD, a multi-level, intelligent, and flexible approach based on deep neural networks and transfer learning for accurate zero-day malware detection using image-based hardware events. Deep-HMD first converts HPC-based malware and benign data into images and then employs a lightweight deep transfer learning methodology to achieve high malware detection performance in both known and unknown test scenarios. For a thorough analysis, three deep learning-based and nine standard ML algorithms are implemented and evaluated for hardware-based malware detection.
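The abstract does not specify how HPC readings are encoded as images, so the following is only a minimal sketch of one possible encoding, not the authors' method. It assumes a sample consists of exactly four HPC event counts, min-max scales each count with per-event bounds `lo` and `hi` (hypothetical values that would come from training data), and paints each scaled value into one quadrant of a grayscale image. The function name `hpc_sample_to_image`, the `tile` parameter, and the 2x2 quadrant layout are illustrative assumptions.

```python
import numpy as np

def hpc_sample_to_image(sample, lo, hi, tile=16):
    """Encode four HPC event counts as a (2*tile, 2*tile, 3) uint8 image.

    Hypothetical encoding: each event fills one quadrant with a gray level
    proportional to its min-max scaled count.
    """
    sample = np.asarray(sample, dtype=np.float64)
    lo = np.asarray(lo, dtype=np.float64)
    hi = np.asarray(hi, dtype=np.float64)
    if sample.shape != (4,):
        raise ValueError("expected exactly four HPC event values")

    # Min-max scale to [0, 1]; guard against a zero range for constant events.
    span = np.where(hi > lo, hi - lo, 1.0)
    scaled = np.clip((sample - lo) / span, 0.0, 1.0)

    # Quantize to 8-bit gray levels and arrange them as a 2x2 grid.
    levels = np.round(scaled * 255).astype(np.uint8)
    grid = levels.reshape(2, 2)

    # Expand every grid cell into a tile x tile block of identical pixels.
    gray = np.kron(grid, np.ones((tile, tile), dtype=np.uint8))

    # Replicate the gray channel three times, since image backbones
    # pretrained on RGB data typically expect three-channel input.
    return np.stack([gray, gray, gray], axis=-1)


# Made-up counts for illustration only.
image = hpc_sample_to_image(
    sample=[0, 40, 100, 20],
    lo=[0, 0, 0, 0],
    hi=[100, 100, 100, 100],
)
print(image.shape, image.dtype)
```

Images produced this way could then be fed to a pretrained image classifier for fine-tuning; which backbone and training schedule Deep-HMD uses is not stated in the abstract.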
The experimental results indicate that our proposed image-based malware detection solution outperforms all other evaluated methods, achieving 97% detection performance (measured by F-measure and area under the curve) for run-time zero-day malware detection using solely the top four performance counter events. Specifically, our novel approach outperforms the binarized MLP by 16% and the best classical ML algorithm by 18% in F-measure, while maintaining a minimal false positive rate and incurring no hardware redesign overhead.
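For readers unfamiliar with the reported metrics, the sketch below shows how F-measure and false positive rate follow from confusion-matrix counts. It assumes "F-measure" means the balanced F1 score (the harmonic mean of precision and recall), which the abstract does not state explicitly. The counts in the usage lines are invented for illustration and are not results from the paper.

```python
def f_measure(tp, fp, fn):
    """Balanced F-measure (F1): harmonic mean of precision and recall."""
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


def false_positive_rate(fp, tn):
    """Fraction of benign samples that are wrongly flagged as malware."""
    return fp / (fp + tn) if (fp + tn) else 0.0


# Invented counts: 90 malware samples caught, 10 missed,
# 10 benign samples flagged, 90 benign samples passed.
print(f_measure(tp=90, fp=10, fn=10))
print(false_positive_rate(fp=10, tn=90))
```

Area under the ROC curve additionally requires per-sample scores rather than hard decisions, so it is not derivable from these four counts alone.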

Title: Guarding Against the Unknown: Deep Transfer Learning for Hardware Image-Based Malware Detection
Authors: Zhangying He, Houman Homayoun, Hossein Sayadi
Publication date: 15 March 2024
Publisher: Springer International Publishing
Published in: Journal of Hardware and Systems Security
Print ISSN: 2509-3428
Electronic ISSN: 2509-3436
DOI: https://doi.org/10.1007/s41635-024-00146-6
