2024 | OriginalPaper | Chapter

Original Entry Point Detection Based on Graph Similarity

Authors: Thanh-Hung Pham, Mizuhito Ogawa

Published in: Foundations and Practice of Security

Publisher: Springer Nature Switzerland

Abstract

This paper proposes a method for packer identification and OEP (Original Entry Point) detection based on graph similarity over the control flow graphs (CFGs) of packed code. Packed code consists of an unpacking stub and a packed payload, which is restored to the original code once the unpacking stub has executed. The CFGs of packed code are generated by BE-PUM, a DSE (Dynamic Symbolic Execution) tool for x86-32/Windows. We define the template of an unpacking stub as the pair of the average of its Weisfeiler-Lehman histogram vectors and its tail jump sequence. Each template is computed packer-wise (i.e., from packed code produced by the same packer), which makes it easy to cover a new packer. We use a total of 71 samples packed by 12 packers. For unknown packed code, we search for the templates in its CFG generated by BE-PUM.

Among the candidate matches, the CFG fragment with the highest cosine similarity is regarded as the unpacking stub; this also identifies the packer used and gives the OEP as the destination of the jump from the stub's exit.
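The matching step described above can be illustrated with a minimal Python sketch. It is not the authors' implementation: the abstract does not specify the node labels, the number of Weisfeiler-Lehman iterations, or how candidate CFG fragments are enumerated, so the function names, the string-based relabeling, and the packer name `packerA` are all hypothetical. The tail jump sequence, which is the second component of a template, is omitted here.

```python
from collections import Counter
from math import sqrt


def wl_histogram(adj, labels, iterations=2):
    """Count node labels over rounds of Weisfeiler-Lehman relabeling.

    adj:    dict mapping each node to a list of its successor nodes
    labels: dict mapping each node to an initial string label
            (assumed here to be something like an instruction mnemonic)
    """
    hist = Counter(labels.values())
    current = dict(labels)
    for _ in range(iterations):
        refined = {}
        for node in adj:
            # New label = own label plus the sorted multiset of successor labels.
            neighbours = sorted(current[n] for n in adj[node])
            refined[node] = current[node] + "(" + ",".join(neighbours) + ")"
        current = refined
        hist.update(current.values())
    return hist


def average(histograms):
    """Element-wise mean of several histograms (one template vector per packer)."""
    total = Counter()
    for h in histograms:
        total.update(h)
    n = len(histograms)
    return {label: count / n for label, count in total.items()}


def cosine(u, v):
    """Cosine similarity of two sparse vectors stored as dicts."""
    dot = sum(u[k] * v.get(k, 0.0) for k in u)
    norm_u = sqrt(sum(x * x for x in u.values()))
    norm_v = sqrt(sum(x * x for x in v.values()))
    return dot / (norm_u * norm_v) if norm_u and norm_v else 0.0


def best_match(fragment_hists, templates):
    """Return (score, fragment_id, packer) for the most similar pair."""
    return max(
        (cosine(hist, template), fragment_id, packer)
        for fragment_id, hist in fragment_hists.items()
        for packer, template in templates.items()
    )
```

Under these assumptions, `templates` would map each packer name to the averaged histogram of its known stubs, and `best_match` would pick the fragment of an unknown sample's CFG that most resembles any template. Recovering the OEP would then require following the jump at that fragment's exit, which this sketch does not model.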

Our first experiment is performed on 700 non-malware samples (whose original payloads are also known) packed by the 12 packers above. The packer is correctly identified for 689 samples and the OEP is correctly detected for 688. We further apply the method to 1239 malware samples. Of these, 1089 are detected as packed by an unknown packer, and the remaining 150 are detected as packed by 11 of the 12 packers (all except TELOCK), with their OEPs detected. We conclude that our method is highly effective as long as we have access to an executable of the target packer from which to compute its templates.

Title: Original Entry Point Detection Based on Graph Similarity
Authors: Thanh-Hung Pham, Mizuhito Ogawa
Publisher: Springer Nature Switzerland
Book: Foundations and Practice of Security
Print ISBN: 978-3-031-57536-5
Electronic ISBN: 978-3-031-57537-2
Copyright Year: 2024
DOI: https://doi.org/10.1007/978-3-031-57537-2_22
