
2024 | OriginalPaper | Chapter

Original Entry Point Detection Based on Graph Similarity

Authors: Thanh-Hung Pham, Mizuhito Ogawa

Published in: Foundations and Practice of Security

Publisher: Springer Nature Switzerland


Abstract

This paper proposes a method for packer identification and OEP (Original Entry Point) detection based on graph similarity over the control flow graphs (CFGs) of packed code. Packed code consists of an unpacking stub and a packed payload; the payload is restored to the original code when the unpacking stub executes. The CFGs of packed code are generated by BE-PUM, a DSE (Dynamic Symbolic Execution) tool for x86-32/Windows. We define the template of an unpacking stub as the pair of the average of its Weisfeiler-Lehman histogram vectors and its tail jump sequence. Each template is computed packer-wise (i.e., from packed code produced by the same packer), which makes it easy to cover a new packer. We use a total of 71 samples packed by 12 packers. For unknown packed code, we search for the templates in its CFG generated by BE-PUM. Among the matches, the CFG fragment with the highest cosine similarity is regarded as the unpacking stub, which also identifies the packer used and gives the OEP as the jump destination from the stub's exit.
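The template-matching idea can be illustrated with a minimal Python sketch. It is not the authors' implementation: the toy CFGs, the use of instruction mnemonics as node labels, the choice of successor nodes as the neighborhood, and the helper names `wl_histogram`, `average`, and `cosine` are all assumptions made for illustration. The tail jump sequence and the search for fragments inside a larger CFG are not modeled here.

```python
from collections import Counter
from math import sqrt


def wl_histogram(labels, succ, iterations=2):
    """Count Weisfeiler-Lehman labels over a few refinement rounds.

    labels: dict node -> initial label (assumed here: an instruction mnemonic)
    succ:   dict node -> list of successor nodes in the CFG
    """
    current = dict(labels)
    hist = Counter(current.values())
    for _ in range(iterations):
        relabeled = {}
        for node, label in current.items():
            # New label = own label plus the sorted multiset of successor labels.
            neigh = sorted(current[m] for m in succ.get(node, []))
            relabeled[node] = label + "(" + ",".join(neigh) + ")"
        current = relabeled
        hist.update(current.values())
    return hist


def average(histograms):
    """Component-wise mean of sparse histogram vectors."""
    total = Counter()
    for h in histograms:
        total.update(h)
    return {k: v / len(histograms) for k, v in total.items()}


def cosine(u, v):
    """Cosine similarity of two sparse vectors stored as dicts."""
    dot = sum(x * v.get(k, 0.0) for k, x in u.items())
    nu = sqrt(sum(x * x for x in u.values()))
    nv = sqrt(sum(x * x for x in v.values()))
    return dot / (nu * nv) if nu and nv else 0.0


# Two hypothetical stubs produced by the same (made-up) packer.
stub_a = ({0: "pushad", 1: "mov", 2: "jnz", 3: "jmp"},
          {0: [1], 1: [2], 2: [1, 3], 3: []})
stub_b = ({0: "pushad", 1: "mov", 2: "jnz", 3: "popad", 4: "jmp"},
          {0: [1], 1: [2], 2: [1, 3], 3: [4], 4: []})
# A fragment that shares no labels with the stubs.
other = ({0: "push", 1: "call", 2: "ret"}, {0: [1], 1: [2], 2: []})

template = average([wl_histogram(*stub_a), wl_histogram(*stub_b)])
score_same = cosine(template, wl_histogram(*stub_a))
score_other = cosine(template, wl_histogram(*other))
print(score_same, score_other)
```

In this toy setting the fragment resembling the stubs scores higher than the unrelated one, which is the property the highest-similarity selection relies on.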
Our first experiment is performed on 700 non-malware samples (whose original payloads are also known) packed by the 12 packers above. The packer is correctly identified for 689 samples, and the OEP is correctly detected for 688. Further, we apply the method to 1239 malware samples. Of these, 1089 samples are detected as packed by an unknown packer, and the remaining 150 are detected as packed by 11 of the packers (all except TELOCK), with their OEPs detected. We conclude that our method is highly effective as long as we have access to an executable of a target packer to compute its templates.
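For orientation, the reported counts can be turned into rates with a few lines of Python. The figures are taken directly from the abstract; the variable names are illustrative only.

```python
# Counts reported in the abstract.
total_benign, packer_ok, oep_ok = 700, 689, 688
total_malware, unknown_packer, known_packer = 1239, 1089, 150

# The two malware groups partition the malware set.
assert unknown_packer + known_packer == total_malware

packer_rate = packer_ok / total_benign
oep_rate = oep_ok / total_benign
known_share = known_packer / total_malware

print(f"packer identification: {packer_rate:.1%}")  # 98.4%
print(f"OEP detection:         {oep_rate:.1%}")     # 98.3%
print(f"malware matched to a known packer: {known_share:.1%}")  # 12.1%
```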


Metadata
Title: Original Entry Point Detection Based on Graph Similarity
Authors: Thanh-Hung Pham, Mizuhito Ogawa
Copyright Year: 2024
DOI: https://doi.org/10.1007/978-3-031-57537-2_22
