2024 | Original Paper | Book Chapter

Enhancing Code Security Through Open-Source Large Language Models: A Comparative Study

Authors: Norah Ridley, Enrico Branca, Jadyn Kimber, Natalia Stakhanova

Published in: Foundations and Practice of Security

Publisher: Springer Nature Switzerland

Abstract

Significant advances in language processing are enabling new capabilities, including the ability to analyze code for weaknesses. Code security analysis is typically performed by tools that rely on known vulnerable patterns, which may not adequately represent the intricacies of vulnerabilities in real-world projects. Such tools can fail to detect non-standard weaknesses in code samples, potentially leading to a loss of personal and financial information for end users of the code. Using language models to detect weaknesses that currently available analysis tools would otherwise miss is a promising new avenue of vulnerability detection. In this research, we employ 25 different models to evaluate the security of code samples. Using an existing dataset of insecure code, we prompt each model to detect weaknesses in the vulnerable code. Our findings indicate that most models are ill-equipped to deal with insecure code. Through our analysis, we identify strategies for improving weakness detection using language models.

Footnotes
1. At the time of writing.
 
Metadata
Title
Enhancing Code Security Through Open-Source Large Language Models: A Comparative Study
Authors
Norah Ridley
Enrico Branca
Jadyn Kimber
Natalia Stakhanova
Copyright Year
2024
DOI
https://doi.org/10.1007/978-3-031-57537-2_15
