SAST Implementation for Evaluating LLM-Generated Code Quality using Prompt Engineering
Abstract
Keywords
References
S. Baltes et al., “Guidelines for Empirical Studies in Software Engineering involving Large Language Models,” Mar. 02, 2026, arXiv: arXiv:2508.15503. DOI: 10.48550/arXiv.2508.15503.
DORA Team, “State of AI-Assisted Software Development,” Google Cloud, Research Report, 2025. [Online]. Available: https://services.google.com/fh/files/misc/2025_state_of_ai_assisted_software_development.pdf
R. Pandey, P. Singh, R. Wei, and S. Shankar, “Transforming Software Development: Evaluating the Efficiency and Challenges of GitHub Copilot in Real-World Projects,” Jun. 25, 2024, arXiv: arXiv:2406.17910. DOI: 10.48550/arXiv.2406.17910.
I. D. Fagadau, L. Mariani, D. Micucci, and O. Riganelli, “Analyzing Prompt Influence on Automated Method Generation: An Empirical Study with Copilot,” in Proceedings of the 32nd IEEE/ACM International Conference on Program Comprehension, Apr. 2024, pp. 24–34. DOI: 10.1145/3643916.3644409.
S. Gao et al., “The Prompt Alchemist: Automated LLM-Tailored Prompt Optimization for Test Case Generation,” Jan. 02, 2025, arXiv: arXiv:2501.01329. DOI: 10.48550/arXiv.2501.01329.
Y. Fu et al., “Security Weaknesses of Copilot-Generated Code in GitHub Projects: An Empirical Study,” Feb. 06, 2025, arXiv: arXiv:2310.02059. DOI: 10.48550/arXiv.2310.02059.
“How We Ensure Less than 5% False Positive Rate • DeepSource,” DeepSource. [Online]. Available: https://deepsource.com/blog/how-deepsource-ensures-less-false-positives
A. S. Ami, K. Moran, D. Poshyvanyk, and A. Nadkarni, “‘False Negative -- that One is Going to Kill You’: Understanding Industry Perspectives of Static Analysis based Security Testing,” in 2024 IEEE Symposium on Security and Privacy (SP), May 2024, pp. 3979–3997. DOI: 10.1109/SP54263.2024.00019.
W. Peng, X. Wang, and Q. Wu, “ProxyWar: Dynamic Assessment of LLM Code Generation in Game Arenas,” Feb. 04, 2026, arXiv: arXiv:2602.04296. DOI: 10.48550/arXiv.2602.04296.
S. Schulhoff et al., “The Prompt Report: A Systematic Survey of Prompt Engineering Techniques,” Feb. 2025, DOI: 10.48550/arXiv.2406.06608.
P. Liu, W. Yuan, J. Fu, Z. Jiang, H. Hayashi, and G. Neubig, “Pre-train, Prompt, and Predict: A Systematic Survey of Prompting Methods in Natural Language Processing,” Jul. 28, 2021, arXiv: arXiv:2107.13586. DOI: 10.48550/arXiv.2107.13586.
H. Guo, “An Empirical Study of Prompt Mode in Code Generation based on ChatGPT,” Appl. Comput. Eng., Vol. 73, No. 1, pp. 69–76, Jul. 2024, DOI: 10.54254/2755-2721/73/20240367.
S. Anasuri, “Prompt Engineering Best Practices for Code Generation Tools,” Int. J. Emerg. Trends Comput. Sci. Inf. Technol., Vol. 5, No. 1, pp. 69–81, Mar. 2024, DOI: 10.63282/3050-9246.IJETCSIT-V5I1P108.
J. Wei et al., “Chain-of-Thought Prompting Elicits Reasoning in Large Language Models,” Jan. 10, 2023, arXiv: arXiv:2201.11903. DOI: 10.48550/arXiv.2201.11903.
H. Louatouate and M. Zeriouh, “Role-based Prompting Technique in Generative AI-Assisted Learning: A Student-Centered Quasi-Experimental Study,” J. Comput. Sci. Technol. Stud., Vol. 7, No. 2, pp. 130–145, Apr. 2025, DOI: 10.32996/jcsts.2025.7.2.12.
E. Basic and A. Giaretta, “From Vulnerabilities to Remediation: A Systematic Literature Review of LLMs in Code Security,” Apr. 14, 2025, arXiv: arXiv:2412.15004. DOI: 10.48550/arXiv.2412.15004.
A. Sabra, O. Schmitt, and J. Tyler, “Assessing the Quality and Security of AI-Generated Code: A Quantitative Analysis,” Aug. 20, 2025, arXiv: arXiv:2508.14727. DOI: 10.48550/arXiv.2508.14727.
Y. Liu, R. Widyasari, Y. Zhao, I. C. Irsan, J. Chen, and D. Lo, “Debt Behind the AI Boom: A Large-Scale Empirical Study of AI-Generated Code in the Wild,” 2026, arXiv. DOI: 10.48550/arXiv.2603.28592.
T. Hardiani, D. Wijayanto, and N. Latifah, “Data Security Analysis with OWASP Framework on Website XYZ,” CYBERNETICS, Vol. 6, No. 01, p. 10, Jul. 2022, DOI: 10.29406/cbn.v6i01.3953.
M. R. Syam Al’Am’yubi and D. Wijayanto, “Analisis Sistem Keamanan Website XYZ menggunakan Framework OWASP ZAP,” J. Ilmu Komput. JUIK, Vol. 3, No. 1, p. 1, Mar. 2023, DOI: 10.31314/juik.v3i1.1974.
D. Wijayanto and A. Firdonsyah, “Analisis Tingkat Resiko pada Website XYZ menggunakan Metode OWASP,” Digit. Transform. Technol., Vol. 4, No. 1, pp. 644–651, Aug. 2024, DOI: 10.47709/digitech.v4i1.4485.
M. Esposito, V. Falaschi, and D. Falessi, “An Extensive Comparison of Static Application Security Testing Tools,” Mar. 14, 2024, arXiv: arXiv:2403.09219. DOI: 10.48550/arXiv.2403.09219.
D. Tosi, “Studying the Quality of Source Code Generated by Different AI Generative Engines: An Empirical Evaluation,” Future Internet, Vol. 16, No. 6, p. 188, May 2024, DOI: 10.3390/fi16060188.
“GPT-5 Benchmarks and Analysis.” [Online]. Available: https://artificialanalysis.ai/articles/gpt-5-benchmarks-and-analysis
“GPT-5 High Reasoning Evaluation: A Major Leap in Coding Performance,” 16x Eval. [Online]. Available: https://eval.16x.engineer/blog/gpt-5-high-reasoning-coding-performance-evaluation
DOI: https://doi.org/10.32520/stmsi.v15i5.6395

This work is licensed under a Creative Commons Attribution-ShareAlike 4.0 International License.







