Security Analysis of LLM-Generated Web API Backends
| dc.contributor.author | Khan, Abdul Ali | |
| dc.contributor.department | fi=Tietotekniikan laitos|en=Department of Computing| | |
| dc.contributor.faculty | fi=Teknillinen tiedekunta|en=Faculty of Technology| | |
| dc.contributor.studysubject | fi=Tietotekniikka|en=Information and Communication Technology| | |
| dc.date.accessioned | 2026-04-29T22:47:15Z | |
| dc.date.issued | 2026-03-26 | |
| dc.description.abstract | The adoption of Large Language Models (LLMs) in software engineering is changing how code is written, but the security implications for complex systems remain unclear. Previous research has primarily evaluated the security of LLM-generated code on isolated code snippets. However, this narrow scope cannot capture the security risks that emerge in integrated web API backends. To address this, we designed a benchmarking framework derived from a data-driven triangulation of Stack Overflow discussions, GitHub implementations and OWASP security risks. This yielded five representative tasks: Authentication, Role-Based Access Control, File Uploads, Payment Processing and Webhook Handling. We evaluated three state-of-the-art models (GPT-5.2, DeepSeek V3.2 and Gemini 2.5 Pro) using a multi-layered assessment methodology combining static application security testing (SAST), dynamic application security testing (DAST), and manual penetration testing. Scanning the generated APIs for vulnerabilities with SAST tools mainly revealed configuration-level issues. However, manual penetration testing identified mass assignment exposures, insecure execution ordering, and server-side request forgery vectors. The model outputs also pointed to a disconnect between functional correctness and secure logic. The model that demonstrated high build success (DeepSeek V3.2) produced the most vulnerable code in our trials, introducing severe logic flaws such as broken object-level authorization (BOLA). By contrast, the model that struggled most with syntax (Gemini 2.5 Pro) defaulted to safer but less functional implementations. This thesis terms this pattern the human-in-the-loop paradox, in which syntactically sound and well-structured code generated by an LLM may conceal deep architectural vulnerabilities that are not revealed by build success or surface-level inspection. These findings indicate that relying on build success or static analysis alone may create an illusion of correctness. Based on the findings from the literature review and security analysis experiments, the thesis presents recommendations to help individuals and organizations use LLM-generated API backends more effectively. We suggest that code produced by current LLMs should not be treated as a trusted draft, but rather as untrusted input from an external system, much like user input in a web form. An illustrative sketch of one such flaw appears after this record. | |
| dc.format.extent | 144 | |
| dc.identifier.uri | https://www.utupub.fi/handle/11111/60085 | |
| dc.identifier.urn | URN:NBN:fi-fe2026041628161 | |
| dc.language.iso | eng | |
| dc.rights | fi=Julkaisu on tekijänoikeussäännösten alainen. Teosta voi lukea ja tulostaa henkilökohtaista käyttöä varten. Käyttö kaupallisiin tarkoituksiin on kielletty.|en=This publication is copyrighted. You may download, display and print it for Your own personal use. Commercial use is prohibited.| | |
| dc.rights.accessrights | avoin | |
| dc.subject | large language models | |
| dc.subject | software security | |
| dc.subject | web development | |
| dc.subject | security vulnerabilities | |
| dc.subject | static analysis | |
| dc.subject | dynamic analysis | |
| dc.subject | systematic literature review | |
| dc.title | Security Analysis of LLM-Generated Web API Backends | |
| dc.type.ontasot | fi=Diplomityö|en=Master's thesis| | |
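
To make the abstract's central finding concrete, the sketch below illustrates the kind of flaw it calls broken object-level authorization (BOLA): an endpoint that builds, type-checks and returns correct responses, yet never verifies that the requested object belongs to the caller. This is an illustrative assumption, not code from the thesis; the FastAPI framework, the Invoice model, the current_user_id dependency and the /invoices routes are all hypothetical choices made for this example.

```python
from fastapi import Depends, FastAPI, HTTPException
from pydantic import BaseModel

app = FastAPI()

class Invoice(BaseModel):
    id: int
    owner_id: int
    amount: float

# In-memory stand-in for a database: two invoices owned by two different users.
INVOICES = {
    1: Invoice(id=1, owner_id=100, amount=49.90),
    2: Invoice(id=2, owner_id=200, amount=120.00),
}

def current_user_id() -> int:
    # Placeholder for real authentication; assume the request token maps to user 100.
    return 100

# Vulnerable pattern: the handler builds and returns valid JSON, yet any
# authenticated user can fetch any invoice by guessing its id (BOLA).
@app.get("/invoices/{invoice_id}")
def get_invoice(invoice_id: int, user_id: int = Depends(current_user_id)) -> Invoice:
    invoice = INVOICES.get(invoice_id)
    if invoice is None:
        raise HTTPException(status_code=404, detail="Not found")
    return invoice  # missing ownership check: user 100 can read user 200's invoice

# Hardened variant: object-level authorization enforced on every lookup.
@app.get("/secure/invoices/{invoice_id}")
def get_invoice_secure(invoice_id: int, user_id: int = Depends(current_user_id)) -> Invoice:
    invoice = INVOICES.get(invoice_id)
    if invoice is None or invoice.owner_id != user_id:
        # Same 404 for "missing" and "not yours", so object existence is not leaked.
        raise HTTPException(status_code=404, detail="Not found")
    return invoice
```

In this sketch, a request for /invoices/2 as user 100 succeeds in the vulnerable route but returns 404 in the hardened one. Nothing in the vulnerable handler is syntactically or configurationally wrong, which is why SAST tools and build success tend to miss this class of flaw, mirroring the abstract's point that functional correctness is a poor proxy for secure logic.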