A COMPARATIVE STUDY OF OPTIMIZATION TECHNIQUES FOR LARGE LANGUAGE MODELS: THE CASE OF AUTOMATED THREAT ANALYSIS IN OPEN RAN SECURITY
Pflaum, Dominik (2025-11-04)
This publication is subject to copyright regulations. The work may be read and printed for personal use. Commercial use is prohibited.
Closed access
The permanent address of this publication is:
https://urn.fi/URN:NBN:fi-fe20251210117065
Abstract
Amid the hype surrounding Large Language Models (LLMs), their practical application in specialized domains, for instance within companies, requires careful optimization. This study systematically compares optimization strategies for LLMs to determine their effectiveness and trade-offs. To ground this comparison, the study uses the automated threat-to-Common Attack Pattern Enumeration and Classification (CAPEC) mapping task within the Open RAN Continuous Security Analysis (ORCA) pipeline as a real-world testbed. Following a Design Science Research (DSR) paradigm, this study develops and evaluates an innovative artifact: an LLM-based threat-to-CAPEC mapper. The implementation systematically compares the performance of different LLM optimization techniques, primarily prompt engineering (zero-shot, few-shot, chain-of-thought) and Retrieval-Augmented Generation (RAG). Experiments were conducted using various sizes of locally run, open-source LLMs, benchmarked against the sentence-transformer model in the ORCA pipeline. Given the absence of a ground-truth dataset, the evaluation focuses on factual validity, operational performance, and a comparative analysis of the resulting mappings. The research reveals that the effectiveness of LLMs depends critically on the optimization technique. RAG was the only technique to produce consistently valid and reasonable results, by grounding the LLM in an external knowledge base. The RAG-optimized LLM mappings diverged significantly from those of the sentence-transformer model, indicating a fundamentally different, reasoning-based analysis. The most significant finding is the RAG-optimized model's ability to provide a natural-language rationale for its decisions, transforming the output from an opaque suggestion into a transparent and auditable conclusion. However, challenges in achieving determinism limit the operational reliability of the artifact in an automated pipeline.
The primary practical implication is the creation of a component for an artificial intelligence (AI)-augmented workflow that enhances, rather than replaces, the human security expert. By providing an explainable, data-driven starting point, the artifact directly addresses the problem of subjectivity in threat analysis and improves the consistency and auditability of the process. This thesis provides a tangible case study for responsibly integrating reasoning-based AI into security-critical systems, demonstrating that factual grounding via RAG is essential for unlocking the practical utility of LLMs in specialized domains like Open Radio Access Network (O-RAN) security. Challenges remain, particularly for operationally reliable deployment, where reproducibility of the output is critical.
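The grounding step the abstract describes, retrieving CAPEC entries to constrain the LLM's mapping, can be sketched as follows. This is a minimal illustration only: the CAPEC excerpts, the token-overlap retrieval, and the prompt wording are assumptions for demonstration, not the thesis's actual implementation, which uses the full CAPEC catalog and a locally run LLM.

```python
# Minimal sketch of RAG-style grounding for threat-to-CAPEC mapping.
# The knowledge base below is a hypothetical three-entry excerpt; a real
# pipeline would index the full CAPEC catalog and pass the prompt to an LLM.

CAPEC_KB = {
    "CAPEC-94": "Adversary in the middle: intercept and alter communication between two components.",
    "CAPEC-125": "Flooding: overwhelm a target with excessive requests to exhaust resources.",
    "CAPEC-151": "Identity spoofing: assume the identity of a trusted entity.",
}

def retrieve(threat: str, kb: dict, k: int = 2) -> list:
    """Rank CAPEC entries by naive token overlap with the threat description.

    Stand-in for the embedding-based retrieval a real RAG system would use.
    """
    threat_tokens = set(threat.lower().split())
    scored = sorted(
        ((len(threat_tokens & set(desc.lower().split())), cid)
         for cid, desc in kb.items()),
        reverse=True,
    )
    return [cid for score, cid in scored[:k] if score > 0]

def build_prompt(threat: str, kb: dict) -> str:
    """Assemble a grounded prompt: retrieved CAPEC context plus the mapping task."""
    context = "\n".join(f"{cid}: {kb[cid]}" for cid in retrieve(threat, kb))
    return (
        "Using ONLY the CAPEC entries below, map the threat to the best-fitting "
        "pattern and justify the choice.\n\n"
        f"CAPEC candidates:\n{context}\n\n"
        f"Threat: {threat}\nAnswer:"
    )

threat = "An attacker floods the O-RAN interface with requests to exhaust resources."
print(build_prompt(threat, CAPEC_KB))
```

Confining the model to retrieved catalog entries is what the abstract calls factual grounding: the LLM can only choose among (and must justify against) real CAPEC patterns, which is what makes the resulting rationale auditable.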