Enhancing Explainability and Performance in Intrusion Detection Systems using Deep Learning Models and LLMs
Ahmed, Mohd (2025-07-31)
This publication is subject to copyright. The work may be read and printed for personal use. Commercial use is prohibited.
Open access
The permanent address of the publication is:
https://urn.fi/URN:NBN:fi-fe2025080881613
Abstract
The evolving cyber-security landscape underscores the need for effective and transparent Intrusion Detection Systems (IDS), which play a critical role in protecting computer networks from malicious activity. However, many advanced Machine Learning (ML) models used in IDS are difficult to interpret, which limits their reliability and practical deployment. This research aims to improve both the detection performance and the explainability of IDS by combining powerful tabular Deep Learning (DL) models with open-source Large Language Models (LLMs).
In this study, the CSE-CIC-IDS2018 dataset serves as the benchmark for training and evaluating several ML models: TabNet, a DL model designed specifically for tabular data, and several AutoGluon-based models, including a neural network implemented in PyTorch (NN_TORCH), Gradient Boosting Machine (GBM), Categorical Boosting (CatBoost), and Extreme Gradient Boosting (XGBoost). The models are evaluated on their ability to detect different kinds of network intrusions reliably.
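The per-class evaluation described above can be sketched as follows. This is a minimal illustration assuming predictions and ground-truth labels are already available as lists; the attack-class names are illustrative placeholders, not the dataset's exact label set.

```python
from collections import defaultdict

def per_class_recall(y_true, y_pred):
    """Compute detection rate (recall) per attack class from label lists."""
    hits = defaultdict(int)
    totals = defaultdict(int)
    for true_label, pred_label in zip(y_true, y_pred):
        totals[true_label] += 1
        if pred_label == true_label:
            hits[true_label] += 1
    return {cls: hits[cls] / totals[cls] for cls in totals}

# Illustrative labels only -- not the dataset's exact class names.
y_true = ["Benign", "Benign", "DoS", "DoS", "BruteForce"]
y_pred = ["Benign", "DoS", "DoS", "DoS", "Benign"]
recall = per_class_recall(y_true, y_pred)
print(recall)  # {'Benign': 0.5, 'DoS': 1.0, 'BruteForce': 0.0}
```

Reporting recall per class, rather than overall accuracy alone, is what exposes the uneven performance on rare attack classes noted in the results.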
After making predictions, the model outputs are passed to open-source LLMs, which generate natural-language explanations. This step is intended to make the models' decision-making process more understandable to humans, including security analysts and system administrators. By integrating LLMs into the IDS pipeline, the system not only identifies threats effectively but also explains the reasoning behind each prediction in a human-readable format.
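The hand-off from model output to LLM can be sketched as a prompt-construction step. This is a hypothetical sketch: the feature names, importance scores, and prompt wording are assumptions for illustration, and the actual LLM call (e.g. to a locally hosted open-source model) is noted in a comment rather than implemented.

```python
def build_explanation_prompt(flow_features, predicted_label, top_features):
    """Format a model prediction and its most influential features into a
    natural-language prompt for an open-source LLM."""
    feature_lines = "\n".join(
        f"- {name}: {flow_features[name]} (importance {score:.2f})"
        for name, score in top_features
    )
    return (
        f"An intrusion detection model classified a network flow as "
        f"'{predicted_label}'.\n"
        f"The most influential features were:\n{feature_lines}\n"
        "Explain in plain language, for a security analyst, why these "
        "feature values are consistent with this classification."
    )

# Hypothetical feature names and values for illustration.
flow = {"Flow Duration": 120000, "Fwd Packets/s": 8500.0}
prompt = build_explanation_prompt(
    flow, "DoS", [("Fwd Packets/s", 0.41), ("Flow Duration", 0.22)]
)
# `prompt` would then be sent to an open-source LLM (for example via a
# local inference endpoint) to obtain the analyst-facing explanation.
print(prompt)
```

Grounding the prompt in the specific prediction and its top features is what keeps the LLM's explanation tied to the model's actual decision rather than a generic description of the attack type.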
The experimental findings indicate that the proposed method delivers high detection accuracy: AutoGluon ensembles attained up to 98.1% accuracy, while TabNet achieved 97.8%. The approach also provides clear and useful explanations through the LLMs. Although performance was not consistent across all minority attack classes, integrating DL with LLMs significantly increased the system's transparency and utility for analysts, enhancing its practical applicability in cyber-security contexts.