Evaluation of LLM Agents for the SOC Tier 1 Analyst Triage Process
Oniagbi, Openime (2024-06-20)
This publication is subject to copyright regulations. The work may be read and printed for personal use. Commercial use is prohibited.
Open access
The permanent address of this publication is:
https://urn.fi/URN:NBN:fi-fe2024062457864
Abstract
Cyber attacks are continuously growing in volume and complexity, which requires efficient and effective solutions that enable Security Operations Center (SOC) environments to detect and mitigate these threats. This thesis focuses on designing and evaluating the application of agents powered by Large Language Models (LLMs) in the triage process of Tier 1 SOC analysts. The research evaluates five LLMs: OpenAI's GPT-4o and GPT-3.5, Meta Llama 3, Mixtral 8x22B, and OpenHermes 2.5 Mistral 7B. The study investigates the LLM agents' ability to label alerts as either 'interesting' or 'not-interesting'.
The research proposes a framework consisting of an alert generation and processing module, the LLM agent, and a reporting module. The framework is tested in an experiment using the open-source SIEM Wazuh with three connected endpoints. 'Not-interesting' alerts were generated from normal user operations, and 'interesting' alerts from adversary emulation with CALDERA. The results show that LLM agents can significantly improve the speed of initial triage and reduce the cognitive load on human analysts by automating routine tasks and providing natural-language explanations for alerts. However, challenges such as occasional hallucinations, integration complexity, and privacy risks are identified. The research concludes that while LLM agents show promise for the triage process in Tier 1 SOC operations, a hybrid approach that uses the LLM agents as co-pilots is essential to maximize benefits and mitigate risks.
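To make the agent's triage step concrete, the sketch below (illustrative only, not code from the thesis) shows how an LLM agent might label a single Wazuh alert as 'interesting' or 'not-interesting' and explain its decision in natural language; the prompt wording, model choice, and alert fields are assumptions made for this example.

```python
# Minimal sketch of the Tier 1 triage step: an LLM agent labels one
# Wazuh alert and explains its decision in natural language.
# NOTE: illustrative assumptions, not the thesis's implementation --
# the prompt wording, model choice, and alert fields are hypothetical.
import json

from openai import OpenAI  # official OpenAI Python SDK (>= 1.0)

client = OpenAI()  # reads OPENAI_API_KEY from the environment

TRIAGE_PROMPT = (
    "You are a Tier 1 SOC analyst. Label the following Wazuh alert as "
    "'interesting' (needs escalation) or 'not-interesting' (benign or "
    "routine), then briefly explain your reasoning.\n\nAlert:\n{alert}"
)

def triage_alert(alert: dict) -> str:
    """Return the agent's label and natural-language explanation."""
    response = client.chat.completions.create(
        model="gpt-4o",  # one of the five models evaluated in the thesis
        messages=[{
            "role": "user",
            "content": TRIAGE_PROMPT.format(alert=json.dumps(alert, indent=2)),
        }],
    )
    return response.choices[0].message.content

# Example: a simplified, Wazuh-style alert (fields are illustrative)
alert = {
    "rule": {"level": 10, "description": "Possible credential dumping"},
    "agent": {"name": "endpoint-01"},
    "data": {"process": "lsass.exe accessed by rundll32.exe"},
}
print(triage_alert(alert))
```

In the framework described above, such a call would sit between the alert generation and processing module (which supplies the alert) and the reporting module (which records the label and explanation for the human analyst).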
This research contributes to the growing body of knowledge on artificial intelligence in cybersecurity, providing practical insights for organizations considering integrating LLMs into their SOC workflows. Recommendations for future work include testing and training LLMs on larger datasets of alerts and investigating prompt-optimization techniques to get the best results from LLM agents.