In recent years, large language models (LLMs) have advanced considerably and now match or exceed human performance in a number of areas. These models, including GPT-4, can be used in both beneficial and malicious contexts. Academic researchers and practitioners have begun to investigate the potential of LLM agents to exploit cybersecurity vulnerabilities, but previous studies were limited to simple vulnerabilities. The paper "LLM Agents can Autonomously Exploit One-Day Vulnerabilities" by Richard Fang, Rohan Bindu, Akul Gupta, and Daniel Kang investigates whether LLM agents can autonomously exploit real, complex vulnerabilities, in particular so-called one-day vulnerabilities: vulnerabilities that have been publicly disclosed but not yet patched on the target system.
Computer security and LLM agents
Computer security is an extensive field of research that covers the protection of computer systems against unwanted actions by attackers. These actions include gaining root access to servers, executing arbitrary remote code, and exfiltrating private data. Attackers use a variety of methods, from simple SQL injections to highly complex attacks that chain multiple vulnerabilities. The statistical reports of Germany's Federal Office for Information Security (BSI), for example, regularly provide key figures and basic quantitative information on the state of cybersecurity in Germany.
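The SQL injection mentioned above is worth a quick illustration. The sketch below (a minimal, self-contained example using an in-memory SQLite database; the table and functions are hypothetical, not taken from the paper) shows how string interpolation lets attacker-controlled input rewrite a query, and how a parameterized query prevents this:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE users (name TEXT, secret TEXT)")
conn.execute("INSERT INTO users VALUES ('alice', 's3cret')")

def find_user_vulnerable(name):
    # String interpolation: attacker-controlled input becomes part of the SQL.
    return conn.execute(
        f"SELECT name FROM users WHERE name = '{name}'"
    ).fetchall()

def find_user_safe(name):
    # Parameterized query: the driver treats the input strictly as data.
    return conn.execute(
        "SELECT name FROM users WHERE name = ?", (name,)
    ).fetchall()

payload = "' OR '1'='1"
print(find_user_vulnerable(payload))  # returns every row despite no matching name
print(find_user_safe(payload))        # returns [] -- the payload matches nothing
```

The classic payload `' OR '1'='1` turns the WHERE clause into a tautology in the vulnerable version, leaking all rows.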
LLM agents are specialized in using tools and reacting to their outputs. Beyond calling tools, these agents can perform complex tasks such as planning actions, spawning subagents, and reading documents. The study shows that agents built on GPT-4 are significantly more capable than those built on earlier models and are able to successfully exploit real vulnerabilities.
Benchmark for real vulnerabilities
In order to test the ability of LLM agents to exploit real computer systems, a benchmark consisting of real vulnerabilities was created. These vulnerabilities were taken from the Common Vulnerabilities and Exposures (CVE) database and from academic papers. CVE is a system operated by the US National Cybersecurity FFRDC (a federally funded research and development center) and maintained by the MITRE Corporation for the standardized identification and naming of publicly known security vulnerabilities and other weaknesses in IT systems. The aim of the CVE system is to prevent different companies and institutions from assigning multiple names to the same vulnerability.
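Every CVE entry is keyed by an identifier of the form CVE-&lt;year&gt;-&lt;sequence number&gt;, where the sequence number has at least four digits. A small sketch (my own illustration, not from the paper) of validating that format:

```python
import re

# MITRE CVE identifier format: CVE-<4-digit year>-<sequence of 4+ digits>.
CVE_RE = re.compile(r"^CVE-\d{4}-\d{4,}$")

def is_valid_cve_id(cve_id: str) -> bool:
    """Check whether a string is a syntactically valid CVE identifier."""
    return bool(CVE_RE.match(cve_id))

print(is_valid_cve_id("CVE-2024-21626"))  # True
print(is_valid_cve_id("CVE-24-1"))        # False: year and sequence too short
```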
The selection focused on open-source software, as many vulnerabilities in closed-source software had to be excluded due to a lack of reproducibility. The benchmark created covers various categories of vulnerabilities, including websites, container management software and vulnerable Python packages.
The benchmark included a total of 15 vulnerabilities, some of which were classified as critical. A significant proportion of them were disclosed after GPT-4's knowledge cut-off date, so the model could not simply reproduce exploits seen during training, which increased the challenge.
Components of the LLM agent
The LLM agent developed in this study consists of several components:
Base model: The main model was GPT-4.
Prompt: A detailed prompt that encouraged the agent to be creative and try different approaches.
Agent framework: Implemented with the ReAct framework.
Various tools: The agent had access to various tools such as web browsing elements, a terminal, web search results, file creation and editing, and a code interpreter.
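The interaction of these components can be sketched as a minimal ReAct-style loop: the model alternates between reasoning about the next step and invoking a tool, feeding each observation back into the context. Everything below (the `llm` callable, the tool dictionary, the text protocol for actions) is a hypothetical simplification for illustration, not the paper's actual implementation:

```python
def react_loop(llm, tools, task, max_steps=20):
    """Minimal ReAct-style agent loop.

    llm:   callable taking the conversation so far and returning the model's reply,
           assumed to emit either "Action: <tool>: <input>" or "Final Answer: ...".
    tools: dict mapping tool names (e.g. "terminal") to callables.
    """
    history = [f"Task: {task}"]
    for _ in range(max_steps):
        reply = llm("\n".join(history))  # model proposes the next action
        history.append(reply)
        if reply.startswith("Final Answer:"):
            return reply
        # Parse "Action: <tool>: <input>" and execute the chosen tool.
        _, tool_name, tool_input = reply.split(": ", 2)
        observation = tools[tool_name](tool_input)
        history.append(f"Observation: {observation}")
    return "Stopped: step limit reached"
```

The real agent wires the same loop to a terminal, a web browser, a code interpreter, and file tools, and lets the framework handle the parsing; the point here is only the alternation of model output and tool feedback.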
The result is both impressive and concerning from a risk management perspective: the agent successfully exploited 87 percent of the collected vulnerabilities, highlighting how simple and effective such an agent can be. In total, the agent comprised only 91 lines of code, including debugging and logging statements.
GPT-4 showed the best performance
The analyses were carried out with various models and open-source security scanners. GPT-4 showed the best performance with a success rate of 87 percent, while all other models tested (including GPT-3.5 and eight open-source models) as well as security scanners such as ZAP and Metasploit failed to exploit a single vulnerability. Without the vulnerability descriptions, GPT-4's success rate dropped to 7 percent, underlining the importance of the CVE descriptions.
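Success rates of this kind are typically computed over repeated attempts per vulnerability, counting a vulnerability as exploited if any attempt succeeds. A small sketch of that aggregation (the run data below is invented for illustration; it is not the paper's results table):

```python
def success_rate(results):
    """results: dict mapping vulnerability id -> list of per-attempt booleans.
    A vulnerability counts as exploited if any attempt succeeded."""
    exploited = sum(1 for attempts in results.values() if any(attempts))
    return exploited / len(results)

# Invented example data: two vulnerabilities, five attempts each.
runs = {
    "CVE-A": [False, True, False, False, False],  # succeeded on attempt 2
    "CVE-B": [False] * 5,                         # never succeeded
}
print(f"{success_rate(runs):.0%}")  # prints "50%"
```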
Qualitative analysis showed that GPT-4 was able to exploit a variety of vulnerabilities, including those in websites, Python packages and container management software. The ability to detect and exploit vulnerabilities was significantly improved through the integration of tools and structured attack planning.
Cost evaluation for the use of GPT-4
The cost of using GPT-4 to exploit real-world vulnerabilities was also evaluated. Notably, the number of actions the agent performed to exploit a vulnerability differed by only 14 percent on average between runs with and without the CVE description. This suggests that improved planning and the use of subagents could further improve performance.
Subagent system could improve performance
Detailed examination of the GPT-4 agent's behavior revealed that many vulnerabilities required a large number of actions to exploit successfully. For example, a vulnerability in a WordPress plugin required an average of 48.6 steps per attempt, with one successful attack involving as many as 100 steps. Another case, which combined a cross-site request forgery (CSRF) attack with arbitrary code execution (ACE), showed that without the CVE description the agent had difficulty choosing the correct attack type. A subagent system could improve performance here.
Vulnerabilities without CVE descriptions
After the CVE descriptions were removed, the success rate dropped drastically to 7 percent. This shows that identifying vulnerabilities is significantly more difficult than exploiting them. Despite this challenge, GPT-4 identified the correct vulnerability in 33.3 percent of cases, but successfully exploited only one of them. Examination of the number of actions performed showed only minimal differences, again indicating the potential of improved planning and the use of subagents.
Conclusion and outlook
The results of this study show that LLM agents such as GPT-4 can autonomously exploit real, complex vulnerabilities. This represents both an opportunity and a risk. The widespread use of such technologies requires careful consideration and measures to ensure that they are not misused. The study emphasizes the need for further research to improve agent capabilities and to develop secure systems.
The results underline the need to ensure the security of LLM agents and to carefully monitor their use in cybersecurity. The study provides highly relevant insights for the development of secure LLM systems and defense against potential cybersecurity threats.
Literature
Fang, R., Bindu, R., Gupta, A., & Kang, D. (2024). LLM Agents can Autonomously Exploit One-Day Vulnerabilities. arXiv preprint.