Abstract
Advances in AI are extending the capabilities of tools for penetration testing. However, due to a fragmented market and rapid technical developments, the capabilities and maturity of available tools are not well understood. This short paper provides an overview of this area by reviewing recent academic literature and proposing several assessment criteria. The literature review identifies an active and growing research field, with numerous advancements in the past 18 months. A focus on LLM agents has advanced capabilities towards exploiting zero-day vulnerabilities. Frontier models show superior performance when combined with a focused knowledge base and multi-agent architectures. However, in most cases human involvement is still required, and fully autonomous solutions are not yet evident. To evaluate the maturity of tools, 12 assessment criteria within four broad categories are proposed: AI sophistication, action capabilities, features, and requirements. Maturity levels are proposed for each criterion, enabling objective benchmarking of tool capabilities.
| Original language | English |
|---|---|
| Title of host publication | Computer Security. ESORICS 2025 International Workshops |
| Subtitle of host publication | ANUBIS 2025, SSECAI 2025, SecAssure 2025, STMUS 2025, Toulouse, France, September 22–24, 2025, Revised Selected Papers, Part II |
| Editors | Romain Laborde, Joaquin Garcia-Alfaro, Gregory Blanc, Pierre-François Gimenez, Harsha Kalutarage, Naoto Yanai, Ankur Shukla, Sandeep Pirbhulal, Joachim Posegga, Kwok-Yan Lam |
| Place of Publication | Cham |
| Publisher | Springer |
| Pages | 296-305 |
| Number of pages | 10 |
| Volume | Part II |
| ISBN (Electronic) | 9783032160928 |
| ISBN (Print) | 9783032160911 |
| DOIs | |
| Publication status | Published - 1 May 2026 |
| Event | Workshop on Security and Artificial Intelligence, Toulouse, France. Duration: 25 Sept 2025 → 26 Sept 2025. https://sites.google.com/view/secai2025/home |
Publication series
| Name | Lecture Notes in Computer Science (LNCS) |
|---|---|
| Publisher | Springer |
| Volume | 16232 |
| ISSN (Print) | 0302-9743 |
| ISSN (Electronic) | 1611-3349 |
Workshop
| Workshop | Workshop on Security and Artificial Intelligence |
|---|---|
| Abbreviated title | SECAI 2025 |
| Country/Territory | France |
| City | Toulouse |
| Period | 25/09/25 → 26/09/25 |
| Internet address | |
Keywords
- AI
- Penetration testing
- Assessment criteria
Paper title: Short paper: Evaluating the capabilities of AI-based penetration testing tools