On the Cybersecurity of AI Agents

Dmitry Namiot, Eugene Ilyushin

Abstract


AI agents (artificial intelligence agents) are, in the most general definition, autonomously operating systems that use artificial intelligence methods to achieve their goals. They can be described as tools for automating decision-making with artificial intelligence. The popularity of this direction is owed to the rise of large language models (LLMs). In one of the most widely used patterns, an agent application acts as a coordinator (orchestrator) that fulfills user requests by calling an LLM. Large language models are susceptible to adversarial attacks, and the known risks for generative models number in the hundreds; when such models are combined into aggregate solutions, the cybersecurity problems can only intensify. The mitigations proposed today can be regarded only as steps toward reducing these risks, with no guarantee of a complete solution. This article is the first in a series of works devoted to the security of AI agents.
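To make the coordinator pattern concrete, below is a minimal Python sketch. It is not taken from the article: the call_llm stub, the tool names, and the JSON action format are illustrative assumptions. It shows the plan-act-observe loop in which an LLM selects tools on behalf of the user, together with two simple guardrails (a tool whitelist and a bounded step budget) of the partial, non-guaranteeing kind the abstract describes.

import json
from typing import Callable, Dict

# Tool registry: the coordinator only executes tools from this whitelist,
# one of the partial mitigations against agent hijacking.
TOOLS: Dict[str, Callable[[str], str]] = {
    "search": lambda query: f"(stub) search results for {query!r}",
    "echo": lambda text: text,
}

def call_llm(prompt: str) -> str:
    # Hypothetical model call, stubbed so the sketch runs standalone.
    # A real coordinator would query an LLM API and expect JSON such as
    # {"tool": "search", "input": "..."} or {"final": "..."}.
    return json.dumps({"final": "stub answer for: " + prompt[:60]})

def run_agent(user_request: str, max_steps: int = 5) -> str:
    # Plan-act-observe loop: the LLM decides the next action, the
    # coordinator executes it and feeds the observation back. The step
    # budget keeps a confused or hijacked model from looping forever.
    context = user_request
    for _ in range(max_steps):
        decision = json.loads(call_llm(context))
        if "final" in decision:                 # model signals completion
            return decision["final"]
        tool = TOOLS.get(decision.get("tool", ""))
        if tool is None:                        # reject non-whitelisted tools
            return "error: model requested an unknown tool"
        observation = tool(decision.get("input", ""))
        context += "\nobservation: " + observation
    return "error: step budget exhausted"

if __name__ == "__main__":
    print(run_agent("Summarize the risks of LLM agents."))

Even with the whitelist and the step budget, the loop inherits every weakness of the underlying model: a prompt-injected observation can still steer the model's next decision, which is why such measures reduce risk rather than eliminate it.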

Full Text: PDF (Russian)






ISSN: 2307-8162