TAMAS: Benchmarking Adversarial Risks in Multi-Agent LLM Systems
Published 7 Nov 2025 · arXiv · Ishan Kavathekar
Overview
TAMAS is a benchmark for evaluating adversarial risks in multi-agent Large Language Model (LLM) systems. It exposes significant vulnerabilities in these systems under adversarial attack and underscores the need for stronger security measures.
Key Insights
- Vulnerability of Multi-Agent Systems: Multi-agent LLM systems are highly susceptible to adversarial attacks, necessitating stronger defenses.
- Benchmark Details: TAMAS includes 300 adversarial instances across six attack types and 211 tools, tested on ten LLMs.
- Effective Robustness Score (ERS): Introduced to assess the tradeoff between safety and task effectiveness.
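The summary does not spell out how ERS is computed, but the idea of a single score that trades off safety against task effectiveness can be illustrated with a simple sketch. The function name, the harmonic-mean combination, and the input values below are assumptions for illustration only, not the paper's actual ERS definition:

```python
# Illustrative sketch only: this is NOT the paper's ERS formula, which is
# not given in this summary. A harmonic mean is one common way to combine
# two rates so that imbalance is penalized.

def effective_robustness(safety: float, effectiveness: float) -> float:
    """Harmonic mean of a safety rate and a task-effectiveness rate,
    both assumed to lie in [0, 1].

    A harmonic mean drives the score toward zero when either input is
    near zero: a system that is safe but useless, or capable but unsafe,
    scores poorly.
    """
    if safety <= 0 or effectiveness <= 0:
        return 0.0
    return 2 * safety * effectiveness / (safety + effectiveness)

# Hypothetical example: a system that deflects 95% of attacks but
# completes only 60% of benign tasks.
print(round(effective_robustness(0.95, 0.60), 3))  # → 0.735
```

The design point is that an averaged tradeoff score of this kind rewards systems that are both safe and useful, rather than letting a high refusal rate mask poor task performance.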
BFSI Relevance
- Why Relevant: Multi-agent LLM systems are increasingly used in BFSI for tasks like fraud detection and customer service automation.
- Primary Sector: Financial Services
- Subsectors: Asset Management, Fraud Detection
- Actionable Implications: BFSI professionals should prioritize enhancing the security of AI systems to mitigate risks from adversarial attacks.