Anita Srinivasan, LL.M. Class of 2026
AI systems no longer work alone. The emerging architecture for AI deployment, termed “multi-agent systems”, uses a primary AI agent that receives a user’s request, breaks it into subtasks, and delegates those subtasks to specialized AI agents, often built by entirely different companies. Google’s November 2025 whitepaper on agent architecture describes these multi-agent systems as operating with a “team of specialists” approach. OpenAI’s Agents SDK and Google’s Agent2Agent (A2A) protocol now provide the infrastructure for this kind of cross-provider composition. This blog post argues that the legal frameworks currently being built for AI agent liability assume a single-agent model that multi-agent deployment has already outgrown, and that policymakers should mandate traceability infrastructure before the inevitable harms force courts to improvise.
Current discourse on AI liability assumes a straightforward chain of command in which a developer builds an AI system, a deployer integrates it, a user directs it, and when harm occurs, liability attaches to one or more of these three actors. California’s AB 316, effective January 1, 2026, codifies this model by prohibiting defendants who “developed, modified, or used” an AI system from asserting that the AI autonomously caused the harm. The EU AI Act similarly structures its obligations around a provider-deployer framework that, as recent analysis has noted, leaves gaps when applied to autonomous agents whose runtime behavior was not contemplated in the original system design. Recent scholarship applying traditional agency law principles (i.e., authority, ratification, apparent authority) to AI agents again models a single principal directing a single agent. While these frameworks represent important progress, they share the common assumption that a human authorized a specific AI system to act, and that the system’s developer can be identified.
Multi-agent architectures break both assumptions. When a coordinator agent autonomously selects and delegates to specialist agents across provider boundaries, the delegation itself is an emergent runtime decision: no human chose the specific combination of agents that executed the task. If Agent A (built by Company X) hands off to Agent B (built by Company Y), which calls Agent C (built by Company Z), and harm emerges from their interaction rather than from any single agent’s output, several doctrinal problems arise.
(1) Product liability’s component parts doctrine assumes that components are static and predetermined. However, AI agents are not fixed parts in an assembly; typically, they are selected dynamically by other AI agents at runtime. A court applying the component parts doctrine would need to determine whether the orchestrating agent’s developer is the “assembler” of a product that includes third-party components – even though the developer never specified which components would be used and the assembly happened autonomously, in real time.
(2) Respondeat superior requires identifying a principal who authorized the agent’s specific actions. When delegation chains cross provider boundaries autonomously, the authorization chain breaks. AB 316 forecloses the defense that “the AI did it”, but when three different companies’ systems interact, it does not specify which company is barred from invoking that defense. The EU AI Act faces a parallel problem: when an agent autonomously invokes a tool from another provider at runtime, liability disperses among model providers, system providers, deployers, and tool providers, with no single actor having full visibility over the agent’s decision chain. A deployer may not even know which downstream agents were invoked on its behalf.
(3) Joint tortfeasor frameworks could theoretically apply, but identifying each developer’s causal contribution to an emergent harm requires interaction-level traceability that most multi-agent systems do not currently provide. Without a record of what instructions and data passed between agents at each handoff, a plaintiff faces the near-impossible task of establishing which agent in a delegation chain caused the harm, and a court has no evidentiary basis for apportioning fault.
This traceability gap is the crux of the problem, and it distinguishes multi-agent AI from other complex liability scenarios. In domains involving interconnected autonomous systems (such as automated trading, decentralized finance protocols, and IoT device networks), attribution failures after cascading harms have repeatedly demonstrated that accountability requires built-in traceability at the infrastructure level, not post-hoc reconstruction. Multi-agent AI systems currently lack this. Agent-to-agent interactions are typically opaque, unlogged, and difficult to reconstruct after the fact.
Some of the infrastructure needed for such traceability is already taking shape. NIST’s AI Agent Standards Initiative, announced in February 2026, has identified agent identity and authentication as a core research priority, and its concept paper on AI agent identity and authorization lays out how existing identity management standards could be adapted for multi-agent environments. On the logging side, the OpenTelemetry project has published semantic conventions for agent observability that standardize how agent actions, tool calls, and decision points are traced across execution chains, which is essentially the distributed tracing infrastructure that liability analysis would require. Anthropic’s Model Context Protocol and Google’s A2A protocol already define how agents communicate across provider boundaries, and the W3C AI Agent Protocol Community Group is working toward open interoperability standards for agent identity and discovery. Singapore’s Model AI Governance Framework for Agentic AI, launched in January 2026 as the first government framework specifically targeting agentic systems, signals that regulators are beginning to take notice. However, without agent identity as a foundational layer, none of these traceability mechanisms can connect a harmful action to the developer responsible for it. What remains missing is a legal mandate to deploy this infrastructure.
Policymakers could consider three targeted interventions that address traceability, identity, and liability in turn.
(a) Mandatory interaction logging at every agent-to-agent handoff. Regulators should require that each handoff records which agent acted, what instructions it received, what outputs it produced, and which developer built it. This creates the evidentiary foundation for joint tortfeasor analysis.
(b) Standardized agent identity across multi-agent deployments. Each agent in a delegation chain should be attributable to a specific developer through a verifiable identity standard. Without this, even perfect logging cannot connect a harmful action to the party responsible for the agent that took it.
(c) Explicit liability allocation rules for cross-provider composition. Legislators extending AB 316’s logic should clarify how liability allocates across multiple developers whose agents composed without direct human authorization – whether through joint and several liability, proportional fault, or a rebuttable presumption that the orchestrating agent’s developer bears primary responsibility.
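To make interventions (a) and (b) concrete, the following is a minimal Python sketch of what a handoff record might contain. It is an illustration under stated assumptions, not a description of any existing standard or product: the names HandoffRecord and HandoffLog, the field names, and the example agents are all hypothetical. The hash chaining is an added assumption that goes beyond what the proposals above require; it is included only to show one way a log could be made tamper-evident for evidentiary use. The developer field is a plain string here, whereas a real deployment would need the verifiable identity standard described in (b).

```python
import hashlib
import json
from dataclasses import dataclass, asdict


@dataclass(frozen=True)
class HandoffRecord:
    """One agent-to-agent handoff (hypothetical schema)."""
    sequence: int            # position of this handoff in the chain
    from_agent: str          # agent that delegated the subtask
    to_agent: str            # agent that acted on the subtask
    to_agent_developer: str  # developer that built the acting agent
    instructions: str        # what the acting agent was told to do
    output: str              # what the acting agent produced
    prev_hash: str           # digest of the previous record

    def digest(self) -> str:
        # Hash a canonical JSON form so the digest is reproducible.
        payload = json.dumps(asdict(self), sort_keys=True).encode("utf-8")
        return hashlib.sha256(payload).hexdigest()


class HandoffLog:
    """Append-only, hash-chained list of handoff records (illustrative)."""
    GENESIS = "0" * 64  # placeholder hash for the first record

    def __init__(self) -> None:
        self._records: list[HandoffRecord] = []

    def record(self, from_agent: str, to_agent: str, to_agent_developer: str,
               instructions: str, output: str) -> HandoffRecord:
        prev = self._records[-1].digest() if self._records else self.GENESIS
        rec = HandoffRecord(len(self._records), from_agent, to_agent,
                            to_agent_developer, instructions, output, prev)
        self._records.append(rec)
        return rec

    def verify(self) -> bool:
        # Each record must point at the digest of its predecessor, so
        # altering any record that has a successor breaks the chain.
        prev = self.GENESIS
        for rec in self._records:
            if rec.prev_hash != prev:
                return False
            prev = rec.digest()
        return True

    def developers_in_chain(self) -> list[str]:
        # Developers whose agents acted, in order of first appearance.
        seen: list[str] = []
        for rec in self._records:
            if rec.to_agent_developer not in seen:
                seen.append(rec.to_agent_developer)
        return seen


# Example: Agent A (Company X) delegates to B (Company Y), which calls C (Company Z).
log = HandoffLog()
log.record("agent-a", "agent-b", "Company Y", "summarize contract", "summary text")
log.record("agent-b", "agent-c", "Company Z", "check clause 4", "clause flagged")
```

A log of this kind would give a court the list of developers whose agents participated and the instructions and outputs at each step, which is the raw material that joint tortfeasor analysis needs. It does not by itself answer the allocation question in (c), and a chain like this only detects changes to the final record if the latest digest is also stored somewhere outside the log.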
The single-agent liability model establishes the important principle that humans cannot disclaim responsibility by pointing to AI autonomy. Multi-agent AI systems, however, need more than principles. They need logging at every handoff, identity standards that connect agents to developers, and allocation rules that tell courts how to apportion fault when no single human authorized the chain of delegation that caused the harm.