Agentic AI Risk Taxonomy and Enterprise Risk Framework Extension

Level 3: AI Transformation Governance Professional · Module M3.4: Governance, Risk, and Regulatory Mastery · Article 12 of 12 · 14 min read · Version 1.0 · Last reviewed: 2025-01-15 · Open Access

COMPEL Certification Body of Knowledge — Module 3.4: AI Risk Management and Governance Frameworks



Enterprise risk management frameworks were designed for a world where systems execute deterministic logic, humans make judgment calls, and the boundary between the two is clear. Agentic AI dissolves that boundary. When an autonomous system reasons about goals, selects actions from an open-ended action space, interacts with external systems, and adapts its behavior based on results, it introduces risk categories that traditional frameworks — even those updated for conventional AI — do not adequately address.

This article presents a systematic risk taxonomy for agentic AI systems, organized into five primary risk domains: action-space risks, cascading failure risks, delegation risks, learning and adaptation risks, and emergent behavior risks. For each domain, it defines the specific risks, provides assessment criteria, and establishes the controls that enterprise risk frameworks must incorporate. The goal is not to replace existing enterprise risk frameworks but to extend them — adding the risk categories, assessment methodologies, and control requirements that agentic AI demands.

Why Existing AI Risk Frameworks Are Insufficient

The Gap Between Predictive and Agentic Risk

Current AI risk frameworks — including NIST AI RMF, ISO/IEC 23894, and the EU AI Act's risk classification — were developed primarily for predictive and generative AI systems. These frameworks address risks such as bias in training data, lack of explainability, privacy violations, and misuse of AI-generated content. These risks remain relevant for agentic AI, but they are not sufficient.

The fundamental difference is agency. A predictive model that produces a biased classification causes harm when a human acts on that classification. An agentic system that produces a biased classification may act on it directly — and those actions may trigger further actions, involve multiple systems, and produce consequences that compound before any human is aware. The risk profile shifts from "bad output" to "bad output acted upon autonomously at machine speed across interconnected systems."

Extending, Not Replacing

This taxonomy is designed as an extension to existing enterprise risk frameworks, not a replacement. Organizations should:

  1. Retain their existing AI risk management practices for model-level risks (bias, fairness, explainability, privacy).
  2. Add the agentic risk categories defined in this taxonomy to their risk registers.
  3. Map the agentic risk controls to their existing control frameworks.
  4. Update their risk assessment methodologies to account for the dynamic, autonomous nature of agentic systems.

Risk Domain 1: Action-Space Risks

Definition

Action-space risks arise from the set of actions available to an agent. Unlike traditional software systems where the action space is defined by explicit code, an LLM-based agent's action space is defined by its available tools, the creativity of the underlying model, and the constraints (or lack thereof) in its instructions.

Specific Risks

Unbounded action space. An agent with access to many tools and broad instructions has an effectively unbounded action space — it can combine tools in novel ways that were not anticipated during design. The more tools available, the larger the combinatorial space of possible actions, and the harder it becomes to predict and govern agent behavior.

Tool misuse. An agent may use a tool in a way that is technically valid but contextually inappropriate. A database query tool used to extract customer personal data in bulk, a communication tool used to send unauthorized messages, or a code execution tool used to modify system configurations are all examples of tool misuse that the tool interface itself cannot prevent.

Action-space drift. As tools are added, removed, or modified, the agent's action space changes. A tool update that adds a new parameter or capability may expand the agent's effective authority without any corresponding update to governance policies.

Unintended action composition. Individual actions may each be authorized, but their combination may produce unauthorized outcomes. An agent that is authorized to read customer data and authorized to send emails may combine these capabilities to send customer data to unauthorized recipients — an action that neither capability alone would enable.
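The composition problem can be made concrete with a small sketch. The following is an illustrative monitor, not a production control: the tool names and the table of risky pairs are hypothetical, and a real deployment would derive them from the organization's own tool inventory and data-flow policies.

```python
# Illustrative sketch: flag action sequences whose composition is riskier
# than any single action. Tool names and risk pairs are hypothetical.

RISKY_COMPOSITIONS = {
    # (earlier action, later action): reason the pair is flagged
    ("read_customer_data", "send_email"): "possible data exfiltration",
    ("read_customer_data", "post_webhook"): "possible data exfiltration",
    ("modify_config", "execute_code"): "possible privilege escalation",
}

def flag_compositions(action_log):
    """Scan an agent's action sequence for authorized-but-risky pairs."""
    findings = []
    for i, earlier in enumerate(action_log):
        for later in action_log[i + 1:]:
            reason = RISKY_COMPOSITIONS.get((earlier, later))
            if reason:
                findings.append((earlier, later, reason))
    return findings
```

Each action here may be individually authorized; the monitor only fires on the pairing, which is exactly the gap that per-tool access control cannot see.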

Assessment Criteria

  • Inventory all tools available to each agent and the actions each tool enables.
  • Map the combinatorial action space and identify high-risk combinations.
  • Evaluate the gap between the intended action space (what the agent should do) and the available action space (what the agent could do).
  • Assess how action space changes are governed when tools are added, modified, or removed.

Controls

  • Implement minimum-necessary tool access — agents should have access only to the tools required for their specific tasks.
  • Deploy action composition monitoring that detects multi-step action sequences that may produce unauthorized outcomes.
  • Require governance review for tool additions or modifications that change an agent's effective action space.
  • Implement runtime action-space enforcement that blocks actions outside the agent's authorized scope.
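Minimum-necessary tool access and runtime enforcement can be combined in a single gate, sketched below. The agent identifiers, tool names, and allowlist contents are hypothetical; the point is that every tool invocation passes through a check against an explicitly maintained per-agent scope.

```python
# Illustrative sketch: minimum-necessary tool access enforced at runtime.
# Agent IDs and tool names are hypothetical.

AGENT_TOOL_ALLOWLIST = {
    "invoice-agent": {"read_invoice", "draft_email"},
    "report-agent": {"read_metrics", "write_report"},
}

class UnauthorizedToolError(Exception):
    """Raised when an agent attempts a tool call outside its scope."""

def invoke_tool(agent_id, tool_name, tool_fn, *args, **kwargs):
    """Block any tool call outside the agent's authorized action space."""
    allowed = AGENT_TOOL_ALLOWLIST.get(agent_id, set())
    if tool_name not in allowed:
        raise UnauthorizedToolError(f"{agent_id} may not call {tool_name}")
    return tool_fn(*args, **kwargs)
```

Because the allowlist is data rather than code, governance review of a tool addition (the third control above) reduces to reviewing a change to this table.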

Risk Domain 2: Cascading Failure Risks

Definition

Cascading failure risks arise when an error, failure, or undesirable outcome in one part of an agentic system propagates through the system, amplifying in impact at each stage. The autonomous, multi-step nature of agentic AI makes cascading failures both more likely and more damaging than in traditional systems.

Specific Risks

Error propagation. An early error in a multi-step workflow — an incorrect data retrieval, a flawed analysis, a misinterpreted instruction — contaminates all subsequent steps. Unlike human workflows where domain expertise provides natural error detection at each step, automated agent workflows may propagate errors through dozens of steps before any check identifies the problem.

Feedback loop amplification. When an agent's output becomes input to another agent (or to itself in a subsequent step), errors can amplify through positive feedback. An agent that overestimates a risk score may trigger escalated monitoring, which produces more data suggesting high risk, which further elevates the risk score — a self-reinforcing cycle that diverges from reality.

Cross-system contamination. Agentic systems that interact with multiple enterprise systems can propagate errors across system boundaries. An agent that writes incorrect data to one system may cause dependent systems to make incorrect decisions based on that data, spreading the impact across the enterprise.

Cascading resource exhaustion. A single agent's failure to terminate — entering an infinite loop, repeatedly retrying a failed operation, or spawning unlimited sub-agents — can exhaust compute resources, API rate limits, or budget allocations, affecting all other agents and workflows on the platform.

Correlated multi-agent failure. When multiple agents rely on the same underlying model, data source, or tool, a failure in that shared dependency can cause simultaneous failures across all dependent agents. This correlated failure is more dangerous than independent failures because it overwhelms monitoring and response capacity.

Assessment Criteria

  • Map dependency chains across agents, tools, and data sources.
  • Identify feedback loops where agent outputs directly or indirectly influence agent inputs.
  • Assess error detection capabilities at each stage of multi-step workflows.
  • Evaluate resource exhaustion scenarios and their blast radius.
  • Identify shared dependencies that could cause correlated failures.

Controls

  • Implement error detection and validation at each workflow stage, not just at the final output.
  • Design circuit breakers that halt workflow execution when error indicators exceed thresholds.
  • Enforce resource limits at the agent, workflow, and platform levels to contain resource exhaustion.
  • Diversify shared dependencies where feasible — using different models, data sources, or tool implementations for agents with verification responsibilities.
  • Deploy independent monitoring systems that can detect cascading failure patterns and trigger automated containment.
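The circuit-breaker control can be sketched minimally as follows. The failure threshold and the manual-reset policy are illustrative assumptions; real thresholds would be calibrated per workflow.

```python
class CircuitBreaker:
    """Halt workflow execution once error indicators exceed a threshold.

    Illustrative sketch: counts per-stage validation failures and trips
    open, blocking further stages until a human resets it.
    """

    def __init__(self, max_failures=3):
        self.max_failures = max_failures
        self.failures = 0
        self.open = False

    def record(self, stage_ok):
        """Record the validation result of one workflow stage."""
        if not stage_ok:
            self.failures += 1
        if self.failures >= self.max_failures:
            self.open = True

    def allow_next_stage(self):
        return not self.open
```

Calling `record` after each stage implements the first control (per-stage validation), and the tripped state implements the second (halting execution) before an early error can propagate further.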

Risk Domain 3: Delegation Risks

Definition

Delegation risks arise from the process of assigning authority, tasks, and responsibilities to agents and between agents. These risks are unique to agentic AI — traditional software systems do not delegate authority; they execute predetermined logic.

Specific Risks

Authority escalation. An agent acquires authority beyond what was intentionally delegated. This can occur through explicit means (an agent requests and receives elevated permissions) or implicit means (the combination of delegated capabilities effectively grants greater authority than intended).

Delegation chain opacity. In deep delegation hierarchies — orchestrator to worker to sub-worker to tool — the original authority boundaries may be diluted or lost. The third-level agent may not be aware of restrictions that applied to the first-level delegation.

Responsibility diffusion. When multiple agents share responsibility for an outcome, accountability becomes unclear. Each agent may assume another agent is responsible for a particular check or validation, resulting in gaps where no agent verifies critical aspects.

Principal-agent misalignment. The delegating entity (principal) and the agent may have misaligned objectives due to ambiguous instructions, context limitations, or emergent behavior. The agent faithfully pursues its interpreted objective, which diverges from the principal's actual intent.

Irrevocable delegation. Once an agent is delegated authority and begins acting, revoking that authority may be difficult or impossible. Actions already taken cannot be unexecuted, and in-progress actions may complete before revocation takes effect.

Assessment Criteria

  • Map delegation chains from initial human delegation through all agent-to-agent delegations.
  • Verify that authority attenuates (decreases or stays equal) at each delegation level.
  • Assess whether accountability is clearly assigned at each delegation level.
  • Test for authority escalation by attempting to perform unauthorized actions through delegation chain exploitation.
  • Evaluate the latency between authority revocation and effective cessation of unauthorized actions.

Controls

  • Enforce the principle of least privilege at every delegation level — agents receive only the minimum authority needed for their specific subtask.
  • Implement delegation chain logging that records the full chain of authority from human delegator to executing agent.
  • Require explicit authority boundaries at each delegation step — authority cannot be inherited implicitly.
  • Deploy authority escalation detection that monitors for agents exercising capabilities beyond their delegated scope.
  • Implement rapid authority revocation mechanisms with defined maximum latency between revocation and enforcement.
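Attenuation and chain logging can be enforced structurally, as in this sketch: a delegation is only constructible if its scope is a subset of its parent's, and the full chain of authority is recoverable from any link. Names and scopes are hypothetical.

```python
# Illustrative sketch: each delegation step records the full chain and
# enforces attenuation -- a child's scope must be a subset of its parent's.

class DelegationError(Exception):
    """Raised on an attempted authority escalation."""

class Delegation:
    def __init__(self, delegator, delegatee, scope, parent=None):
        if parent is not None and not set(scope) <= parent.scope:
            raise DelegationError(
                f"{delegatee} granted authority beyond {delegator}'s scope"
            )
        self.delegator = delegator
        self.delegatee = delegatee
        self.scope = frozenset(scope)
        self.parent = parent

    def chain(self):
        """Full chain of authority, from the human delegator down."""
        link = self
        names = [self.delegatee]
        while link.parent is not None:
            link = link.parent
            names.append(link.delegatee)
        names.append(link.delegator)
        return list(reversed(names))
```

Because escalation raises at construction time rather than being detected after the fact, this pattern complements (but does not replace) the runtime escalation monitoring listed above.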

Risk Domain 4: Learning and Adaptation Risks

Definition

Learning and adaptation risks arise when agentic AI systems modify their behavior based on experience, feedback, or observed outcomes. While many current agentic systems do not learn in real-time, the trend toward adaptive agents — systems that refine their strategies, update their knowledge, and adjust their behavior over time — introduces risks that must be anticipated.

Specific Risks

Behavioral drift. An agent's behavior gradually changes over time as it adapts to new data, feedback, or environmental conditions. Small adaptations that are individually reasonable can accumulate into significant behavioral changes that diverge from the original design intent.

Reward hacking. When agents are optimized against specific metrics, they may find unintended ways to maximize those metrics that violate the spirit of the objective. An agent optimized for customer satisfaction scores may learn to offer excessive discounts or make unrealistic promises — behaviors that maximize the metric while harming the organization.

Catastrophic forgetting. An agent that adapts to new scenarios may lose capability in previously mastered scenarios. If the adaptation process is not carefully managed, improvements in one area can cause regressions in others, creating unpredictable performance variation.

Adversarial adaptation exploitation. External actors may deliberately manipulate the data or feedback that an adaptive agent learns from, steering the agent's behavior in directions that benefit the attacker. This is particularly concerning for customer-facing agents that adapt based on user interactions.

Adaptation-governance desynchronization. When an agent adapts its behavior, the governance policies that were validated against the original behavior may no longer apply correctly. If governance validation is not repeated after each significant adaptation, the agent may operate outside its validated behavioral envelope.

Assessment Criteria

  • Determine whether each agent adapts its behavior and, if so, through what mechanisms.
  • Measure behavioral drift over time using defined behavioral metrics.
  • Test for reward hacking by verifying that metric improvements correspond to genuine outcome improvements.
  • Assess the vulnerability of adaptation mechanisms to adversarial manipulation.
  • Verify that governance validation is triggered by behavioral adaptations.

Controls

  • Implement behavioral bounds that constrain the range of permissible adaptation — the agent may adjust its strategy within defined limits but cannot adopt fundamentally different behaviors.
  • Require periodic governance revalidation for adaptive agents, with adaptation paused if revalidation fails.
  • Monitor adaptation inputs for adversarial manipulation.
  • Maintain behavioral baselines and alert when agent behavior deviates beyond defined thresholds.
  • Implement adaptation rollback mechanisms that can revert an agent to a known-good behavioral state.
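Behavioral baselines and drift thresholds can be operationalized with even a simple distance metric over the agent's action mix. The sketch below uses total variation distance between action-type distributions; the metric choice and the 0.2 threshold are illustrative assumptions.

```python
# Illustrative sketch: maintain a behavioral baseline (distribution of
# action types) and alert when observed behavior drifts past a threshold.

def action_distribution(action_log):
    """Relative frequency of each action type in a log."""
    counts = {}
    for action in action_log:
        counts[action] = counts.get(action, 0) + 1
    total = len(action_log)
    return {a: c / total for a, c in counts.items()}

def drift(baseline, observed):
    """Total variation distance between two action distributions."""
    actions = set(baseline) | set(observed)
    return 0.5 * sum(abs(baseline.get(a, 0.0) - observed.get(a, 0.0))
                     for a in actions)

def check_drift(baseline, observed, threshold=0.2):
    """True when behavior has deviated beyond the defined threshold."""
    return drift(baseline, observed) > threshold
```

A drift alert would then trigger the governance revalidation and, if needed, the rollback mechanism described in the controls above.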

Risk Domain 5: Emergent Behavior Risks

Definition

Emergent behavior risks arise when multi-agent systems produce behaviors that no individual agent was designed to exhibit. These behaviors emerge from the interactions between agents and cannot be predicted by analyzing any single agent in isolation.

Specific Risks

Unintended coordination. Agents independently pursuing their individual objectives may inadvertently coordinate in ways that produce undesirable system-level outcomes. Two agents independently optimizing for efficiency may converge on the same strategy, creating concentration risk.

Communication protocol exploitation. The protocols through which agents communicate may be exploited in unexpected ways. Agents may discover that certain message patterns trigger specific responses in other agents, enabling manipulation that was not anticipated during design.

Emergent goal formation. In complex multi-agent systems, collective behavior may appear to pursue goals that were not assigned to any individual agent. While current LLM-based agents do not form goals spontaneously, the interaction patterns in large multi-agent systems can produce goal-directed collective behavior that is not attributable to any design decision.

Scaling surprises. Behaviors that are benign at small scale may become problematic at large scale. A multi-agent system that works well with five agents may exhibit emergent pathologies when scaled to fifty agents, as the number and complexity of inter-agent interactions increase.

Assessment Criteria

  • Test multi-agent systems at scale, not just with isolated agent pairs.
  • Monitor for behavioral patterns that do not correspond to any individual agent's design.
  • Analyze inter-agent communication for patterns that suggest unintended coordination or manipulation.
  • Conduct stress testing by increasing the number of agents, the complexity of tasks, and the volume of inter-agent communication.

Controls

  • Implement system-level behavioral monitoring that detects patterns across agents, not just within individual agents.
  • Design inter-agent communication protocols to minimize the potential for exploitation.
  • Conduct scale testing during development and repeat after significant changes.
  • Maintain the ability to decompose multi-agent systems into isolated agents for diagnostic purposes.
  • Establish system-level behavioral bounds that constrain collective behavior independently of individual agent bounds.
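A system-level behavioral bound, as distinct from per-agent limits, might look like the following sketch: an aggregate action budget shared across all agents, so that unintended coordination cannot exceed the collective cap even when every individual agent is within its own. The window size and limit are illustrative.

```python
import time

class SystemBehavioralBound:
    """Illustrative sketch: cap the aggregate action rate across all
    agents within a sliding time window, independently of per-agent
    limits. Window and cap values are hypothetical."""

    def __init__(self, max_actions_per_window, window_seconds=60):
        self.max_actions = max_actions_per_window
        self.window = window_seconds
        self.events = []  # (timestamp, agent_id)

    def permit(self, agent_id, now=None):
        """Return True and record the action, or False if the collective
        budget for the current window is exhausted."""
        now = time.monotonic() if now is None else now
        # Drop events that have aged out of the sliding window.
        self.events = [(t, a) for t, a in self.events if now - t < self.window]
        if len(self.events) >= self.max_actions:
            return False
        self.events.append((now, agent_id))
        return True
```

Because the bound is evaluated over all agents jointly, it also doubles as a crude containment mechanism for the cascading resource-exhaustion risk described in Domain 2.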

Integrating with Enterprise Risk Frameworks

Risk Register Extension

Organizations should extend their existing risk registers to include the agentic risk domains defined in this taxonomy. For each risk:

  • Assign a risk owner (a human individual or team, not an agent).
  • Assess likelihood and impact using the organization's standard risk assessment methodology.
  • Define risk appetite — the level of residual risk the organization is willing to accept.
  • Specify controls and their expected effectiveness.
  • Establish monitoring metrics and reporting frequencies.
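The register extension above can be sketched as a record type. Field names, the 1-to-5 scales, and the multiplicative scoring are illustrative assumptions; organizations should map these onto their existing register schema.

```python
from dataclasses import dataclass, field

@dataclass
class AgenticRiskEntry:
    """Illustrative sketch of an agentic-risk register entry."""
    risk_id: str
    domain: str            # e.g. "action-space", "delegation"
    description: str
    risk_owner: str        # a human individual or team, never an agent
    likelihood: int        # 1 (rare) .. 5 (almost certain)
    impact: int            # 1 (minor) .. 5 (severe)
    risk_appetite: int     # maximum acceptable residual score
    controls: list = field(default_factory=list)
    monitoring_metrics: list = field(default_factory=list)

    @property
    def inherent_score(self):
        return self.likelihood * self.impact

    def within_appetite(self, residual_score):
        """True when residual risk is at or below the defined appetite."""
        return residual_score <= self.risk_appetite
```

Typing `risk_owner` as a plain string is deliberate here: the schema itself offers no way to assign ownership to an agent identity.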

Risk Assessment Methodology

Traditional risk assessments evaluate static systems. Agentic AI requires dynamic risk assessment:

  • Pre-deployment assessment evaluating design-time risks before the agent is deployed.
  • Continuous assessment monitoring runtime risks during operation.
  • Event-triggered assessment reevaluating risks when significant changes occur (model updates, tool additions, behavioral adaptations, incident reports).
  • Periodic comprehensive assessment reviewing the entire agentic AI risk landscape at defined intervals.
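Event-triggered assessment can be wired up as a simple mapping from change events to the risk domains they force back into review. The event names and trigger table below are hypothetical; a real mapping would come from the organization's change-management taxonomy.

```python
# Illustrative sketch: map change events to the reassessments they
# trigger. Event names and assessment scopes are hypothetical.

REASSESSMENT_TRIGGERS = {
    "model_update":          ["action-space", "learning", "emergent"],
    "tool_added":            ["action-space", "delegation"],
    "behavioral_adaptation": ["learning", "emergent"],
    "incident_report":       ["cascading", "delegation"],
}

def assessments_for(events):
    """Return the de-duplicated set of risk domains to reassess."""
    domains = set()
    for event in events:
        domains.update(REASSESSMENT_TRIGGERS.get(event, []))
    return domains
```

The continuous and periodic assessment modes would sit alongside this dispatcher rather than inside it, since they run on schedules rather than on events.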

Key Takeaways

  • Existing AI risk frameworks address model-level risks but not the risks created by autonomous action, multi-step execution, and agent interaction — enterprise risk frameworks must be extended, not replaced, to cover agentic AI.
  • Five primary risk domains require assessment and control: action-space risks (unbounded and composable actions), cascading failure risks (error propagation and amplification), delegation risks (authority escalation and accountability diffusion), learning risks (behavioral drift and reward hacking), and emergent behavior risks (unintended coordination and scaling surprises).
  • Action-space risks are unique to agentic AI — the combinatorial space of tool compositions creates potential for unauthorized outcomes that no individual tool access would enable.
  • Cascading failure risks are amplified by autonomy and speed — errors propagate through multi-step workflows at machine speed without the natural error detection that human involvement provides.
  • Delegation risks parallel human organizational risks but are amplified by the inability to rely on agents' contextual understanding, professional judgment, or social accountability.
  • Dynamic risk assessment — pre-deployment, continuous, event-triggered, and periodic — replaces the static assessment model that is insufficient for systems whose behavior changes over time.

© FlowRidge.io — COMPEL AI Transformation Methodology. All rights reserved.