There is a security assumption baked into almost every enterprise security framework that is no longer valid.
The assumption is that actions with meaningful consequences require human intent. A transaction is processed because a human approved it. A file is deleted because a human chose to delete it. An email is sent because a human composed and sent it. The human is the actor, and the human is the point of control.
Agentic AI breaks this assumption completely. An agent processes the transaction, deletes the file, and sends the email — autonomously, as part of executing a goal it was given. The human set the goal. The agent made the decisions. And if those decisions were manipulated, compromised, or simply wrong, the consequences are real before anyone notices.
The security implications are profound, and most enterprise security frameworks have not caught up.
Understanding the New Attack Surface
Before designing controls, IT security leaders need to understand what is genuinely new about the agentic AI attack surface.
Prompt injection is the most significant novel attack vector. Unlike traditional software that executes deterministic code, AI agents interpret natural language instructions and content as part of their decision-making. This creates an attack path where malicious content embedded in the agent's environment — a document it reads, a webpage it visits, an email it processes — can contain hidden instructions that alter the agent's behaviour.
A well-designed prompt injection attack is invisible. The document looks normal to a human reviewer. The embedded instruction is formatted to be interpreted by the AI model, not by the human reading it. The agent follows the injected instruction believing it to be legitimate. The manipulation leaves no obvious trace in standard audit logs.
This is not a hypothetical. Security researchers have demonstrated successful prompt injection attacks against production agent systems across multiple platforms. Financial services firms, healthcare organisations, and government agencies using agentic AI without prompt injection controls are exposed right now.
Non-human identity sprawl is the second major new risk. Every agent needs credentials to do its work — API keys, service accounts, OAuth tokens, database access. In traditional environments, identity and access management is built around human users. Provisioning follows an approval process tied to a human requester. Deprovisioning happens when an employee leaves.
Agents do not leave. They accumulate access. Without explicit governance around agent identity lifecycle, organisations quickly develop a sprawl of service accounts with excessive permissions, no clear ownership, and no systematic review process. Each of these accounts is a potential attack surface.
Excessive agency is the third risk category. Agents given broad permissions to complete goals will use those permissions in ways that designers did not anticipate. An agent with read access to the entire file system and the ability to send emails has the technical capability to exfiltrate sensitive data even if that was never the intent. The principle of least privilege, well-understood in traditional access management, becomes even more critical — and harder to implement — when the actor is an AI system whose behaviour is probabilistic.
Building a Security Framework for Agentic AI
The framework needs to operate at three levels: before the agent is deployed, while the agent is running, and when things go wrong.
Before Deployment: Governance and Design Controls
Threat modelling for agent workflows should be mandatory before any agent goes into production. For each agent, the security team should map the actions the agent can take, the data it can access, the external systems it can call, and the consequences of each possible action being manipulated. The threat model is the foundation for all subsequent controls.
Minimum viable permissions should be the design principle for every agent identity. An agent that processes invoices does not need access to HR systems. An agent that answers customer queries does not need write access to the database. Designing agent permissions to the minimum necessary for the task — and enforcing those limits at the infrastructure level, not just through prompting — is the most important single control you can implement.
Prompt hardening reduces but does not eliminate prompt injection risk. System prompts should explicitly instruct the agent to ignore instructions embedded in external content, to refuse requests that fall outside its defined scope, and to escalate to a human when it encounters ambiguous or unusual instructions. These instructions will not stop sophisticated attacks, but they raise the bar significantly.
Human approval gates for high-consequence actions are non-negotiable in early deployments. Any action that is difficult or impossible to reverse — sending external communications, making financial transactions, deleting data, modifying production systems — should require explicit human approval before execution. As confidence in the agent's behaviour builds over time, the scope of autonomous action can be expanded deliberately and with evidence.
During Operation: Monitoring and Detection
Semantic logging goes beyond traditional action logging to capture the reasoning behind agent decisions. A log that records "agent deleted file X" is useful. A log that records "agent deleted file X because it interpreted instruction Y in document Z as authorisation to do so" is significantly more useful for both incident investigation and for identifying potential manipulation.
Anomaly detection on agent behaviour should be a standard component of any agent monitoring stack. Establish a baseline of normal agent behaviour — typical action volumes, typical data access patterns, typical external API calls — and alert on significant deviations. An agent that suddenly starts accessing data it has never touched before, or making external calls to unusual endpoints, warrants immediate investigation.
Rate limiting and cost controls at the infrastructure level prevent a compromised or misbehaving agent from causing unlimited damage. Hard limits on the number of actions per time period, the volume of data accessed, and the cost of external API calls are simple controls that significantly reduce the blast radius of an incident.
Continuous permission review should apply to agent identities on the same cadence as human identity reviews — at minimum quarterly. Agent permissions have a tendency to expand over time as functionality is added. Regular review keeps agent access aligned with current requirements rather than accumulated history.
When Things Go Wrong: Response and Recovery
Agent kill switches — the ability to immediately suspend all activity for a specific agent or class of agents — should be designed into every production deployment. The ability to stop an agent instantly and safely, without disrupting the broader system, is a fundamental resilience requirement.
Rollback capabilities for agent-initiated changes should be assessed before deployment, not after an incident. For each action class an agent can perform, the security design should answer the question: if this action turns out to have been unauthorised or erroneous, can it be reversed, and how quickly?
Incident response playbooks specific to agentic AI scenarios should be developed before they are needed. When a prompt injection attack is suspected, what is the immediate containment action? Who is notified? What evidence needs to be preserved? How is the agent's decision chain reconstructed? These questions have different answers from traditional security incidents and benefit enormously from prior preparation.
The Governance Layer
Technical controls are necessary but not sufficient. The governance structures that surround agentic AI deployment are equally important.
Every agent in production should have a named owner — a human who is accountable for the agent's behaviour, its access, and the quality of its outputs. The owner is responsible for reviewing anomaly alerts, approving permission changes, and triggering the kill switch when needed. Agents without named owners are governance blind spots.
An agent registry — a centralised inventory of every agent deployed, its permissions, its purpose, its owner, and its last review date — provides the visibility needed for portfolio-level security governance. Without the registry, security teams cannot know what they are protecting.
Vendor security assessment for third-party agentic AI products should apply the same rigour as any other enterprise software procurement. Specifically, the assessment should examine how the vendor handles prompt injection, how agent permissions are scoped, what audit logging is available, and what the vendor's incident response obligations are.
The Honest Reality
Most organisations deploying agentic AI today have none of these controls fully in place. The technology is moving faster than the security frameworks that should govern it.
This is not an argument to slow down AI adoption. It is an argument to invest in security infrastructure in parallel with deployment, not sequentially. The organisations that get this right are not the ones that waited until their security framework was perfect before deploying agents. They are the ones that deployed deliberately, with explicit acknowledgement of what controls were in place and what risks were being accepted, and systematically closed the gaps over time.
The window to build security infrastructure ahead of the risk is still open. It will not stay open indefinitely.
Conclusion
Agentic AI is not just a productivity tool. It is an autonomous actor with real credentials, real access, and the ability to take real actions with real consequences.
Securing an environment that contains agentic AI requires a security posture that treats agents as a distinct category of identity — different from humans, different from traditional software — with its own threat model, its own access governance, and its own monitoring requirements.
The IT and security leaders who build this posture now will be significantly better positioned than those who discover the gaps after the first incident.



