Agent Harness Series — Part 2 of 7

Agent Harnesses and Compliance in Banking

Banking invented enterprise governance. Now it has to apply it to AI agents.

Walk into any major bank’s operations centre and you’ll find controls everywhere. The four-eyes principle: every significant transaction reviewed by two people. Maker-checker: no one approves their own work. Segregation of duties: the person who initiates a payment cannot be the person who authorises it. These aren’t bureaucratic habits. They’re the accumulated wisdom of an industry that learned, sometimes the hard way, what happens when checks and balances fail.

Now introduce AI agents into that environment, systems that can independently plan, make decisions, and take actions across multiple tools and datasets in a single session, and ask yourself: where’s the maker-checker for the agent?

That question is no longer hypothetical. Agentic AI is arriving in banking faster than most compliance teams realise, and the governance frameworks haven’t kept pace.

The use cases are compelling

The commercial pressure to deploy agents in banking is significant, and understandable. The productivity gains in the right domains are substantial.

KYC and AML processes are seeing some of the most dramatic results. Banks report up to a 90% reduction in client onboarding time and 50% reductions in the time required per AML investigation when agents are put to work on document review, entity resolution, and alert triage. Credit risk memo creation, a process that traditionally consumes weeks of analyst time, is showing 20–60% productivity gains with 30% improvements in turnaround. Fraud detection, compliance reporting automation, customer service personalisation, and strategic scenario modelling are all active deployment areas.

McKinsey’s analysis puts the potential at a 30–40% cost reduction and 30% profitability improvement for financial services by 2030. The numbers explain why 70% of financial services organisations are exploring agentic AI deployments. However, only 14% have achieved anything approaching full-scale implementation, and the gap between those two figures is not a technology problem. It’s a governance problem.

The compliance environment doesn’t bend

Australasian banks operate in one of the most demanding regulatory environments in the world, and the regulators are paying close attention to AI.

APRA & Privacy Act: Key Obligations

  • CPS 234: AI models, training data, and inferences are within the information security remit. Effective now.
  • CPS 230: Manage AI systems and vendors as operational failure points. Vendor contract compliance deadline: 1 July 2026.
  • Privacy Act: Explain automated decisions to affected individuals; penalties up to $50 million. Effective 10 December 2026.

APRA’s CPS 234 places AI models, training data, and inferences explicitly in scope. The obligation to maintain information assets securely doesn’t contain an exemption for AI-generated outputs. When an agent produces a credit recommendation, that recommendation is an information asset, and the processes that produced it are within CPS 234’s remit.

CPS 230, APRA’s operational risk management standard, introduces a further complication. It requires regulated entities to manage AI systems and their vendors as potential operational failure points. The vendor contract compliance deadline is 1 July 2026. Banks that have deployed third-party agent tooling without addressing the operational risk management requirements are already in a difficult position.

Privacy Act amendments scheduled for 10 December 2026 add a transparency dimension. Regulated entities will be required to explain automated decisions to affected individuals, with penalties for non-compliance reaching $50 million. An agent that generates a credit decision and leaves no interpretable audit trail isn’t just a governance risk. It’s a potential $50 million liability.

None of this is surprising. Regulators have always extended existing frameworks to cover new technology, and APRA’s 2026 supervisory priorities explicitly call out AI risk. The AI-specific exemption that some organisations seem to be waiting for is not coming.

A perfect storm for compliance

The challenge in banking isn’t that agentic AI is inherently unmanageable. It’s that three forces are colliding simultaneously, and the interaction between them is what makes this hard.

The first force is commercial pressure. The productivity gains are real, the competitive pressure to deploy is real, and the tolerance for slow rollouts is low. Business units that can see a 50% reduction in AML investigation time are not going to wait indefinitely for compliance teams to catch up.

The second force is the hallucination problem. Language models can and do generate plausible-sounding outputs that are factually incorrect. In most applications this is an inconvenience. In a regulated banking context where an agent is drafting credit memos, flagging AML alerts, or summarising regulatory obligations, a confident error is a material risk. The model doesn’t know what it doesn’t know, and it won’t stop to tell you.

The third force is the audit trail deficiency. This is where agentic AI creates a genuinely new problem, rather than an old problem in a new form. A human analyst making a credit decision leaves a trail: emails, system logs, approvals, documented rationale. An agent completing an equivalent decision sequence in seconds leaves a conversation transcript. That transcript is not queryable, not mappable to existing audit frameworks, and not defensible to a regulator asking for the decision logic.

JPMorgan, Goldman Sachs, and Wells Fargo were among the banks that deployed ChatGPT-style tools in 2023 and then quietly restricted access. They weren’t reacting to a technology failure; they were reacting to the governance gap. The models were capable, but the controls weren’t there.

What enterprise-grade agent governance looks like in banking

The answer isn’t to slow down agentic AI deployment in banking. The competitive case is too strong, and the productivity gains are too real. The answer is to ensure that agents operate within a governance layer that is as rigorous as the rest of the bank’s control environment, with full provenance and auditability.

That governance layer has a name: a semantic AI harness. Rather than wrapping an agent in generic controls, a semantic harness binds agent behaviour to the compliance standards themselves, and to the organisation’s specific implementation of those standards in its compliance, risk, and controls policies. Key to the approach is defining these in a form that is both human- and machine-readable: an ontology that gives the bank and its agents a formal representation of regulatory obligations, risk appetite, decision authority, and data classification. With the semantic harness in place, the agent doesn’t just follow rules; it operates within a structured model of what those rules mean, a meaningful distinction when the rules are as organisation-specific as APRA’s requirements.
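To make the ontology idea concrete, here is a minimal sketch of what one machine-readable obligation entry might look like. Every identifier, field name, and value below is an illustrative assumption, not GRL’s actual schema.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class Obligation:
    """One regulatory obligation, expressed as a machine-readable ontology entry."""
    obligation_id: str        # hypothetical internal identifier
    standard: str             # source regulation the entry derives from
    summary: str              # human-readable statement of the obligation
    owner: str                # accountable role within the bank
    decision_authority: str   # who may approve actions touching this obligation
    data_classification: str  # sensitivity of the data involved
    review_date: str          # ISO date of the next scheduled review

# A hypothetical entry a harness could serve to agents at startup.
cps230_vendor_risk = Obligation(
    obligation_id="CPS230-OR-014",
    standard="APRA CPS 230",
    summary="Material service providers are managed as operational failure points.",
    owner="Head of Operational Risk",
    decision_authority="Operational Risk Committee",
    data_classification="Internal",
    review_date="2026-07-01",
)
```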

GRL’s Semantic AI Harness (SAH) demonstrates this directly. The SAH surfaces these regulatory controls via an MCP server, which agents read at startup, binding themselves to the organisation’s constraints, decision authority, and governance requirements. When an agent encounters a decision that touches an operational risk obligation, the harness captures it: the decision, the rationale, the alternatives considered, and the downstream consequences. The result is not a transcript but a structured, queryable governance record that maps to existing audit frameworks.
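What “structured and queryable” means in practice is easiest to see as a record schema. The sketch below is an assumption about the shape of such a record, not the SAH’s actual format: every field a regulator would ask about is a first-class attribute rather than prose buried in a transcript.

```python
from dataclasses import dataclass, field
from datetime import datetime, timezone

@dataclass
class DecisionRecord:
    """A governance record for one agent decision (illustrative schema)."""
    agent_id: str
    user_id: str
    obligation_ids: list[str]   # ontology entries the decision touched
    decision: str               # what the agent decided
    rationale: str              # why, stated against the bound controls
    alternatives: list[str]     # options considered and rejected
    consequences: list[str]     # downstream actions the decision triggers
    recorded_at: str = field(
        default_factory=lambda: datetime.now(timezone.utc).isoformat()
    )

# Because every record carries obligation_ids, an auditor can filter the log
# by regulation rather than reading transcripts.
def records_for(log: list[DecisionRecord], obligation_id: str) -> list[DecisionRecord]:
    return [r for r in log if obligation_id in r.obligation_ids]
```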

The SAH takes this several steps further. A bank’s compliance team typically fields hundreds of questions a month from business staff. Suppose a business user preparing for a credit committee meeting asks which compliance obligations apply to a new SME lending product. In a world without governed AI, that question sits in a queue and waits for a compliance specialist. With a governed compliance chatbot, it receives a precise answer grounded in the bank’s own compliance register: which obligations apply, who owns each one, current control status, and a flag that one obligation has a review date coming up that they should confirm before the committee.

The chatbot’s answer is not a general AI understanding of banking regulation. It is drawn only from what the bank’s Regulatory Architecture team has formally published. If a policy exists but hasn’t been formally registered, the chatbot says so and directs the user to the right human.
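The grounding rule itself is simple to state in code. A minimal sketch, reusing the hypothetical Obligation entries from the earlier sketch: the chatbot answers only from published register entries, and returns an explicit gap response otherwise.

```python
def answer_from_register(topic: str, register: dict[str, Obligation]) -> str:
    """Answer only from formally published register entries; never speculate."""
    entry = register.get(topic)
    if entry is None:
        # The boundary response: state what is missing and route to a human.
        return (
            "I don't have a published policy on this in the compliance register. "
            "Please contact the responsible regulatory team."
        )
    return (
        f"{entry.standard}: {entry.summary} "
        f"Owner: {entry.owner}. Next review: {entry.review_date}."
    )
```

The design choice is that the fallback branch is a feature, not an error path: an unregistered topic produces a routed question, never a confident guess.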

Similarly, a compliance analyst may ask whether a specific AML control gap from a Q3 audit has been formally closed. The chatbot checks the live compliance register: the gap is still open, it shows when the record was last updated, and it directs the analyst to the Financial Crime Compliance team to confirm remediation and record closure. The analyst can submit her quarterly report with confidence rather than uncertainty.

The SAH also routes information and decision flows between responsible parties, and surfaces regulatory coverage gaps that need addressing. For example, an analyst asks about the bank’s policy on crypto asset custody under recent ASIC guidance. The answer: “I don’t have a published policy on this in the compliance register. This may mean the bank’s position is still being assessed, or a policy exists but hasn’t been formally published. I’d recommend contacting Sarah Lim in Regulatory Affairs.” The chatbot does not speculate or summarise documents it hasn’t been given, and the question and the gap are routed to the responsible regulatory team for assessment. A governance dashboard gives the team a full audit and provenance trail: which queries were made, by which agent and user, what gaps exist and which regulations they fall under, with full export for external and internal audits.
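Underneath, that dashboard and its export are just queries over the same structured records. A sketch under the same illustrative DecisionRecord schema as above:

```python
import csv
import io

def export_audit_trail(log: list[DecisionRecord]) -> str:
    """Flatten the governance log to CSV for external or internal audit."""
    out = io.StringIO()
    writer = csv.writer(out)
    writer.writerow(["recorded_at", "agent_id", "user_id", "obligations", "decision"])
    for r in log:
        writer.writerow([
            r.recorded_at, r.agent_id, r.user_id,
            ";".join(r.obligation_ids), r.decision,
        ])
    return out.getvalue()
```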

The other critical advantage of the SAH is agility. When the regulator updates a compliance rule, you update the ontology definition. Every API, every report, every AI agent that references that rule automatically reflects the change. The regeneration takes minutes. The audit trail shows exactly when the change was made and what it affected. You close the compliance gap in hours, not months, and you can prove to the regulator precisely when you were compliant.
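The agility follows from the single-source-of-truth design: the rule lives in one ontology entry, so a regulatory change is one versioned update. A minimal sketch, again assuming the hypothetical schema from the earlier sketches:

```python
from dataclasses import replace
from datetime import datetime, timezone

change_log: list[dict] = []  # the audit trail of ontology changes

def update_obligation(register: dict[str, Obligation],
                      obligation_id: str, **changes) -> Obligation:
    """Apply a regulatory change to one ontology entry and log exactly when."""
    updated = replace(register[obligation_id], **changes)  # frozen: new version
    register[obligation_id] = updated
    change_log.append({
        "obligation_id": obligation_id,
        "changed_fields": sorted(changes),
        "changed_at": datetime.now(timezone.utc).isoformat(),
    })
    return updated

# Every API, report, and agent that reads the register now sees the new version,
# and change_log proves exactly when the rule changed.
register = {"CPS230-OR-014": cps230_vendor_risk}  # entry from the earlier sketch
update_obligation(register, "CPS230-OR-014", review_date="2027-07-01")
```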

In a regulated environment, an AI that knows the boundary of its authorised knowledge and holds that boundary is not a limitation; it is the product.

The organisations that will deploy agentic AI at scale in banking are the ones that treat the harness as the compliance infrastructure it is, not an afterthought added once the agent is already in production. Regulators are not going to distinguish between a decision made by an analyst and a decision made by an agent. The obligation to document, explain, and defend that decision applies equally to both.

The four-eyes principle took decades to become universal in banking. The maker-checker for AI doesn’t have that kind of time.


This is the second in a series on enterprise agent harnesses. The first article, Agent Harnesses Are an Enterprise Architecture Problem, establishes the governance gap. Next: how harnessed agents can accelerate modern enterprise architecture rather than threatening it.
