The Diagnosis Gap

Most organisations do not have an AI tooling problem. They have an AI diagnosis problem.

Over the past year, the pattern has been consistent across regulated, high-trust, and publicly accountable environments. Teams are experimenting. Tools are being tested. Pilots are being launched. But underneath the activity, there is often very little clarity on where AI should genuinely be used, where it should not, and what must stay under human control.

What I see, repeatedly, is capability being explored before applicability is properly understood.

The expanded thinking below draws on the exchange that followed the original post on LinkedIn. Several of the framings here were sharpened, or in some cases first surfaced, by contributors whose comments made the analysis stronger than I could have produced alone. Where their framings have shaped the thinking, I have named them.

Capability before applicability

In one engagement, a team began automating a downstream process that handled payment-sensitive data before anyone had mapped the upstream dependencies, the exception-handling pathways, or who would be accountable when the output was wrong. The technology worked. The organisational readiness did not exist. The automation was pulled back within weeks.

A model can summarise, generate, classify, automate. But that does not mean it should be placed inside a live process that affects customers, payments, decisions, or outcomes.

In regulated and high-trust environments, this gap matters more than most organisations recognise. The question is not “can this be automated?” It is “are we prepared to stand behind this when it operates in the real world?” The majority of organisations have not yet done the work to answer that properly. Not because they lack ambition, but because they have not stepped back and mapped how their processes actually work today, where the real bottlenecks sit, where AI creates genuine value versus noise, and where risk is being introduced rather than reduced.

Until that diagnostic work is done, AI adoption becomes fragmented. Interesting in parts. Impressive in demos. Difficult to scale with confidence.

Sequencing is a leadership decision

Kunel Patel, a Chief Information Security Officer working in regulated financial services, sharpened the point further. Most AI failures, in his framing, are not technology failures - they are sequencing failures. And sequencing is a leadership decision, not a technical one.

That distinction matters because it changes where the responsibility sits. The diagnostic gap is rarely caused by engineers making poor choices. It is caused by sequencing never being questioned at the level where priorities are set. When leadership treats AI adoption as a capability race rather than an applicability decision, the technical teams inherit a sequence that was never properly examined. By the time the gap surfaces, it has already been built into the programme.

Michael Richards, Principal Architect at the Mojaloop Foundation, added a related observation that is worth holding alongside this one. AI-generated code has collapsed the marginal cost of mis-specification. People are not better at understanding their processes than they were before. They simply discover the gaps faster. The dangerous assumption is that because code arrives in minutes, the testing and process redesign that used to occupy the intervening months will somehow happen by itself. The slow pace of pre-AI delivery acted as a buffer that masked bad design, and that buffer is now gone. Mis-specification is exposed instantly, but the organisational maturity needed to respond has not accelerated at the same rate.

What the diagnostic actually exposes

Whose judgement is actually holding the operation together, and has it ever been documented?

That question is rarely abstract. One contributor to the original thread described an insurance claims pilot in precisely those terms. The model performed brilliantly in testing. Then, in production, it began auto-routing edge cases that the most experienced human handlers had always flagged for manual review. Nobody had mapped those judgement calls. As the contributor put it: they were invisible until they weren’t. The tooling was fine. The diagnosis was missing.

That is the shape of the gap. It is not a model failure or a deployment failure. It is the prior absence of a documented map of where human expertise was quietly carrying the operation, and the loss of that expertise the moment a model takes over without it.

The pattern recurs across sectors. Eugene Chan PhD, a behavioural scientist working on AI trust, captured it as cleanly as anyone in the discussion: the tech is ready, but the organisation is not. In regulated environments, that gap is where trust and accountability quietly fall apart - not in the model, but in the prior failure to be honest about what has not yet been understood.

Pierre Oberholzer PhD, working at the intersection of AI and ISO 20022, added a further dimension. Once AI moves into a live process, the diagnosis must extend into testing - clearly defined objectives, boundaries, and verification of outcomes both intermediate and final, not just “it works in a demo.” In regulated environments, trust has to be engineered. It cannot be assumed.
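To make "boundaries" concrete, here is a minimal Python sketch of what a documented routing boundary might look like for the claims example above. It is an illustration only, not a description of any contributor's system: the Claim fields, the thresholds, and the flag names are all hypothetical. The point it shows is that the judgement calls experienced handlers make have to be written down as explicit, testable rules before a model is allowed to route anything automatically.

```python
from dataclasses import dataclass

# All names and thresholds below are hypothetical, chosen for illustration.
MAX_AUTO_AMOUNT = 5_000.0      # assumed ceiling for automated handling
MIN_CONFIDENCE = 0.90          # assumed minimum model confidence
MANUAL_REVIEW_FLAGS = frozenset({"prior_dispute", "third_party_injury"})


@dataclass(frozen=True)
class Claim:
    claim_id: str
    amount: float
    model_confidence: float            # assumed score in [0, 1]
    flags: frozenset = frozenset()     # markers a human handler would notice


def route(claim):
    """Return (destination, reasons); every manual routing is explained."""
    reasons = []
    if claim.amount > MAX_AUTO_AMOUNT:
        reasons.append("amount above automation boundary")
    if claim.model_confidence < MIN_CONFIDENCE:
        reasons.append("model confidence below threshold")
    matched = claim.flags & MANUAL_REVIEW_FLAGS
    if matched:
        reasons.append("handler rule: " + ", ".join(sorted(matched)))
    if reasons:
        return ("manual_review", reasons)
    return ("auto", [])
```

A rule set like this is only as good as the mapping work behind it. If a handler's judgement call never made it into the documented flags, the sketch above would auto-route that case just as the pilot did.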

The structural alternative

MyongHak Jung, founder of I.N.G., made a deeper observation about the shape of the problem itself. We are still trying to solve an unbounded problem with bounded tools. If the space of possible states is effectively infinite, then improving models, adding rules, or increasing testing coverage will always remain asymptotic at best. You do not converge. You just delay failure.

The alternative he proposes is not better prediction, but structural constraint. Instead of asking how to validate outcomes across infinite permutations, the question becomes how to anchor the moment of execution so that the state itself is no longer open to reinterpretation. When the moment is fixed - with time, location, and context bound at the point of occurrence - the system is no longer reasoning about what might have happened. It is referencing what cannot be changed.

That framing sits adjacent to the diagnostic argument rather than displacing it. The diagnostic addresses the question of where AI should operate. The structural-constraint argument addresses what becomes provable once it does. Both are needed.
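One way to picture "referencing what cannot be changed" is a record whose time, location, and context are fixed at the point of occurrence and then fingerprinted, so that any later reinterpretation is detectable. The Python sketch below is a generic illustration using a SHA-256 digest. It is an assumption about how such binding could look, not a description of the approach MyongHak Jung proposes, and the ExecutionRecord fields and the seal and verify helpers are hypothetical.

```python
import hashlib
import hmac
import json
from dataclasses import asdict, dataclass


@dataclass(frozen=True)  # frozen: fields cannot be reassigned after creation
class ExecutionRecord:
    occurred_at: str   # assumed ISO 8601 UTC timestamp
    location: str      # assumed identifier for where the event occurred
    context: str       # assumed description of what was executed


def seal(record):
    """Return a SHA-256 hex digest over a canonical JSON form of the record."""
    payload = json.dumps(asdict(record), sort_keys=True, separators=(",", ":"))
    return hashlib.sha256(payload.encode("utf-8")).hexdigest()


def verify(record, digest):
    """True only if the record still matches the digest taken at execution."""
    return hmac.compare_digest(seal(record), digest)
```

In use, the digest would be taken at the moment of execution and stored somewhere the executing system cannot rewrite. A later audit then checks the record against that digest instead of reasoning about what might have happened.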

What the leadership decision sounds like

A reader on the original thread, Dr Vaishali Dixit, asked the question almost everyone in this space is quietly thinking: has anyone actually seen the diagnostic work get done before deployment, or does it almost always happen after something goes wrong?

Honestly, rarely. In most cases I have seen, the diagnostic work either happens after a pilot has already been pulled back, or it gets compressed into a tick-box exercise so the team can move to the part they are excited about. The organisations that do it properly tend to share one characteristic: someone senior enough insisted on understanding before deploying, and was willing to slow things down long enough for the mapping work to actually happen.

That is not a technology decision. It is a leadership decision. And it is the one that most often does not get made until something has already gone wrong.

What it exposes

Tim Zlomke, founder of Moral Clarity AI, stated the underlying point plainly. AI does not introduce the risk. It exposes that the organisation never resolved it.

That is the diagnosis gap. And it is closable - but only by the people willing to look first.