The Model Was Never the Problem

At the AI publishing platform I was running, we built a quality check that worked beautifully. Within weeks the flags were being muted. I keep coming back to it – because every piece of content I read about agentic AI in manufacturing has the same shape.

There's a moment I keep coming back to.

At the AI publishing platform I was running – the one turning personalised children's books into thousand-copy print runs – we built a quality check into the pipeline. It flagged any book where the generated text and the generated illustrations had drifted apart. A mismatch so subtle you'd miss it skimming, caught by the model every time.

The check worked. It kept flagging.

And slowly, over a few weeks, the flags stopped getting picked up. Production was fast. The reviewer queue was slow. Nobody had decided, when we built the check, whose day would get interrupted by a flag. The operator assumed it was the reviewer. The reviewer assumed it was engineering. Engineering assumed it was operations.

The flags piled up. Then they got muted. Then the check got turned off.

The model wasn't the problem. The model was excellent.

I think about that check a lot now, because every piece of content I read about agentic AI in manufacturing has the same shape as that first deployment.

The top-line numbers are stunning. Deloitte projects agentic-AI adoption in manufacturing jumping from six percent to twenty-four percent this year. McKinsey, Google Cloud and others are reporting 170-plus-percent ROI on the deployments that land. Siemens and NVIDIA have announced plans for what they call the world's first fully AI-driven, adaptive factory, with Siemens' electronics site in Erlangen as the blueprint.

Everyone is writing as if 2026 is the year.

And then, in the footnote of the same reports, the second number.

Thirty-eight percent of manufacturers are piloting agents. Eleven percent have anything in production. Gartner predicts more than forty percent of agentic AI projects will be cancelled by the end of next year.

That is not a model problem. That is the quality check getting switched off, at scale, across an entire industry.

The most useful frame I have found for this comes from a line buried in a McKinsey piece. About eighty percent of the work of shipping an agentic system is not modelling. It is data engineering, stakeholder alignment, governance and workflow integration.

Put differently – the model is the last twenty percent. The model is the easy part.

The hard part is the part nobody wants to write a blog post about. Who owns the alert? Which system of record does the agent write to? What happens when the agent is wrong? Whose name is on the audit trail? How do you rotate the credentials of forty autonomous processes? What does "good" even look like when nobody agreed on the metric before the pilot started?

This is not new. This is every wave of new technology entering an old industry, from PLCs to MES to MRP II to Industry 4.0. The shiny part lands fast. The boring part – the governance, the handoffs, the roles, the small daily rituals – takes years. And that is where pilots go to die.

What is different in 2026 is the speed mismatch.

An agentic system does not sit there waiting for someone to pick up the flag. It acts. It places orders. It re-sequences production. It emails suppliers. One compromise cascades through a fleet of agents at machine speed. One ambiguous handoff gets exercised thousands of times before anyone looks up.

Autonomy does not create the governance problem. It exposes the one you already had.

The large manufacturers, the ones where all the analyst stories are written, feel this most. They have the most process debt, the most service accounts, the most legacy toolchains without callable APIs, the most meetings required to change anything. They have the most to redesign and the most people who have to agree on the redesign.

This is, quietly, why I find the whole space more interesting from the small end.

A fifty-person factory is not trapped in pilot purgatory the same way.

It has fewer systems of record. Fewer stakeholders. Fewer embedded rituals to unpick. It can greenfield the governance layer alongside the agents instead of bolting it on afterwards. It can decide from day one that every agent action writes to a single audit log, that every credential rotates on a schedule, that every ambiguous flag lands in a specific human's queue.
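To make "governed by construction" concrete, here is a minimal sketch of two of those day-one decisions: a single audit log, and a rule that no flag can be raised unless a specific person owns it. Everything in it is hypothetical and for illustration only (the `Governance` and `Flag` names, the check name, the owner label); credential rotation is left out, and a real system would persist the log rather than hold it in memory.

```python
from dataclasses import dataclass, field
from datetime import datetime, timezone


@dataclass
class Flag:
    check: str    # which check raised it (hypothetical name)
    item_id: str  # the thing being flagged
    detail: str


@dataclass
class Governance:
    # check name -> the one named human whose queue its flags land in
    owners: dict[str, str]
    audit_log: list[dict] = field(default_factory=list)
    queues: dict[str, list[Flag]] = field(default_factory=dict)

    def record(self, actor: str, action: str, target: str) -> None:
        """Append an action to the single, shared audit log."""
        self.audit_log.append({
            "at": datetime.now(timezone.utc).isoformat(),
            "actor": actor,
            "action": action,
            "target": target,
        })

    def raise_flag(self, agent: str, flag: Flag) -> str:
        """Route a flag to its named owner; refuse if nobody owns it."""
        owner = self.owners.get(flag.check)
        if owner is None:
            # Fail loudly instead of letting unowned flags pile up unread.
            raise LookupError(f"no owner assigned for check {flag.check!r}")
        self.queues.setdefault(owner, []).append(flag)
        self.record(agent, f"flag:{flag.check}", flag.item_id)
        return owner


# Usage: the owner is decided before the first flag is ever raised.
gov = Governance(owners={"text_illustration_drift": "reviewer_on_call"})
owner = gov.raise_flag(
    "qc_agent",
    Flag("text_illustration_drift", "book-0042", "page 7 mismatch"),
)
```

The design choice that matters is the `LookupError`: a check with no owner cannot raise a flag at all, so the ownership question has to be answered before the pilot starts, not after the queue fills up.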

The large players are building the Siemens-scale stack. The open question is what happens when the same agent patterns – supply orchestration, autonomous quality, production replanning – get assembled from commodity parts. Foundation models. Open protocols. Commodity vision. A small MES. A file of rules a human can actually read.

The unit economics shift. The fixed cost of running a line collapses. Small teams can run small batches the way small teams already run software.

That is the version of agentic manufacturing I am watching for.

Not the AI-driven mega-factory. The factory that looks more like a codebase. Versioned. Forkable. Small by design. Governed by construction rather than by committee.

The hardest thing I have had to re-learn, moving from polymers to AI, is that most of what looks like a technology problem is a people-and-process problem wearing a technology costume.

The models are getting better on a schedule we cannot change. The governance is on a schedule we can.

If thirty-eight percent of manufacturers are piloting agents and only eleven percent have anything in production, the gap is not the model. The gap is whether anyone decided, before the pilot, who owns the flag.

The check in our pipeline didn't fail because it couldn't see.

It failed because nobody was looking.