Will Schenk and I just came back from NVIDIA GTC. With more than 1,000 sessions featuring many AI leaders, the conference can, in many ways, serve as a litmus test for where the AI industry is right now. Combining my analytical approach with Will’s practical experience helping companies work with AI more effectively through TheFocus.AI, we want to share the bigger picture we saw there.
In today’s episode, we will cover:
AI in the business vs. AI on the business
Why the agent version of Cold Start is political
The shadow organization the official one cannot touch
What the official path has to carry if it wants to win
Why the measuring currency is failing
Retrofit, not subsidiary
The real divide
Why AI progress is outpacing organizational readiness
For the last two years, the central question in AI was technological: how capable are the models?
That is still an important question. But for many companies, it is no longer the binding one.
The capabilities are arriving faster than organizations can absorb them. Models can reason better, search better, code better, summarize better, and operate across increasingly long chains of tasks. At GTC 2026, that was visible everywhere. Jensen Huang reframed NVIDIA itself as a "token factory." Session after session pushed the same message from different angles: intelligence is becoming an operational resource, something companies will produce, route, govern, and consume at scale. Yet inside most firms, the actual structure of work remains difficult to expose, difficult to verify, and difficult to translate into a form a machine can act on reliably.
This is where many AI discussions are now going wrong. Companies say they want AI, but what they usually mean is that the technology is so hyped that they are afraid of falling behind. What they often do not have is a clear enough understanding of how their own work actually gets done. They do not have clean process maps, clear exceptions, reliable ownership, strong feedback loops, or even a shared definition of what good execution looks like in a form a machine can follow. What they do have is habit, tacit knowledge, local judgment, undocumented workarounds, and senior people who can spot when something is wrong instantly but would have a hard time explaining exactly how they know.
The knowledge exists. It is just stored in people, not in systems.
That is why the most important AI work happening inside companies right now is not model selection. It is organizational translation. It is the work of turning messy institutional memory into context, turning context into bounded action, and turning human correction into a learning loop.
A historical analogy helps here, if used carefully. Electric motors arrived in factories in the 1880s, but the big productivity gains did not show up until the 1920s, a lag of roughly forty years. Early adopters often kept the old steam-era layout and simply replaced the power source. The real gains came later, when factories were redesigned around distributed electric power, with new floor plans, workflows, and assumptions about coordination and automation.
AI is pushing companies toward a similar threshold. Expecting the gains to come from dropping a new capability into an old organizational structure is just hoping for magic. The true benefit comes from redesigning the system around what the capability makes possible. That, underneath the demos and deployment stories, was one of the clearest lessons from GTC.
The hidden bottleneck in enterprise AI adoption
Most companies are not legible to machines.
This sounds abstract, but it is painfully concrete in practice. Work inside firms runs on partial documentation, institutional lore, ambiguous ownership, and constant exception handling. Teams say they have a process when what they often have is a stable pattern of improvisation. Humans can operate inside that environment because they absorb context socially. They know who to ask, which shortcut is acceptable, which dashboard lies, which metric matters, and which exception is normal enough to ignore. A model peeking into the same environment sees only fragments.
That is why so many AI pilots look impressive in demos and then fall apart in real use. The model can handle generic tasks, but the actual workflow is full of hidden dependencies, unwritten rules, and quality standards that were never built into the system. The company thought it was buying intelligence. What it actually discovered was how much of its own work had never been clearly defined.
And that’s a representation problem.
If the work is not described in a form the system can access, check, verify, and act on, better models will help less than people think. You can improve reasoning, speed, and multimodality, but none of that solves the basic problem: a machine cannot reliably work with knowledge that only lives in people’s heads.
GTC offered a very clear example. NVIDIA’s chip-design team said their first attempt in 2023, a fine-tuned domain expert, failed completely. Not because the model was terrible, but because hardware engineering is a domain where correctness is non-negotiable and answers without traceability are useless. If a system cannot show where an answer came from, engineers will not trust it.
Since then, a few things have changed. Yes, the models got better, but it was the relationship between the model and the company’s knowledge that really had to change. Engineers curated their own documents. Responses became traceable to sources. Verifiability stopped being an afterthought and became part of the system.
Once traceability and verifiability were in place, engineers could trust the system’s responses, and that trust was key to driving adoption.
Why AI copilots and pilots so often disappoint
The first wave of enterprise AI was mostly framed as assistance: chat interfaces, summarization tools, coding help, retrieval, and light automation. The goal was simple enough: save time, improve throughput, reduce friction.
Most companies still evaluate AI on time saved. That is natural. It is the easiest metric to measure, and it makes the ROI case straightforward. But it also locks companies into a narrow understanding of what the technology is for. If the best they can imagine is the same work done 30% faster, they will build infrastructure that is good enough for acceleration and nowhere near good enough for the deeper changes that create real value.
That is why so many copilots disappoint. They often do deliver local gains, but they rarely change the shape of the work. Companies add tools without redesigning the workflow around them, which means they get pockets of productivity rather than systems that compound over time.
The more important question is not whether AI saves time. Sometimes it does, and sometimes it takes more time at first. What matters more is whether it changes the timing, range, and structure of what the organization can do at all.
In many workflows, the problem is not only how much work people have to do. The problem is that the right information often arrives too late, moves too slowly, or is too hard to coordinate. When AI changes that, the value is not only speed. It is the ability to do work that the company previously could not do at the right moment.
That distinction changes what companies need to build. If AI is treated as an acceleration layer, then a chatbot may be enough. If it changes what a company can actually do, then the company needs new data flows, verification systems, orchestration logic, and feedback loops. If you think of AI as a chatbot, it is a software purchase. If you look deeper and expect structural change, it is a workflow redesign.
Shraddha Sridhar framed this very clearly, describing three levels of deployment:
Individual productivity, where one engineer uses an agent to move faster.
Team-level scaling of the same pattern.
Capability expansion, and that is where things get more interesting. Her example was a power insights agent that did not simply speed up an existing task. It changed when important information became available. Data that used to arrive too late to influence the chip now showed up months earlier, when engineers could still act on it. That is a different workflow with a different outcome.
When Jensen Huang talks about OpenClaw and says that everyone should have an “openclaw strategy,” that is exactly what it is about. The point is not that every company needs one more assistant sitting on top of the old system. The point is that once AI becomes a real operational layer, the workflow itself becomes the design problem.
What AI workflow redesign actually looks like
Across domains, the architecture of successful AI systems is finally starting to look less mysterious and more consistent. Let’s take a closer look.
First, strong systems do not throw everything into the prompt. They narrow the context. They pick the right documents, the right history, the right signals, and bring them in at the right moment. More context is not always better. Very often it makes the output worse.
At GTC, David Loker, VP of AI at CodeRabbit, said that their code review system works not because a giant model sees the whole codebase, but because a smaller model spends most of its effort pulling the right context: cross-repo history, old pull requests, coding rules, team conventions. As he put it, too much context kills quality. The problem is not giving the model more. The problem is giving it what matters.
Second, good systems make answers traceable. Skilled people do not trust polished output on its own. They want to know where it came from, what evidence it used, and what assumptions it made. In high-stakes domains, that is crucial. NVIDIA’s chip-design story made this very clear: their system only started working once engineers could inspect and verify the answer instead of just receiving it.
Third, the best systems keep domain experts involved. The people closest to the work usually know which documents matter, which knowledge is outdated, and which edge cases will break the system. If they are left out, the whole thing gets brittle fast. Engineers cannot be passive users; they have to shape the knowledge layer itself. Their taste is still what decides the outcome.
Fourth, strong systems capture feedback in a structured way. That is the difference between a useful tool and a system that improves over time. If the user can only say “wrong,” not much happens. If they can explain what was wrong, what was missing, and how it should have worked, that feedback can improve retrieval, policy, or training. In that sense, this is a very old idea. Norbert Wiener built cybernetics around feedback, control, and communication. Companies now need to do the same. They should be teaching the system all the time.
This is the missing middle in a lot of enterprise AI discussion. People talk about frontier models on one side and end-user applications on the other. But a lot of the real value sits in between: context selection, verification, orchestration, and feedback loops. The Self-Coding Agents session at GTC pointed to the same shift in broader terms. We have moved from prompt engineering to context engineering, and now increasingly toward harness engineering, meaning the whole structure around the model that determines whether it can actually work in the real world.
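To make that middle layer a little more concrete, here is a minimal Python sketch of three of the pieces described above: narrowing context, keeping answers traceable to sources, and capturing feedback in a structured form. Everything in it is an illustrative assumption, not any speaker's or vendor's actual system: the names (Document, Answer, Feedback, select_context, answer_with_sources), the toy documents, and the word-overlap scoring are all hypothetical, and the model call is replaced by a placeholder that simply echoes the selected passages.

```python
from __future__ import annotations

from dataclasses import dataclass


@dataclass
class Document:
    doc_id: str
    text: str


@dataclass
class Answer:
    text: str
    source_ids: list[str]  # lets a reviewer trace the answer back to its evidence


@dataclass
class Feedback:
    # A structured correction: more useful than a bare "wrong".
    question: str
    what_was_wrong: str
    what_was_missing: str
    expected_behavior: str


def select_context(question: str, docs: list[Document], k: int = 2) -> list[Document]:
    """Keep at most k documents, ranked by how many words they share with the question.

    Word overlap is a deliberately crude stand-in for real retrieval.
    """
    q_words = set(question.lower().split())
    scored = [(len(q_words & set(d.text.lower().split())), d) for d in docs]
    relevant = [(score, d) for score, d in scored if score > 0]
    relevant.sort(key=lambda pair: pair[0], reverse=True)
    return [d for _, d in relevant[:k]]


def answer_with_sources(question: str, docs: list[Document], k: int = 2) -> Answer:
    """Build an answer from the narrowed context and record which documents were used."""
    context = select_context(question, docs, k)
    # Placeholder for a model call (assumption): echo the selected passages.
    text = " ".join(d.text for d in context) or "No supporting document found."
    return Answer(text=text, source_ids=[d.doc_id for d in context])


# Hypothetical knowledge base curated by domain experts.
docs = [
    Document("timing-guide", "clock timing rules for the memory block"),
    Document("power-notes", "power budget limits for the memory block"),
    Document("lunch-menu", "cafeteria menu for friday"),
]

answer = answer_with_sources("what are the power budget limits", docs, k=1)

# An expert explains what was wrong instead of just rejecting the answer.
feedback_log: list[Feedback] = []
feedback_log.append(
    Feedback(
        question="what are the power budget limits",
        what_was_wrong="limits quoted without the block revision",
        what_was_missing="reference to the latest revision",
        expected_behavior="cite the revision alongside the limit",
    )
)
```

In this sketch the answer carries the identifiers of the documents it was built from, so a reviewer can check the evidence, and each feedback record says what was wrong, what was missing, and what should have happened. How such records would feed back into retrieval, policy, or training is left open here.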
Why the middle layer in AI workflows can't be skipped
A lot of companies still want to jump straight to autonomy. They go from scattered personal use to big plans for multi-agent systems running complex workflows with very little supervision. The ambition, pumped up by hype, is understandable, but most of these companies do not fully realize that the middle layer we just described cannot be skipped.
And the real sequence is simpler than many maturity frameworks make it sound.
First, the system has to be trusted.
Then it can be used in real workflows.
Only then can it take on more responsibility.
Many companies are particularly inclined to skip the first part because it is slow, operationally messy, and not very demo-friendly. But without it, there is no real flywheel. There are only pilots, point solutions, and decks full of future plans.
Shraddha Sridhar suggested three questions that are more useful than most maturity charts:
How are you building trust with your users?
How are you preparing for feedback, especially expert feedback that still lives in people’s heads?
And how are you capturing and using that feedback?
If a company cannot answer those questions, it does not have a path to autonomy. Not yet.
Why AI workflow redesign is a management problem
This is also why the current AI transition reaches far beyond engineering and into management itself.
As systems take on more routine execution, the scarce asset shifts. The bottleneck moves away from raw task completion and toward specification, boundary-setting, exception handling, evaluation, and taste. The ability to define what should happen becomes more valuable when the ability to execute becomes more abundant.
That has consequences for how firms organize power.
People who are closest to the actual business problem often become more strategically important because they can specify what matters, identify edge cases, and evaluate whether the system is solving the right problem. Product judgment, operational judgment, and domain judgment start to matter more, not less. Engineering remains essential, but its role increasingly includes building the substrate through which those judgments can be operationalized safely and repeatedly.
This is where vague thinking becomes expensive. Human organizations have historically tolerated ambiguity because humans are good at filling gaps. A manager gives rough direction, a skilled employee interprets intent, and the work still gets done. Machine systems are much less forgiving. Ambiguity does not disappear inside them. It becomes failure, hallucination, and drift, and the workflow remains brittle.
So the organizational challenge is not merely to deploy AI. It is to become more explicit about how the company works, what quality means, where authority lives, which exceptions matter, and how correction flows back into the system.
That is a painful exercise because it reveals how much of the firm was never formalized in the first place.
GTC had a few moments that hinted at this broader recomposition of roles.
“The things we'll be missing from agents for a while is kind of real agency of deciding what should get built... Deciding what are the right things to build that matter in the world are the most important pieces.”
That observation travels well beyond software. As execution becomes cheaper, deciding what matters becomes more important. Mira Murati made the same economic point from another angle: the cost per unit of intelligence is falling, but the demand for intelligence keeps rising. Once intelligence becomes more available, the bottleneck moves upstream toward specification. Cheap intelligence does not eliminate management problems. It exposes them.
Conclusion: The real divide in enterprise AI
For a while, the main divide in AI looked technological: who had the best models, the most compute, the strongest research team, or the best data.
That still matters. But another divide is coming into view, and for many companies it may matter more. The real divide is between companies that can make themselves legible to machines and companies that cannot.
The old frame, where AI is treated as a tool for making existing work faster, is starting to run out of road. It follows the same logic as the factory that installs a new motor in the basement and leaves the rest of the system untouched. That approach can produce local gains, but it leaves much of the larger opportunity untouched as well.
A more useful frame treats intelligence as infrastructure. Once you see it that way, the question changes. The question is no longer only where AI can help. The harder question is what shape the organization has to take in order to use it well.
The companies that move ahead will do three things:
Turn tacit knowledge into structured context.
They will make more of the company legible by turning scattered know-how, exceptions, and institutional memory into context a system can actually use.
Turn that context into bounded, verifiable action.
They will build workflows where outputs can be checked, traced, and trusted enough to support real decisions and real work.
Turn human correction into a feedback loop.
They will capture expert feedback in a form the system can learn from, so everyday use improves the system over time.
The next gains will belong to the companies that learn how to describe, structure, and redesign their own work before their competitors do.