This article is part of our series The Org Age of AI and is co-written by Will Schenk (TheFocus.AI) and Ksenia Se. You can read the first episode, “AI Feels Powerful. So Why Is the ROI Still Missing?”, in The Unsexy Truth of AI Adoption #1.

Before we jump into the next episode, we’d like you to check out this interview with Sanja Fidler. Sanja works at the frontier of autonomous vehicles and robotics, and is one of the leading voices in spatial intelligence and physical AI. What is brewing in her lab now is likely a preview of what the rest of us will be using in the future. This episode is worth watching for anyone who wonders what kinds of architectural and multimodal breakthroughs we still need before robotics and autonomous vehicles become truly usable.

Episode #2 of The Org Age of AI: The Unsexy Truth of AI Adoption

Spend a few days in San Francisco right now and you will start believing that AI has already taken over everything. The city is literally blanketed with ads about AI and everything even remotely related to AI. It is in every café conversation, every event, every product pitch, every hiring deck. It creates a very powerful illusion: that the rest of the country is moving at the same speed, and that the world is basically one clean deployment plan away from autonomous organizations.

It is not.

Most companies across the US are still at a much earlier stage. For many of them, AI still means ChatGPT for writing, Copilot for code, meeting summaries, maybe a small internal experiment, and a lot of vague pressure to “do something with AI.” And we are not even touching the rest of the world here, where the picture is even more uneven.

Reading Twitter might give you the anxiety that you are terribly behind and everyone else is already operating in some futuristic AI economy where machines do the work and humans relax. They do not. You still have to work, and in many cases, you have to work more. So today, we are going to talk about what actually has to happen inside a company for AI to become operational: what a company needs to know about its own work, how little of that knowledge is actually usable by machines, and what it takes to change that. With real use cases. We will also discuss how AI is compressing the distance between enterprises and small companies.

What’s in today’s episode?

  • What does the path to AI maturity actually look like?

  • What most companies actually want

  • Why the middle cannot be skipped

  • L1 to L2: Making the organization legible to itself (hardest transition)

  • L2 to L3: Trusting your own data (most underestimated transition)

  • L3 to L4: The system starts acting on what it sees. Human’s role changes (most overhyped transition)

  • L4 to L5: The system changes how the organization works (most profound transition)

  • What this is actually about

  • Large enterprises and very small teams – is the approach different?

AI Maturity Levels: From Tribal Knowledge to Self-Improving Systems

There is no shortage of maturity models. Gartner has one. McKinsey has one. Deloitte has one. Most of them place organizations on a neat timeline: you are here, then you move there.

But AI maturity is not just a sequence of stages. It is a stack of dependencies. Each layer rests on the one below it. You cannot build the fourth layer if the second one is unstable. You can pretend otherwise, of course. Plenty of companies do. That is how you get pilots that look impressive in a demo and then disappear within six months.

What we want to describe here is not a timeline but a ladder of organizational capability. The lower rungs are familiar: tacit knowledge, scattered experimentation, isolated productivity wins. The upper rungs get all the hype: adaptive systems, agents, self-improving workflows. The real pain sits in the middle, where a company has to make itself explicit enough to be understood by a machine, trustworthy enough to be acted on, and structured enough for judgment to move to the right place.

That middle is where deployments either become real or die.

What most companies actually want

Here is the pattern: a product leader watches a demo of an agent completing a multi-step workflow. Maybe it reads documents, synthesizes findings, and drafts a report with a thoughtful recommendation. Maybe it resolves support tickets end to end: taking a ticket, cross-referencing it with what's in the database, doing a coding task, pushing a fix to production, and communicating with the user. The demo is real. The capability exists.

The immediate response: we need this.

Then the company looks inward and the picture is different. Processes run on habit and improvisation. Critical knowledge lives in two or three people's heads. Like an informal fee in a questionable economy, that knowledge is "corrupted" because it is under the table: not above board and, above all, not visible. It is hoarded for protective reasons. Data systems use different naming conventions because different teams built them at different times for different reasons. The org chart says one thing about how decisions get made; reality says another.

Most companies want to go from scattered ChatGPT use directly to agents and autonomy. The middle layers – the ones where deployments either become real or die – get skipped in the planning.  They are the critical ones to build upon.

Why the middle cannot be skipped

Nobody builds a dramatic keynote around normalizing cost codes. They should though! Because that is exactly where the real drama lives.

AI maturity is cumulative. Each level gives the organization a new capability, and that capability reveals something about the organization that was previously invisible. The revelation forces a reassessment. Then the next level becomes possible.

This is not a linear climb. Different departments sit at different levels. Engineering might be at L3 while finance is at L0. Marketing moves fast with content generation while compliance lags a full level behind. The unevenness is normal. Governance almost always trails deployment.

The question is not "what level is our company?" It is "where are the structural gaps, and which ones are blocking us?"

Here is the stack:

Level | Name | The organization...
L0 | Tribal | Runs on tacit knowledge and habit
L1 | Experimenting | Uses AI individually, but nothing compounds
L2 | Legible | Can describe its work in a form a machine can act on
L3 | Knowledgeable | Knows what it knows, and can prove it
L4 | Adaptive | Acts on signals before being asked
L5 | Self-Improving | Learns from its own operation

The levels themselves are descriptive. The real story lives in the transitions.

L1 to L2: Making the organization legible to itself

This is the hardest transition in the entire framework.

Companies at L1 often look more advanced than they are. Someone uses ChatGPT for writing. Someone else uses Claude or Copilot for code. A third person built a clever internal assistant that works well enough to impress leadership and badly enough that nobody wants to maintain it. Some of this work is genuinely useful. The problem is that it does not compound. It remains personal, brittle, and undocumented. When the employee leaves, the workflow often leaves with them.

The move to L2 is not about choosing better tools. It is about the organization learning to describe its own work. What are the actual rules for processing an invoice? Not what the policy document says – what do people actually do? Which naming conventions does each supplier use? When someone uploads a file for review, what is the real approval chain – not the org chart version, the one that actually happens?

We worked with a bookkeeping company focused on food service that processes dozens of invoices weekly for different clients from different suppliers. They wanted AI to automate the data entry. The inputs were both PDFs and cell-phone photos, with handwritten notes on the invoices where items had been accepted or changed. As an OCR exercise, this was a straightforward enough technical challenge, and the new models are up to the task.

The hard part was everything else. Some suppliers put fuel service fees into soft costs, while others put bottle deposits there. How are these factored into the unit price? How do you handle weight-based versus unit-based pricing? Which suppliers issue updates under a new invoice number that invalidates the old one, which ones reissue changes under the same invoice number, and in that case how do you determine which version is current? It took six weeks of work before any AI could happen, because the business process had never been made explicit. The humans had been absorbing ambiguity that a machine could not.

And once the system forced clarity, we actually started seeing fewer "exceptions" coming through from the suppliers. As the light pushed out the darkness, fewer games were being played on the other side, such as slipping in expenses that were previously overlooked.
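To make "explicit business process" concrete, here is a minimal sketch of what one supplier rule might look like once it is written down. Everything in it is an assumption for illustration: the `SupplierRule` fields, the `unit_price` helper, and the idea that a rule reduces to two switches. Real rules come from interviewing the people who do the work and will have more cases.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class SupplierRule:
    # Hypothetical fields, not taken from any real system.
    pricing_basis: str              # "unit" or "weight"
    soft_costs_in_unit_price: bool  # fold fees or deposits into the price?


def unit_price(rule, line_total, soft_costs, quantity, weight_kg=None):
    """Compute price per unit (or per kg) under an explicit supplier rule."""
    total = line_total + soft_costs if rule.soft_costs_in_unit_price else line_total
    if rule.pricing_basis == "weight":
        if not weight_kg:
            # Refuse to guess: weight-based pricing without a weight is an exception.
            raise ValueError("weight-based pricing needs a weight")
        return total / weight_kg
    return total / quantity
```

The code itself is trivial. The six weeks go into deciding, supplier by supplier, what the two fields should say.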

At a construction firm, we inherited a data sync system that had been built over a year by a single developer – 224 commits of working logic. When that developer became unavailable, the entire system broke. Cost code mappings lived in one person's head. "Plumbing" had been renamed to "15.1 PLUMBING" in the accounting system, and only one team member knew the translation.  

Once you normalize that data, you can start asking much more useful questions. Can the system detect when something is over budget? Can it flag when burn rates look wrong for a specific cost code? Demolition, for example, should burn down mostly at the beginning of a project, while finishing work should ramp up toward the end. But in this case, large discrepancies kept showing up, and it turned out project managers were playing all sorts of games by moving money from one bucket to another. They were not changing the final number. They were managing client expectations by delaying bad news until some other part of the project was going well and everyone was in a better mood to hear it. None of that logic is visible to the machine.
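A burn-rate check of this kind can be sketched in a few lines once cost codes are normalized. The curve shapes, the profile names, and the 15-point tolerance below are all assumptions chosen for illustration, not values from the project described above.

```python
# Hypothetical spend profiles: expected cumulative share of a cost code's
# budget as a function of project progress (both in the range 0..1).
PROFILES = {
    "front_loaded": lambda p: 1 - (1 - p) ** 2,  # e.g. demolition
    "back_loaded": lambda p: p ** 2,             # e.g. finishing work
    "linear": lambda p: p,
}


def burn_rate_flag(profile, progress, spent, budget, tolerance=0.15):
    """Return True when actual spend deviates from the expected curve."""
    expected_share = PROFILES[profile](progress)
    actual_share = spent / budget
    return abs(actual_share - expected_share) > tolerance
```

For example, `burn_rate_flag("front_loaded", 0.5, 20_000, 100_000)` flags a demolition budget that is only 20% spent halfway through the project. A flag like this only says that something looks unusual. It cannot say why, which is exactly the gap the paragraph above describes.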

This is where companies discover something uncomfortable: a great deal of institutional knowledge has never been written down, and some of that is not accidental. When the rules live in someone's head, that person is indispensable. When the process is undocumented, nobody can question whether it makes sense. Making work legible means making it inspectable, and that is a form of vulnerability – for humans.

Recording meetings so they become searchable records. Documenting exception rules. Cleaning data into structured formats. Defining what "good" looks like so you can evaluate whether a machine did it right. This is the work of L2. It does not look like AI. The output is a spreadsheet of mappings and a document that explains what terms mean. Writing something unspoken down can uncover uncomfortable truths.  But without it, everything above collapses.

What this transition is misunderstood as: "We need an AI strategy" or "We need to pick the right tools." The blocker is that the company cannot describe its own workflows. This is where pilot purgatory lives – companies start pilot after pilot, each works in isolation, none connect, nothing accumulates.

What makes it worth doing anyway: everything you build to make the organization legible to machines also makes it better for humans. Onboarding gets faster. Bus-factor risk drops. The organization becomes more resilient. The work is not overhead on the way to AI. It is good organizational hygiene that AI forces you to finally do. For teams that want to accelerate this process with structured, hands-on guidance, Claude workshops by DataNorth are designed exactly for this transition – from scattered individual use to repeatable workflows.

L2 to L3: Trusting your own data

This is the most underestimated transition. This is where most companies discover that connecting data is the easy part. Trusting it is harder.

At L2, the organization has described its workflows and connected its tools. AI is plugged into real systems – CRMs, accounting software, project management platforms. Data flows between systems instead of being manually copied.

The move to L3 is connecting AI to your proprietary data – the databases, documents, and records that make your business unique. Building the interfaces (API integrations, sync pipelines, MCP servers) so that AI can work with your actual information instead of generic knowledge.

If that sounds like an engineering problem to you, you are wrong. It is a trust problem.

Where did the data actually come from? When things start to get plugged together, the internal data is messier than anyone admitted. That dashboard that everyone is keying off of? It turns out it only reports domestic sales, since overseas sales come in from a different system, and now all of a sudden the numbers are different. Maybe more accurate, but jarring for everyone.

A media analytics firm knew that its three platforms used different identifiers for the same people, but had not realized that this led to double counting. The same public figure appeared under different names and IDs across systems. Nobody had built a canonical mapping because humans just pattern-matched in their heads.

Connecting the data forced the company to confront how fragmented its own knowledge actually was – and how much of that fragmentation reflected organizational boundaries rather than logical ones.
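A canonical mapping can start as something very small. The sketch below assumes a hand-maintained alias table keyed by platform and platform-specific ID; the platform names, IDs, and the `count_mentions` helper are hypothetical, and the firm's actual fix may have looked quite different.

```python
from collections import Counter

# Hypothetical alias table: (platform, platform-specific id) -> canonical id.
ALIASES = {
    ("platform_a", "jdoe"): "person:001",
    ("platform_b", "Jane Doe"): "person:001",
    ("platform_c", "88213"): "person:001",
    ("platform_a", "rroe"): "person:002",
}


def count_mentions(records):
    """Count mentions per canonical person; collect unmapped ids for review."""
    counts, unmapped = Counter(), []
    for platform, local_id, mentions in records:
        canonical = ALIASES.get((platform, local_id))
        if canonical is None:
            # Do not guess at identity; a human decides and extends the table.
            unmapped.append((platform, local_id))
            continue
        counts[canonical] += mentions
    return counts, unmapped
```

The design choice worth noting is the `unmapped` list: unknown identifiers are surfaced for a person to resolve rather than silently counted as new people.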

At a financial services firm, we validated an AI invoice parser against 88 historical invoices that had been manually entered. The system achieved 94.5% accuracy. But the interesting finding was not the number. Many "discrepancies" turned out to be errors in the manually-entered data, not in the AI's parsing. The machine was catching mistakes the humans had made. That inverted the trust question: the system was more reliable than the process it was replacing.
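A validation pass like this can be sketched as a field-by-field comparison. The field names and the `compare_entries` helper below are assumptions, and this is not necessarily how the accuracy figure above was calculated. The point of the sketch is that a mismatch is recorded as a discrepancy for review, not automatically as a parser error.

```python
def compare_entries(parsed, manual, fields=("invoice_no", "total", "date")):
    """Compare parser output with manually entered records, field by field."""
    matches, discrepancies = 0, []
    for invoice_id, manual_row in manual.items():
        parsed_row = parsed.get(invoice_id, {})
        for field in fields:
            if parsed_row.get(field) == manual_row.get(field):
                matches += 1
            else:
                # A mismatch is not yet an AI error: either side may be wrong.
                discrepancies.append(
                    (invoice_id, field, parsed_row.get(field), manual_row.get(field))
                )
    checked = len(manual) * len(fields)
    agreement = matches / checked if checked else 0.0
    return agreement, discrepancies
```

Each tuple in `discrepancies` carries both values side by side, so a reviewer can open the original invoice and decide which one is right.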

Even so, the team would not have trusted the system without being able to check its work line by line. Trust at L3 is not about accuracy percentages. It is about verifiability. We discussed this in our guide to AI workflow redesign and enterprise ROI. For NVIDIA's chip design team, the fine-tuned model worked, but engineers refused to use it until every answer was traceable to a source document. The system only gained adoption when it became auditable.

What this transition is misunderstood as: "We need RAG" or "We need more data." The blocker is not connecting the data. It is that nobody will act on the answers until they can verify them. Source attribution, audit trails, and the ability to inspect the system's reasoning step by step – this is the infrastructure of trust, and most teams underestimate how much of it they need.
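One way to picture that infrastructure is a gate that refuses to surface any answer without a source and logs every decision. This is a minimal sketch under stated assumptions: the `Answer` type, the `release` function, and the log format are hypothetical, and a production system would also need to check that the cited source actually supports the answer.

```python
from dataclasses import dataclass, field


@dataclass
class Answer:
    text: str
    sources: list = field(default_factory=list)  # e.g. document ids or file paths


def release(answer, audit_log):
    """Only surface answers that cite at least one source; log every decision."""
    if not answer.sources:
        audit_log.append({"status": "withheld", "text": answer.text})
        return "No sourced answer available."
    audit_log.append(
        {"status": "released", "text": answer.text, "sources": list(answer.sources)}
    )
    cited = ", ".join(answer.sources)
    return f"{answer.text} [sources: {cited}]"
```

The audit log is what turns "the system said so" into something a skeptical engineer can inspect after the fact.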

What makes it worth doing: institutional knowledge becomes queryable. New hires can ask questions that previously required finding the right senior person. Questions that took 45 minutes of digging through files get answered in 5 seconds. The organization starts to know what it knows.

L3 to L4: When AI Systems Start Acting Without Being Asked

This is the most overhyped transition.

At L3, the organization's data is connected and queryable. People ask questions and get verified, sourced answers. The system is trusted within defined boundaries.

The move to L4 is the system acting on incoming signals without being asked. It watches logs, monitors events, processes incoming requests, and either handles them or surfaces them with context.

At Tezlab, a companion app that tracks vehicle data, we built a support system that pulls down incoming emails, queries the production database and the codebase, and diagnoses the issue before a support engineer looks at it. The human still decides what to do. But by the time they see the ticket, the system has already identified the likely cause, pulled the relevant account data, and checked whether the issue matches a known pattern. This is not a chatbot. It is an investigative layer that runs ahead of the human.

It identified everything from transient upstream API issues, to data load errors, to the customer having a different expectation of the feature.

That same system was later extended to monitor infrastructure logs. It watches Kubernetes deployments, identifies resource inefficiencies, and proposes optimizations. Hosting costs dropped 20%. The system went from reacting to support tickets to proactively identifying operational improvements nobody had asked about.

The reason this transition is overhyped is that everyone describes it as "deploying agents," which makes it sound like a technology purchase. Again – the actual difficulty is organizational.

When a system surfaces insights proactively, the people receiving those insights need the authority to act on them. A support engineer who can now see the root cause of an issue – diagnosed by the system from production logs and code – is in a position to make what would previously have been considered an engineering change. The customer service team, equipped with AI that understands the database, starts resolving issues that used to require a developer. That is a power shift. The org chart has to accommodate it, and org charts do not change easily.

What this transition is misunderstood as: "We need agents." Many firms reaching this point still lack stable workflows, exception logic, and reliable evaluation. They want autonomous execution on top of a foundation that cannot support it. For that to work, they have to “clean the mess” and take the L1 to L2 to L3 route.

What makes it worth doing: the organization becomes faster and fairer. People closest to the problem – support engineers, operations coordinators, project managers – get direct access to the information and tools they need. The bottleneck shifts from information access to judgment, which is where human value actually lives.

L4 to L5: The system changes how the organization works

This is the most profound transition.

At L4, the system is reacting to signals, surfacing insights, and enabling people across the organization to act on information they could not previously access. The question becomes: does the system learn from all of this activity?

The move to L5 is closing the feedback loop. The system does not just execute – it improves based on how humans respond to its outputs. Expert corrections flow back into the system's behavior. Usage patterns inform what the system prioritizes. The organization's collective judgment becomes encoded in the system over time.

We built a system that monitors project activity – code commits, ticket velocity, scope changes – and compares it against the original plan. When it detects a divergence, it reports it AND proposes adjustments: revised timelines, updated job descriptions for roles that need to be filled, changes to the hiring pipeline based on what the project actually needs versus what was originally scoped. The system is making recommendations about organizational design based on what it observes about how work is actually happening.

This is where AI becomes a true layer of organizational design.

The trust question here is the deepest one. The organization has to decide whether it is willing to let a system influence how it operates. Can it actually change workflows, suggest structural adjustments, redistribute work? Who decides what the system is allowed to learn? What happens when its recommendation conflicts with how things have always been done? What happens when the system is right and the established practice is wrong?

These are management questions. They are not engineering questions. And they are the reason L5 remains rare – with such a powerful technology, most organizations are still not prepared to cede that kind of influence to a system they built.

What this transition is misunderstood as: "We need autonomous agents." The real blocker here is human: expert correction stays in people's heads instead of flowing back into the system. The support engineer fixes the issue but the system does not learn from the fix. The analyst adjusts the recommendation but the adjustment disappears. Closing that loop is the work.
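In its simplest possible form, closing the loop means that a correction is stored somewhere the system will see it next time. The `CorrectionStore` class below is a hypothetical sketch of that idea only. Real systems would more likely feed corrections into prompts, evaluation sets, or retraining rather than a lookup table.

```python
class CorrectionStore:
    """Keeps expert corrections so later suggestions can reuse them."""

    def __init__(self):
        self._corrections = {}  # case key -> corrected output

    def record(self, case_key, suggested, corrected):
        # Only store real changes; an accepted suggestion needs no override.
        if corrected != suggested:
            self._corrections[case_key] = corrected

    def suggest(self, case_key, model_output):
        # Prefer what an expert decided last time for the same kind of case.
        return self._corrections.get(case_key, model_output)
```

Usage is two calls: `record(...)` when the support engineer or analyst changes the system's output, and `suggest(...)` the next time the same kind of case comes in. The hard part is organizational: getting the correction captured at all.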

What makes it worth doing: competitive advantage compounds. Every interaction makes the system slightly better. Institutional knowledge stops being fragile – it is captured, structured, and continuously refined. The organization's intelligence becomes durable rather than dependent on who happens to be in the room.

Why AI Adoption Is an Organizational Problem, Not a Technical One

There is a thread running through all four transitions that we want to name directly.

The work required to make an organization legible to machines is the same work that tends to make it better for the humans inside it. Searchable records help the model and the employee who missed the meeting. Source attribution helps the AI system and the analyst who wants to inspect the reasoning. Better mappings, cleaner data, clearer exception rules, and more distributed access to context improve machine performance and reduce human friction at the same time.

That is one reason the Org Age of AI matters. We are going through institutional redesign.

For years, companies could postpone this work because the pressure was not high enough. AI raises the pressure. It forces organizations to become more explicit about how they operate. Some of the resistance to enterprise AI is really resistance to that exposure. Once the process becomes visible, it can be challenged. Once the data becomes comparable, it can reveal errors. Once decisions become traceable, power has to justify itself a little more clearly.

That is why this ladder is organizational before it is technical. The model matters. The infrastructure matters. The tools matter. The harness matters. But the real question is whether the institution is willing to make itself knowable enough for machines to participate in the work in a serious way.

Large enterprises and very small teams – is the approach different?

One observation about where this leaves different kinds of companies.

Large enterprises face these transitions as organizational surgery. Each level requires exposing something that was previously hidden, redistributing something that was previously hoarded, or formalizing something that was previously improvised. The transitions are hard because the organization has mass. It is also, quite often, a mess. The potential here is enormous – thousands of people operating with better information, clearer processes, and more distributed authority.

Very small teams face a different version of the same problem. A five-person company does not need to document tribal knowledge because there is no tribe. It does not need to redistribute authority because everyone already has it. Small teams can reach L3 or L4 quickly because they do not carry the organizational debt. But as they grow, they will hit the same walls unless they build the legibility infrastructure early.

AI is compressing the distance between these two. It is forcing large organizations to become more legible and adaptive. It is giving small organizations the leverage to operate at a scale that used to require hundreds of people. The companies that win will not be the ones that sounded the most advanced in 2026 or spent the most money on billboards in San Francisco. They will be the ones that did the patient, load-bearing work of becoming organizations that AI could actually help run.

We will unpack what real AI adoption looks like for each of these types of institutions in separate episodes in this series. Stay tuned.

If you want us to evaluate what step of the ladder you’re on, and tell you honestly what is missing before AI becomes operational in your company →
