The Institutional AI Diagnostic

Seven yes/no tests. A read on whether your AI program is actually working at the firm level — or just at the individual level.

Ryan Carmichael

Managing Partner, Orienteer AI

Most AI rollouts produce strong individual-productivity metrics and weak firm-level results. The reason is usually that the firm hasn't installed the disciplines that turn personal AI use into institutional capability. Below are seven yes/no tests. Answer honestly. The more “no” answers you have, the more your AI program is producing personal gains that aren't aggregating to anything your CFO can see.

The Seven Tests

Seven disciplines. Each one is a yes/no question about your firm's current state. Each test walks through what “yes” and “no” look like, the symptom when the discipline isn't in place, why it matters, and what to do if you're failing.

Test 1

Ownership

“Can you name, by person, the human owner of every AI system currently running in your business?”

What “yes” looks like: You have a list. Every AI system on it has a name attached. The list is current within the last 90 days. If any of those owners left tomorrow, you'd know what each system does, what data it touches, and what to do with it.
What “no” looks like: The list doesn't exist, or it exists but it's stale, or the owners are listed by team rather than by name, or "the AI team" is the answer for everything.
The symptom when this fails: When someone leaves the company, AI systems they built stop working in ways that take weeks to figure out. Auditors and regulators ask questions that take days to answer. Nobody knows who approved what.
Why it matters: Most AI systems get orphaned by departure — they had a clear owner when they were deployed, then that person left and nobody picked up the responsibility. The system keeps running until it breaks. The cost of running orphaned systems is invisible right up until it isn't.
What to do if you fail: Build the inventory. Assign a named owner to every system. Document what happens when each owner leaves. The initial build takes a few weeks; after that, a review every 90 days keeps the list current and prevents a recurring crisis.
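As a rough illustration of what checking such an inventory could look like, here is a minimal Python sketch. The field names, the example systems, the owner name, and the set of team labels are all hypothetical; the only details taken from the test itself are the named-person requirement and the 90-day currency window.

```python
from dataclasses import dataclass
from datetime import date, timedelta
from typing import Dict, List, Optional

REVIEW_WINDOW = timedelta(days=90)  # "current within the last 90 days"

# Hypothetical labels that name a team rather than a person.
TEAM_LABELS = {"the ai team", "ai team", "it", "data science"}


@dataclass
class AISystem:
    name: str
    owner: Optional[str]   # should be one named person
    data_touched: str      # what data the system reads or writes
    last_reviewed: date    # when the entry was last confirmed


def ownership_gaps(systems: List[AISystem], today: date) -> Dict[str, List[str]]:
    """Return, per system name, the reasons it would fail the ownership test."""
    gaps: Dict[str, List[str]] = {}
    for system in systems:
        problems = []
        if not system.owner or system.owner.strip().lower() in TEAM_LABELS:
            problems.append("no named individual owner")
        if today - system.last_reviewed > REVIEW_WINDOW:
            problems.append("not reviewed in the last 90 days")
        if problems:
            gaps[system.name] = problems
    return gaps


# Made-up inventory for illustration only.
inventory = [
    AISystem("invoice-classifier", "Dana Lee", "supplier invoices", date(2025, 5, 1)),
    AISystem("churn-scorer", "the AI team", "CRM records", date(2025, 5, 15)),
    AISystem("contract-summarizer", None, "client contracts", date(2025, 1, 10)),
]
gaps = ownership_gaps(inventory, today=date(2025, 6, 1))
for name, problems in gaps.items():
    print(name, "->", "; ".join(problems))
```

An empty result from a check like this is one way to back a “yes” on this test with evidence rather than recollection.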

Test 2

Governance

“Could you produce, on demand, an audit trail for any AI-driven decision made in your firm in the last 90 days?”

What “yes” looks like: You have a registry of AI systems, logs of what each one did, records of who approved each deployment, and a process for retiring systems that are no longer needed. A regulator asking "how did this decision get made?" gets a real answer in hours, not weeks.
What “no” looks like: You'd need to email three different teams, dig through Slack history, ask the original builder (if they're still there), and probably reconstruct half of it from memory.
The symptom when this fails: Regulatory exams turn into fire drills. Customer disputes about AI-driven decisions can't be substantiated. Internal audits surface findings that the firm can't explain.
Why it matters: Individual AI use is allowed to be non-deterministic and unaudited because the stakes are personal. Institutional AI doesn't have that luxury. The moment AI is touching customer-facing decisions, financial decisions, or regulated decisions, governance becomes a compliance requirement, not a nice-to-have.
What to do if you fail: Stand up a registry. Wire logging into every production AI system. Define what an audit trail needs to contain. This is the operational discipline that separates a firm using AI from a firm where AI is part of the operating model.
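One possible shape for an audit record, and for the 90-day lookup the test question implies, is sketched below in Python. The field names (`system_id`, `approved_by`, `inputs`, `decision`) and the sample decisions are assumptions made for illustration; what an audit trail actually needs to contain is something each firm has to define for itself.

```python
import json
from datetime import datetime, timedelta, timezone
from typing import Any, Dict, List


def audit_record(system_id: str, approved_by: str, inputs: Dict[str, Any],
                 decision: str, when: datetime) -> str:
    """Serialize one AI-driven decision as a single JSON line."""
    return json.dumps({
        "timestamp": when.isoformat(),
        "system_id": system_id,
        "approved_by": approved_by,  # who approved the deployment
        "inputs": inputs,            # what the system was given
        "decision": decision,        # what it decided
    }, sort_keys=True)


def trail_for(log_lines: List[str], system_id: str, now: datetime,
              window_days: int = 90) -> List[Dict[str, Any]]:
    """Return the records for one system that fall inside the look-back window."""
    cutoff = now - timedelta(days=window_days)
    records = [json.loads(line) for line in log_lines]
    return [
        r for r in records
        if r["system_id"] == system_id
        and datetime.fromisoformat(r["timestamp"]) >= cutoff
    ]


# Made-up log entries for illustration only.
now = datetime(2025, 6, 1, tzinfo=timezone.utc)
log = [
    audit_record("credit-limit-model", "J. Rivera", {"account": "A-17"},
                 "limit raised", now - timedelta(days=10)),
    audit_record("credit-limit-model", "J. Rivera", {"account": "A-02"},
                 "limit unchanged", now - timedelta(days=120)),
    audit_record("claims-triage", "M. Chen", {"claim": "C-88"},
                 "routed to review", now - timedelta(days=5)),
]
recent = trail_for(log, "credit-limit-model", now)
print(len(recent), "decision(s) in the last 90 days")
```

In practice the lines would be written to durable, append-only storage rather than kept in a list; the list here only keeps the sketch self-contained.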

Test 3

Coordination

“Do your AI workflows share context, prompts, and outputs across teams — or is every analyst running their own workflow in isolation?”

What “yes” looks like: When one team figures out a useful prompt or workflow, the next team uses it without re-deriving it. AI outputs flow into other AI systems and into the firm's institutional knowledge. The benefit of AI adoption compounds across the org rather than being captured individually.
What “no” looks like: Every analyst has their own ChatGPT habit, their own prompting style, their own outputs that don't talk to anyone else's. Two teams are doing identical AI work without knowing it. The firm's collective AI knowledge lives in individual Slack DMs and personal histories.
The symptom when this fails: Hidden duplication of effort. Wildly different output quality from team to team. The same problem getting solved a dozen times in parallel. Senior leaders unable to see how AI is actually being used across the firm.
Why it matters: A firm with a thousand individual AI users is not the same as a firm with a thousand-person AI capability. The first is a collection of personal productivity gains. The second is an institutional asset. Coordination is what turns one into the other.
What to do if you fail: Build the shared infrastructure — prompt libraries, output repositories, cross-team standards for how AI gets used. Treat institutional AI knowledge as a first-class asset that the firm owns, not a personal asset that each employee carries.

Test 4

Depth

“Is the AI capability you're paying for specific to your business — or is it generic capability that every competitor in your industry has bought from the same vendors?”

What “yes” looks like: Your AI knows your data, your decision rules, your domain language, your edge cases, your regulators. It performs measurably better on your work than a generic foundation model would. You're using the foundation models AND a domain-specific layer on top.
What “no” looks like: Your AI program is essentially a corporate ChatGPT license plus some training videos. You're using the same models, the same prompts, the same RAG patterns as every other firm in your industry.
The symptom when this fails: AI feels useful but doesn't show up as competitive advantage. Your team can do what your competitors' teams can do, at the same speed, with the same tools. You're paying for capability that's available to everyone.
Why it matters: Generic AI capability is a commodity. Competitive advantage lives in the part that's specific to your firm — your data, your workflows, your domain expertise. The foundation models are the table stakes; the depth layer is where edge accumulates.
What to do if you fail: Pick one workflow that matters to your business and invest in making AI genuinely good at it. Generic capability everywhere; deep capability where it counts. The combination wins.

Test 5

Adoption

“Have senior leaders in your firm actually changed how they work because of AI — or are they still asking analysts to produce the same reports the same way?”

What “yes” looks like: Senior leaders are using AI directly, redesigning their own workflows around it, and pushing their teams to do the same. Middle managers have reorganized their teams' work around what AI now makes possible. Frontline staff trust the outputs and use them in decisions.
What “no” looks like: AI is something the IT team or the analysts are doing. Senior leaders nod approvingly and don't change their own habits. The org chart and decision-making processes look identical to what they were before AI showed up.
The symptom when this fails: Strong tool adoption metrics, weak organizational change. Users are logging in. The firm isn't operating differently.
Why it matters: The technical work of deploying AI is easy compared to the organizational work of getting people to actually change. The firms that produce real firm-level value treat adoption as a deliberate discipline, with named owners, sequenced rollout, training, measurement, and a playbook for senior-leader resistance. The firms that skip that discipline watch their AI investment turn into shelfware.
What to do if you fail: Identify the workflows where AI should fundamentally change how decisions get made. Get senior leadership behind the change-management work directly — not delegated. Sequence the rollout so wins compound. This is consulting work, not software work.

Test 6

Anticipation

“Has any AI system in your firm ever surfaced something important that nobody thought to ask about?”

What “yes” looks like: AI systems are continuously watching operational data, detecting patterns, and surfacing risks or opportunities that wouldn't have come up in any human-asked query. A claims pattern that suggests fraud. A customer trajectory that suggests churn. A covenant breach that's three months out. A bid opportunity nobody knew existed.
What “no” looks like: Every interaction with AI starts with a human prompt. The AI only does what it's asked. The value is capped at the imagination of whoever's prompting.
The symptom when this fails: AI is making humans faster at things they were already doing. AI is not doing anything humans weren't already doing. The financial value caps out at "time saved."
Why it matters: Reactive AI — AI that only acts when prompted — is a productivity tool. Proactive AI — AI that watches, detects, and surfaces unprompted — is an institutional capability. The firms that build the second type will operate on a different cadence than the firms that don't.
What to do if you fail: Identify the data streams that are continuously generated in your business. Identify the questions nobody has time to ask about them. Build the systems that ask those questions automatically. This is where the institutional AI advantage lives.
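To make "systems that ask those questions automatically" concrete, here is a deliberately simple Python sketch: a rule that scans a daily count and flags unusual spikes without anyone prompting it. The rolling z-score rule, the 7-day window, the threshold of 3, and the sample numbers are all assumptions chosen for illustration, not a recommended detection method.

```python
from statistics import mean, stdev
from typing import List


def unprompted_alerts(daily_counts: List[float], window: int = 7,
                      threshold: float = 3.0) -> List[int]:
    """Return the indexes of days that spike well above the preceding window."""
    alerts = []
    for i in range(window, len(daily_counts)):
        history = daily_counts[i - window:i]
        mu, sigma = mean(history), stdev(history)
        # A perfectly flat history (sigma == 0) is skipped in this sketch.
        if sigma > 0 and (daily_counts[i] - mu) / sigma > threshold:
            alerts.append(i)
    return alerts


# Made-up daily claim counts; the spike is on day index 7.
claims_per_day = [10, 12, 11, 9, 10, 11, 10, 30, 11]
alerts = unprompted_alerts(claims_per_day)
print("days to look at:", alerts)
```

A production system would run a check like this on a schedule against live data and route each alert to the named owner of the relevant workflow.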

Test 7

Impact

“Can your CFO see the financial impact of your AI program in current-period reporting, in language that ties to revenue, cost-to-serve, customer retention, or risk?”

What “yes” looks like: AI program reporting shows up in board materials. The metrics are dollars, not hours-saved. The CFO believes the numbers because they trace back to specific business outcomes that can be audited.
What “no” looks like: AI reporting is "productivity gains" and "license utilization" and "user satisfaction." None of those numbers show up in any P&L. The CFO has stopped asking what AI is producing because the answers don't help her do her job.
The symptom when this fails: Budget conversations get harder every year. The board starts asking why AI investment isn't producing measurable results. Eventually someone proposes cutting the AI budget and nobody can articulate the right counter-argument.
Why it matters: Time saved is a vanity metric. Most of it isn't actually saved (people fill the time with other work), and even when it is, it doesn't roll up to anything the CFO can use. Institutional AI has to be measured in CFO language: revenue moved, cost reduced, risk avoided, customers retained. If your AI program isn't producing those numbers, it's not yet an institutional capability — even if every employee is using AI every day.
What to do if you fail: Pick three financial metrics that AI should be moving. Build the measurement infrastructure to show whether it's moving them. Report in CFO language, not in productivity-tool language. Without this, every other discipline in this list eventually gets defunded.

What to do with your score

All 7 yes

Institutional capability

You have an institutional AI capability and you should be writing this guide instead of reading it.

4 to 6 yes

Ahead of most firms

Pick the failing disciplines and prioritize them. The gap between "mostly there" and "fully institutional" is where competitive advantage compounds.

2 or 3 yes

Typical for most firms today

AI is present in your firm, but it's still individual productivity rather than institutional capability. The work ahead is the redesign work — the same kind of organizational redesign every prior corporate technology cycle eventually required. It's not glamorous, but it's the work that turns AI investment into firm-level value.

0 or 1 yes

At the start

Many firms are. The good news is that institutional AI capability can be built deliberately (it's a discipline, not an accident), and the firms that build it first will have a real and durable advantage over those that don't.
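The score bands above can be sketched as a small lookup. This Python version is illustrative only: the function name and the sample answers are hypothetical, and the all-seven case is checked first so that it takes its own band.

```python
from typing import Dict, List, Tuple

DISCIPLINES = ("Ownership", "Governance", "Coordination", "Depth",
               "Adoption", "Anticipation", "Impact")


def read_score(answers: Dict[str, bool]) -> Tuple[int, str, List[str]]:
    """Map yes/no answers to (yes count, band, failing disciplines)."""
    missing = [d for d in DISCIPLINES if d not in answers]
    if missing:
        raise ValueError(f"unanswered tests: {missing}")
    failing = [d for d in DISCIPLINES if not answers[d]]
    yes_count = len(DISCIPLINES) - len(failing)
    if yes_count == 7:
        band = "Institutional capability"
    elif yes_count >= 4:
        band = "Ahead of most firms"
    elif yes_count >= 2:
        band = "Typical for most firms today"
    else:
        band = "At the start"
    return yes_count, band, failing


# Made-up answers for illustration only.
answers = {
    "Ownership": True, "Governance": True, "Coordination": False,
    "Depth": True, "Adoption": False, "Anticipation": False, "Impact": False,
}
yes_count, band, failing = read_score(answers)
print(f"{yes_count} yes: {band}; work on {', '.join(failing)}")
```

The failing list matters more than the count, since it names the disciplines to prioritize next.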

Walk through your answers with us

The diagnostic above is what we use in the discovery phase of every Orienteer engagement. If you want to figure out what the practical next steps look like for your firm, that's what an Adoption Sprint is for.