The quarterly AI review slide deck shows green across every dimension. Operations, data, governance, enablement: all four scores sit within five points of each other, all in the upper bands. And yet the service desk has not promoted a single AI use case into production in the last eighteen months.
Nothing in the room asks for attention. The dashboards are clean. The score report is balanced. The steering meeting is polite. A data-quality concern surfaces, gets nodded at, and is deferred to the next cycle. The knowledge-base refresh has been "in progress" for three quarters; nobody in the room is sure who currently owns it. The classifier accuracy number on the slide has not changed in six months, which the team reads as stable. Nobody is panicking. Nobody is shipping either.
The unsettling thing about a programme this calm is how often it accompanies a team that cannot say what is broken.
Two ways of looking at the same scene. Both worth holding at once.
The first lens comes from outside ITSM entirely. In The Five Dysfunctions of a Team, Patrick Lencioni names Absence of Trust as the foundation of a stalled team. He defines trust narrowly and usefully: trust is vulnerability-based — the willingness, inside a working group, to be open about your weaknesses, your mistakes, and the things you do not know. When that willingness is missing, members of the team protect their reputations rather than support the team's work. Nobody asks for help. Nobody admits they were wrong. Nobody says "I don't fully understand this part." The team performs a quiet competence theatre.
Lencioni stacks the dysfunctions as a pyramid. Trust is the base; the four other dysfunctions — fear of conflict, lack of commitment, avoidance of accountability, inattention to results — sit on top of it and depend on it. A team that cannot be vulnerable cannot have honest debate, and so cannot make decisions that stick. The base of the pyramid is where everything else either holds together or quietly comes apart.
Patrick Lencioni, The Five Dysfunctions of a Team (Jossey-Bass, 2002).
The second lens comes from the Matrix42 AI Maturity Assessment, and it points at something else entirely. The assessment names four areas of an AI rollout. Operations is whether AI is helping handle tickets, automate repetitive work, and improve the metrics the service desk reports on. Data is whether AI has access to good ticket and configuration data, and whether that data is structured well enough to support useful answers. Governance is whether the team has guidelines, oversight, and compliance for the AI systems it has deployed. Enablement is whether AI scales — beyond IT, in reusable templates, in patterns other teams can pick up.
These four dimensions describe what the team built. They are not a measure of how the team works together. The maturity assessment and Lencioni's framework are looking at two different things — one names capability, the other names team dynamics — and a stuck rollout is rarely about only one of them.
Both worth examining. Neither one substituting for the other.
Take the 5-minute AI maturity assessment
The Matrix42 AI ITSM Maturity Assessment is a five-minute, sixteen-question self-assessment that measures four areas of an AI rollout — operations, data, governance, and enablement. Each area returns a score from 0 to 100, sorted into one of five labels: Exploring (0–20), Piloting (21–40), Operational (41–60), Optimizing (61–80), or Intelligent (81–100).
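The banding rule above is simple enough to sketch in a few lines of Python. This is a minimal illustration, not the assessment's own scoring code, which this article does not show. The function name `maturity_label` and the choice to accept only whole-number scores are assumptions.

```python
# Inclusive upper bound of each band and its label, as listed in the text.
BANDS = [
    (20, "Exploring"),
    (40, "Piloting"),
    (60, "Operational"),
    (80, "Optimizing"),
    (100, "Intelligent"),
]


def maturity_label(score: int) -> str:
    """Map a whole-number 0-100 area score to its maturity label."""
    if not 0 <= score <= 100:
        raise ValueError(f"score must be between 0 and 100, got {score}")
    for upper, label in BANDS:
        if score <= upper:
            return label
    raise AssertionError("unreachable: BANDS covers 0-100")


print(maturity_label(67))  # Optimizing
```

A score of 60 and a score of 61 land in different bands under this rule, so two areas that are one point apart can still carry different labels.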
Two lenses, not one diagnosis. The maturity profile above is what the assessment measures: what your team built. It is not a measure of how your team works together. Lencioni's framework is the second lens, and we present it in parallel because both are worth examining. We do not claim the scores reveal the dysfunction, and you should not read them that way.
When teams describe a culture in which weaknesses are hidden — the data quality nobody admits is mediocre, the AI pilot whose targets nobody quite committed to, the model nobody is comfortable saying they don't fully understand — we sometimes see a maturity profile that is flat and uniformly decent. All four dimensions tend to land in the 60s, with no dimension visibly weaker than the others.
The flatness itself is what catches the eye. Real capability is rarely uniform. Teams that have built honestly almost always show an uneven profile — one clear strength, one clear weakness, two dimensions in the middle. A profile with all four dimensions within five points of each other, in the mid-60s, is unusual on its own terms. It is worth a second conversation about what the team actually shipped.
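One way to make "within five points of each other" concrete is to compare the highest and lowest area scores. The sketch below is illustrative only: the helper names, the default tolerance of five points, and both sample profiles are assumptions invented for the example, not output from the assessment.

```python
def profile_spread(scores: dict[str, int]) -> int:
    """Gap between the highest and the lowest area score."""
    return max(scores.values()) - min(scores.values())


def is_flat(scores: dict[str, int], tolerance: int = 5) -> bool:
    """True when every area sits within `tolerance` points of every other."""
    return profile_spread(scores) <= tolerance


# Hypothetical profiles, invented for illustration.
flat = {"operations": 66, "data": 64, "governance": 67, "enablement": 63}
uneven = {"operations": 78, "data": 41, "governance": 62, "enablement": 55}

print(is_flat(flat))    # True  (spread of 4)
print(is_flat(uneven))  # False (spread of 37)
```

A `True` result is only a prompt for the second conversation. It says nothing about why the profile is flat.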
A worked example, illustrative rather than empirical. A respondent answers the sixteen questions as follows:
Q1–Q16 responses: 4, 3, 4, 3, 3, 3, 3, 4, 4, 3, 3, 4, 3, 3, 3, 3
| Dimension | Score | Maturity band |
| --- | --- | --- |
| Operations | 67 | Optimizing |
| Data | 66 | Optimizing |
| Governance | 69 | Optimizing |
| Enablement | 63 | Optimizing |
All four dimensions in the Optimizing band. The recommended-action engine has little to surface beyond mild Optimizing-tier nudges — which is the wrong prescription if the underlying issue is something the assessment cannot see.
What this is not. A flat mid-60s profile does not mean your team has Absence of Trust. It is a profile shape we sometimes see in conversations about Absence of Trust. The same shape can come from a young programme that is genuinely balanced and unspectacular, from a mid-sized team that has built carefully across all four dimensions, or from a single respondent who happens to be evenly informed about every part of the programme. Two readings, taken with two different instruments, of the same situation — neither one is the other's evidence.
If your team's profile looks like this, here are five questions worth taking to the next leadership meeting. They are not diagnostic — they are conversations worth having alongside the score.
Q8 follow-up — about data quality. "When was the last data-quality audit, and what did it find?" The Q8 question on the assessment reads "Our data is structured and reliable enough to support effective AI use." If the team rated this 4 but the answer to the audit question is vague or rounds to "we haven't really done one recently," the high rating is worth a second look. Frame this as a conversation, not a verdict. The point is not to catch anyone out; the point is to find out what the rating sits on top of.
Q12 follow-up — about transparency. "Walk me through one AI decision from last week — why did the model classify that ticket the way it did?" Q12 reads "AI decisions and recommendations are transparent." If nobody on the team can answer the walk-through with a specific ticket and a specific reason, the high rating is sitting on weaker evidence than the score implies. A team that has worked through transparency has a story to tell about it; a team that has rated transparency high in the abstract often does not.
Q3 follow-up — about measurable outcomes. "Which service metric has improved, by how much, with what baseline?" Q3 reads "Service metrics have measurably improved since implementing AI." Whether the answer comes with specifics is itself information worth surfacing. A team that can answer "first-call resolution moved from 62% to 71% over the last two quarters, with the baseline measured in this specific way" is in a different place from a team that can only describe the improvement as real but unquantified.
Q16 follow-up — about templates. "Which template did your team adopt from another team this year?" Q16 reads "New AI use cases can be deployed quickly using reusable templates or patterns." A blank answer here matters. Adopting another team's template is a vulnerability move — it requires saying, in public, that another team built something better. Teams that have not done it once in twelve months may be telling you something about how they work, not just about their templates.
Team-health prompt (not pinned to a Q). "When was the last time someone on this team said 'I don't know' or 'I got that wrong' in a working meeting?" If the room cannot easily produce an example, the answer is itself the most useful thing on the table.
The score and the discussion guide are the two things worth holding here. One names what was built. The other names the conversations that build it further.
One Lencioni-literate move. The vulnerability-based-trust intervention Lencioni recommends is the simplest one in his book and one of the hardest to actually run: a Personal Histories or Team Effectiveness Exercise with the AI steering team, in a room, with the leader going first. "I signed off on the last three knowledge-base overhauls and none stuck, and I don't know why" — said by a VP, in front of the team, opens a conversation the data-quality audit cannot. The exercise does not solve the data-quality problem. It changes the kind of room the data-quality conversation can happen in. It is paired work, not substitute work.
One assessment-prompted move. For a flat mid-60s profile, the recommended-action engine surfaces an Optimizing-tier nudge for the data dimension, typically a data-quality audit across the four areas that feed Q5–Q8, with a named steward, a published cadence, and real dates on it. The audit is useful in its own right. It is also likeliest to land if the trust conversation has happened first; in a room where it is not yet safe to say "the data is mediocre and I have known this for a year," another data-quality audit will produce another polite document.
What does an AI ITSM maturity assessment measure? It measures self-reported capability across four areas (operations, data, governance, and enablement) and returns a score from 0 to 100 per area, sorted into five maturity labels (Exploring, Piloting, Operational, Optimizing, Intelligent).
Is a high AI maturity score a good sign? Not always. A flat, balanced profile in the mid-60s is unusual on its own terms; teams that have built honestly almost always show an uneven profile, with a clear strength and a clear weakness. A perfectly even profile is worth a second conversation about what specifically the team has shipped.
How often should we retake the assessment? Once a quarter is the cadence we tend to find useful. The value of a retake is in changing what you bring to it: inviting a peer who would score it more harshly than you, taking it independently, comparing the two reads, and treating the differences as the conversation.
Is my organisation ready for AI in ITSM? Readiness is a conversation, not a number. The assessment names what was built; the discussion guide above names the conversations worth having alongside it. The two together are more useful than either on its own.