Skills data is not a taxonomy problem. It is a signal problem.

In April 2025, Microsoft launched its People Skills AI Inference Engine inside Copilot. Josh Bersin described it as something that could “radically change the HR tech landscape.” The logic is clean: self-reported skills are inaccurate, so if you infer skills from CVs, job titles, calendar activity, and work history – at scale, with AI – you get the data you’ve been missing.

You won’t. Not because the inference is bad, but because the question is wrong.


The problem isn’t the quality of the inference. It’s what’s being inferred from.

A taxonomy describes what a job requires. It does not tell you what a person can do. These are different things, and the gap between them is where most skills implementations fail.

BCG’s 2025 research on skills-based organizations found that implementations consistently stall in the translation from taxonomy to actual decision-making. Deloitte found that organizations tend to stop at skills libraries without connecting them to real outcomes – compensation, mobility, project assignment, succession. SHL identified that both self-reported and AI-inferred skills data “lack the rigor needed for precise decision-making.” A Fortune investigation from March 2026 found that 53% of employers who claim to use skills-based practices still lack standardized processes to apply them.

The standard response to these findings is: better data. More inference. Smarter AI. More granular ontologies.

But the failure isn’t that the inference is imprecise. The failure is that the data being inferred from – CVs, job descriptions, stated competencies – is a record of the past, shaped by how people describe themselves and how jobs were written. It is not a signal of current capability.


What a signal looks like

We have been thinking about behavioral signals since before we called them that.

When we built the recommendation engine for Palco Principal in 2007, the question was similar: how do you know what music a person actually wants, as opposed to what they say they like? Genre self-tags were useless. The signal was behavior – what someone listened to voluntarily, how often they returned, what they skipped. Collaborative filtering on actual listening patterns outperformed any taxonomy of stated preferences we could have built.
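The mechanism can be sketched in a toy item-based collaborative filter: similarity is computed from what people actually played, never from genre tags. All names and play counts below are invented for illustration; this is not the Palco Principal engine, just the shape of the idea.

```python
import math

# Toy play counts: what each listener actually played (voluntary behavior),
# not how they tagged themselves. All names and numbers are illustrative.
plays = {
    "ana":   {"fado_track": 12, "indie_track": 3},
    "bruno": {"fado_track": 10, "indie_track": 2, "jazz_track": 7},
    "clara": {"metal_track": 9,  "indie_track": 1},
}

def cosine(a: dict, b: dict) -> float:
    """Cosine similarity over two sparse play-count vectors."""
    shared = set(a) & set(b)
    dot = sum(a[k] * b[k] for k in shared)
    na = math.sqrt(sum(v * v for v in a.values()))
    nb = math.sqrt(sum(v * v for v in b.values()))
    return dot / (na * nb) if na and nb else 0.0

def recommend(user: str) -> list:
    """Rank unplayed tracks, weighted by how similar each neighbor's
    listening behavior is to the user's."""
    scores = {}
    for other, vec in plays.items():
        if other == user:
            continue
        sim = cosine(plays[user], vec)
        for track, count in vec.items():
            if track not in plays[user]:
                scores[track] = scores.get(track, 0.0) + sim * count
    return sorted(scores, key=scores.get, reverse=True)
```

`recommend("ana")` ranks `jazz_track` first: Bruno's listening pattern resembles Ana's, so his behavior carries more weight than Clara's, regardless of how anyone labelled the music.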

The logic applies to skills, and the failure mode is the same. A skills taxonomy is a genre self-tag system. It describes how people and jobs have been labelled, not what people are actually doing or capable of.

The signals that reveal capability are different:

  • Someone who returns voluntarily to a learning module outside a mandated curriculum is showing direction – not just completion.
  • Someone who earns peer recognition from colleagues in a different function is showing cross-domain relevance – not just role performance.
  • Someone who consistently operates above the difficulty floor in their current role is showing headroom – not just compliance.

These are not metrics you infer from a CV. They emerge from what people do when the choice is theirs.
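Concretely, each of the three signals reduces to a query over an event log rather than a lookup in a profile. The schema, field names, and thresholds below are hypothetical, chosen only to make the three bullets above computable:

```python
from collections import Counter

# Hypothetical event log; field names and thresholds are illustrative,
# not a real product schema.
events = [
    {"person": "dana", "type": "learning_return", "module": "sql_basics", "mandated": False},
    {"person": "dana", "type": "learning_return", "module": "sql_basics", "mandated": False},
    {"person": "dana", "type": "peer_recognition", "from_function": "finance", "own_function": "engineering"},
    {"person": "dana", "type": "task_completed", "difficulty": 7, "role_floor": 5},
    {"person": "dana", "type": "task_completed", "difficulty": 8, "role_floor": 5},
]

def signals(person: str) -> dict:
    """Reduce raw events to the three behavioral signals: direction,
    cross-domain relevance, and headroom."""
    mine = [e for e in events if e["person"] == person]
    # Direction: voluntary returns to the same module, outside any mandate.
    returns = Counter(
        e["module"] for e in mine
        if e["type"] == "learning_return" and not e["mandated"]
    )
    # Cross-domain relevance: recognition arriving from another function.
    cross_fn = sum(
        1 for e in mine
        if e["type"] == "peer_recognition"
        and e["from_function"] != e["own_function"]
    )
    # Headroom: how far completed work sits above the role's difficulty floor.
    gaps = [e["difficulty"] - e["role_floor"]
            for e in mine if e["type"] == "task_completed"]
    return {
        "direction": {m: c for m, c in returns.items() if c >= 2},
        "cross_domain": cross_fn,
        "headroom_avg": sum(gaps) / len(gaps) if gaps else 0.0,
    }
```

The point of the sketch is the input, not the arithmetic: none of these fields exist on a CV, so no amount of inference over CVs can populate them.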


What this means in practice

This is not an argument against taxonomies. You still need them – for job architecture, compensation banding, external benchmarking. The Insight222 2025/26 research found that only 9% of organizations integrate people data across functions meaningfully. The gap is not that taxonomy frameworks are wrong – it’s that they’re being asked to answer questions they weren’t built to answer.

If you want to know whether a compensation band is right for a role, use a taxonomy. If you want to know whether someone is ready to lead a project in a domain they haven’t held a title in yet, the taxonomy won’t tell you. The behavioral signal might.

The practical consequence: organizations that want to actually know what their people can do need to create conditions where capability reveals itself through action – and then instrument that, not just classify it. A skills graph that only reflects stated and inferred credentials is a classification system. A system that also captures voluntary engagement, cross-functional peer signal, and performance trajectory is something closer to a capability map.
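The architecture difference is visible in a few lines. In the sketch below, a pure classification system would score readiness only from `stated_skills`; a capability map lets behavioral signal carry weight of its own. The record shape and the weights are assumptions for illustration, not a proposed scoring model:

```python
from dataclasses import dataclass

@dataclass
class Profile:
    """One record per person. All fields and weights are illustrative."""
    stated_skills: set          # taxonomy layer: what the CV / title says
    voluntary_topics: set       # learning modules returned to by choice
    cross_fn_recognitions: int  # peer signal from other functions
    headroom_avg: float         # avg task difficulty above the role floor

def readiness(p: Profile, domain: str) -> float:
    """Score readiness for a domain the person has never held a title in.
    A classification system returns 0 whenever the domain is absent from
    stated_skills; here behavioral signal can carry the answer instead."""
    score = 0.0
    if domain in p.stated_skills:
        score += 1.0                       # the taxonomy still counts...
    if domain in p.voluntary_topics:
        score += 2.0                       # ...but voluntary direction counts more
    score += 0.5 * min(p.cross_fn_recognitions, 4)  # capped peer signal
    score += 0.5 * max(p.headroom_avg, 0.0)
    return score
```

With these weights, someone who has only the credential scores 1.0, while someone with no credential but strong voluntary engagement, cross-functional recognition, and headroom scores 4.0. Whether those weights are right is an empirical question; that the second person is invisible to a credential-only system is the structural one.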

The distinction matters when the decision is real. “Who is ready for this transition?” is a question that a taxonomy answers slowly and badly. It’s a question that behavioral signals – if you’ve been collecting them – can answer with something closer to evidence.


What Microsoft’s launch actually signals

The People Skills AI Inference Engine is the clearest possible statement of the infrastructure optimist position: if we make inference good enough, we solve the problem. It’s a plausible bet when you have the data scale Microsoft has – LinkedIn profiles, Teams activity, calendar patterns, email metadata.

But scale amplifies the instrument, not the question. If the question is “what does this person’s profile say they can do,” better inference gives you a better answer to that question. If the question is “what can this person actually do, and how do we know” – the instrument is still pointed at the wrong data.

We’re not arguing against AI inference as a tool. We’re arguing that the organizations that will actually know what their people can do are the ones that treat behavioral signal as primary data – not as a supplement to the CV.

That’s a different architecture decision. And it’s one the taxonomy debate hasn’t reached yet.