Semantic layers used to feel like BI hygiene.
Define metrics once. Stop dashboards from disagreeing about "active user." Give analysts a cleaner interface over warehouse tables. Useful work, but still somewhere near the presentation layer.
Agents changed the job.
Once a model starts using data to answer questions, recommend actions, or trigger workflows, the semantic layer is no longer dashboard polish. It is execution infrastructure.
The bug was not just the answer
One failure mode that stuck with me involved reimbursement data.
A reporting agent answered a question without properly accounting for approval status. The answer looked plausible. That was the dangerous part. Because the underlying financial system used double-entry accounting, the mistake caused reimbursement charges to be offset incorrectly.
It was tempting to describe this as the model "missing context."
That is true, but too shallow.
Approval status was not extra context. It was part of the meaning of the transaction. If the system lets an agent reason about reimbursements without making that meaning hard to miss, the semantic model is underbuilt.
The model did not need a nicer prompt. It needed a better interface to truth.
Raw schemas are hostile interfaces
Point a language model at a warehouse schema and it has to guess too much:
- which tables matter
- how entities join
- which timestamp means the business event
- whether a status is terminal or intermediate
- whether a metric is filtered, netted, or adjusted
- which records the user is allowed to see
Humans guess too, but humans often have surrounding knowledge. They know the weird table everyone avoids. They remember the Slack thread where finance changed the definition. They look at a number and think, "that seems off."
Agents do not reliably do that. They produce a plausible answer and keep moving.
That is why semantic layers matter more now. They encode meaning before the model has a chance to improvise.
The model handles intent. The layer handles truth.
The useful framing is translation.
On one side: governed, structured, permissioned data.
On the other: fuzzy natural-language intent from a user or agent.
The semantic layer turns "reimbursement spend last quarter" into the right entities, joins, filters, permissions, and definitions. Not because the model guessed well, but because the system made the correct path available and the incorrect paths harder to take.
This is also why a markdown file full of metric definitions is not enough. It gives the model something to imitate. It does not enforce anything.
Cube makes this point in its writeup on semantic SQL: the guarantees need to live below the model. The model should not be responsible for remembering every join and policy boundary from prose.
The requirements got heavier
When a human is reading a dashboard, the system can sometimes survive a little ambiguity. The human can notice a strange number, ask a follow-up, or ignore the chart.
When an agent consumes the answer and acts on it, the tolerance is lower.
The semantic layer now needs infrastructure properties:
- explicit relationships
- permission enforcement
- metric versioning
- lineage
- operational monitoring
- latency and availability guarantees
- test cases for business definitions
That list is not BI polish. It is runtime behavior.
If agents depend on a semantic layer to make decisions, then the layer belongs in the same mental category as APIs, queues, and databases. It should be versioned, observable, and boring in production.
Reliability work often starts below the model
A lot of AI reliability issues are semantic modeling issues wearing a model-shaped hat.
The agent joined the wrong tables because relationships were implicit. It returned the wrong number because two metrics shared a name. It leaked data because governance lived in application code the model could route around. It was right last week and wrong this week because a definition changed without versioning.
Better prompts help. Better models help. Evals help.
But if the system has not encoded the domain, the model is being asked to reconstruct the business from table names and partial clues.
There is some evidence for this. A paired benchmark, summarized by Cube here, found that adding a semantic layer improved correctness far more than swapping between frontier models in the same tier.
That matches the shape of the problem. You cannot prompt your way out of an ambiguous data model.
Agents made the semantic layer more important by making mistakes more expensive. The old job was to help people read the map. The new job is to stop the system from driving off it.