We've been working on semantic representation for decades—knowledge graphs, ontologies, semantic layers. Jessica Talisman untangles what they actually are and what AI needs from them.
Really strong framing of the gap between measurement and meaning.
What stood out is the assumption that meaning can be fully defined upfront, first in metrics, now in ontologies. In practice, especially in identity, meaning is ambiguous and context dependent.
Two records sharing an address could be the same person, a household, or bad data. The answer only emerges when you evaluate surrounding signals and how they interact. That is not something a metric or even a fully defined ontology can resolve on its own.
Feels like the next step goes beyond representation. Systems need to learn from context and evaluate competing evidence, not just model it. Curious if you see ontologies evolving in that direction or remaining a foundation for learning systems.
Graph algorithms and reasoners handle the dynamic aspect of ontologies: the reasoning, constraints, and support for auditability. In knowledge engineering, it’s common to have an auditing framework and a human in the loop (HITL): augmentation before automation. Context lives in the relationships between things, the space in between. To capture data where meaning is ambiguous and context dependent, you must first have a knowledge infrastructure in place to support disambiguation, so that you can capture the meaning and the relationships between things, aka context. I write about this extensively on my Substack ;)
Totally agree that a knowledge infrastructure is required as the foundation. Without structure, there is nothing to reason over. Where I think it gets interesting is in heterophilous cases. In identity, for example, strong signals often conflict. Records that should match can look very different, while similar records can be different people. Structure and rules alone do not always resolve that cleanly.
The system has to evaluate competing evidence across relationships and context, not just rely on defined semantics. That is where I see learning systems complementing the ontology. The ontology defines the space, but meaning is resolved within that space through context and evidence.
I have been writing about a similar idea from the identity side, how resolution emerges from neighborhoods rather than individual records.
Really like how you framed context as living in the relationships.
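The shared-address case from this exchange can be sketched as a small evidence-weighing routine. This is a minimal illustration only: the `Record` fields, the `WEIGHTS` values, and the threshold in `classify` are hypothetical, chosen for the example, and do not describe any particular entity resolution product. It also leaves out the "bad data" outcome and the neighborhood-level evidence mentioned above.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Record:
    name: str
    address: str
    birth_year: int
    phone: str


# Hypothetical weights: how strongly each signal counts for or against a match.
WEIGHTS = {"address": 1.0, "name": 2.0, "birth_year": 2.0, "phone": 1.5}


def evidence(a: Record, b: Record) -> dict:
    """Score each signal: positive when the two records agree, negative when they conflict."""
    return {
        field: (weight if getattr(a, field) == getattr(b, field) else -weight)
        for field, weight in WEIGHTS.items()
    }


def classify(a: Record, b: Record) -> str:
    """Weigh all signals together instead of trusting a single shared attribute."""
    scores = evidence(a, b)
    total = sum(scores.values())
    if total >= 3.0:  # hypothetical threshold for "enough agreeing evidence"
        return "same person"
    if scores["address"] > 0:  # shared address, but the other signals conflict
        return "same household"
    return "different"


ann = Record("Ann Lee", "1 Main St", 1980, "555-0100")
ann_new_phone = Record("Ann Lee", "1 Main St", 1980, "555-0199")
bob = Record("Bob Lee", "1 Main St", 2010, "555-0100")
stranger = Record("Cy Ng", "9 Elm Rd", 1975, "555-0000")

print(classify(ann, ann_new_phone))
print(classify(ann, bob))
print(classify(ann, stranger))
```

The point of the sketch is that the shared address alone decides nothing: the same address yields a different answer depending on how the surrounding signals agree or conflict.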
I’ll check out your Substack! Identity is a super interesting space.
I did do a project for Senzing that bled into the identity space you might appreciate:
https://jessicatalisman.substack.com/p/the-semantic-infrastructure-opportunity?r=ee6qm&utm_medium=ios
Appreciate that, I will take a look. Senzing does some interesting work in that space.
Identity ends up being a bit of a proving ground for a lot of these ideas. The ambiguity and conflicting signals force you to move beyond simple matching pretty quickly.
Good article. 80% of industrial problems can be solved by the semantic layer itself. With the right domain knowledge and data, a KG might be helpful.
Lovely read. I am currently involved in the energy market and CIM models, and I totally agree with the thoughts shared. I am new to the field, but it is motivating me to read more about ontology!
You know, maplib supports mapping your data according to CIM as an ontology, straight out of the box. ;) https://datatreehouse.github.io/maplib/maplib.html#Model.write_cim_xml
Carnap weighed in on this, along with others, before Moore had his epiphany. https://www.phil.cmu.edu/projects/carnap/editorial/latex_pdf/1950-1.pdf#:~:text=Realists%20give%20an%20affirmative%20answer%2C%20subjective%20idealists,on%20for%20centuries%20without%20ever%20being%20solved.
Semantic layers are also not new, they have existed since the late 80s or even before. What brought them to the forefront is AI. The real question is if AI can do for Ontologies what it did for semantic layers. And also we may consider the possibility of LLMs being good enough with Semantic Layers and not needing Ontologies or Context Graphs (for most domains).
Do you have some sources for people who know nothing formal about ontology, semantics, knowledge representation etc. that can act as a primer for someone trying to understand the fundamentals behind context engineering? Besides your post, it feels like most things I read are attempts at some GUT for context engineering; I want the foundations. Love your posts!
This is a great foundation article: https://jessicatalisman.substack.com/p/the-ontology-pipeline?r=ee6qm
I have a few more on my Substack, where I teach folks how to build a knowledge infrastructure.
You have a typo in your footer “fine me on LinkedIn”. Otherwise, an excellent treatise.
Love this article!
I'd love to hear what you think should be the baseline to create, keep, and organize this 'brain'?
I think .md files are kind of replacing YAML files, but analytics teams found it hard to understand how and where to organize them.
I point towards semantic infrastructures like my Ontology Pipeline framework for organizing. This includes taxonomies, metadata schemas and ontologies.
A knowledge architecture for AI reasoning is what we need :-)
- Captures why processing is allowed, not just what happens
- Formal ontologies with machine reasoning
- AI agents query permissions with cryptographic proof
- Decision traces for every authorization
This is my stealth start-up, designed for the AI era.
Thank you for a well-written article. Semantic layers, as you rightly point out, are primarily for standardizing business term (metric) definitions and generating SQL from those definitions. Semantics go way beyond limited metric definitions, and "semantic layer" is really a misnomer.
What is needed is an Ontology layer - one that can help unify data across domains and help analyze them by reasoning about them (not just simply querying over them as in BI use cases).
Such an ontology layer (we at Aabhra call it the Sarvam ontology - Sarvam in Sanskrit stands for "everything") goes beyond just describing entities, their attributes and properties, and relationships across entities.
From experience and research we know that data across domains have intrinsic genetic properties - we call them DataGenes(TM). These DataGenes(TM) provide deep insights into any data and belong in the ontology layer. AI can be used to infer the genetic properties of any data quickly, as can be done using our Aabhra AI Inference engine.
Now, this wholesome context (entity definitions, attributes & properties, relationships AND intrinsic genetic properties of data) is leveraged dynamically by an LLM to understand complex enterprise-class analytic intent expressed in NL. The Aabhra AI Analytics platform leverages this LLM to execute, at scale, these enterprise-class analyses by reasoning over the data in the context of the ontology layer.
We can actually prevent meaningless operations like adding the salary of an individual to the GDP of a country - something mathematically feasible if currencies are the same, but semantically wrong - by leveraging ontological definitions of data.
Expecting semantic layers to grow into ontology layers is like expecting to start by building a bicycle and adding jet engines to it to make it into a plane. Any claims that they aid AI in reasoning over data may be true in a very myopic context, at best.
Check out a few posts on LinkedIn on what we have done at Aabhra.
https://www.linkedin.com/posts/partha-n-267198_ai-native-semantic-analysis-using-aabhra-activity-7420143394493972480-FYyX?utm_source=share&utm_medium=member_desktop&rcm=ACoAAAAG3G8BnNPV9WWPwn5ZM_UsomqDzdDW54w
https://www.linkedin.com/posts/partha-n-267198_reinventing-analytics-part2-activity-7222661163275202560-1Y5G?utm_source=share&utm_medium=member_desktop&rcm=ACoAAAAG3G8BnNPV9WWPwn5ZM_UsomqDzdDW54w
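The salary-plus-GDP example in the comment above can be sketched as a small semantic guard on addition. This is a minimal sketch under stated assumptions: the `Quantity` class, its `concept` field, and the `add` function are hypothetical names invented for illustration, and a plain string stands in for an ontology class. It is not a description of how Aabhra or any other platform implements this.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Quantity:
    value: float
    currency: str
    concept: str  # hypothetical stand-in for an ontology class, e.g. "Salary"


def add(a: Quantity, b: Quantity) -> Quantity:
    """Add two quantities only when both the unit and the meaning line up."""
    if a.currency != b.currency:
        raise ValueError(f"currency mismatch: {a.currency} vs {b.currency}")
    if a.concept != b.concept:
        # Same currency, so the arithmetic would work, but the result would mean nothing.
        raise TypeError(f"cannot add {a.concept} to {b.concept}")
    return Quantity(a.value + b.value, a.currency, a.concept)


salary_a = Quantity(90_000.0, "USD", "Salary")
salary_b = Quantity(60_000.0, "USD", "Salary")
gdp = Quantity(25_000_000_000_000.0, "USD", "GDP")

print(add(salary_a, salary_b).value)

try:
    add(salary_a, gdp)
except TypeError as err:
    print("rejected:", err)
```

A real ontology would express compatibility through class hierarchies and constraints rather than string equality; the exact-match check here is the simplest possible stand-in.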
Really good and timely article. Lots of confusion in the market currently in terms of what "semantics" actually means and whether or not the "semantic layer" solves AI's context problem - the answer is clearly NO, but lots of companies are only starting to think about this and articles like this are sorely needed in those discussions.
I find it really interesting that the yamls and what have you in the "semantic layer" do in fact contain *some* semantics, in terms of what a field means and how metrics are calculated. The problem is that this semantic information is basically isolated and siloed (often not even reused between two solutions on the same platform). The "semantic layer" meant for BI use needs this information for sure, but it is certainly not where this information should live. We need to find ways to connect the BI semantic layer with the REAL semantic layer of the whole enterprise.
This is where there is so much room for innovation, Juha. How to build bridges?
Great article, Jessica! Thorough review of hot topics! Love your on-point breakdown of semantic layers in regard to real, machine-readable semantics.
I beg to differ a bit on your initial illustration tho. If I'm going to be difficult--reasoning capabilities lie within a software implementation (such as HermiT, Pellet, etc.), and as far as I know, context graphs are merely a vague description of another ontology concept rather than a specific implementation? ;-)
Oh I totally agree that the reasoning happens in reasoners and implementation. I think the conversations about context graphs have lost sight of the very essence of what we are describing. Some folks are describing context graphs as an implementation detail to try and differentiate context graphs from RDF/semantic knowledge graphs. 😊
Yeah, I've noticed. 🙄 Putting makeup on the pig, as we say in Norway.
Lipstick on the pig in the US 😊
This is really interesting and similar to a concept that I’ve been researching - the semantic core - a governed model of organisational intent that occupies a specific and previously empty position between strategic intent and operational execution. It argues that the absence of this artefact is the structural cause of enterprise transformation failure, and that making it explicit changes what is possible for any organisation attempting to evolve. My arguments for this are not related to AI but for better business outcomes. So this dimension is very interesting.
Great read that summarizes the spectrum of semantic layers overall. Getting to a knowledge graph is where the power ultimately lies for the business, since it provides reasoning from the self-service layer. Without an ontology and KG, it’s mostly a headless semantic layer, which is okay to start with but will fall short quickly.
I absolutely agree with this. I wouldn’t call it pushback. I write and speak about this quite a bit. The need for institutional capacity. Orgs are not configured to support ontology and knowledge work.
https://open.substack.com/pub/jessicatalisman/p/from-metadata-to-meaning-the-knowledge?r=ee6qm&utm_campaign=post&utm_medium=web
YAML configs are semantics to many orgs. Thank you for the comments ;)