Why Semantic Interoperability needs AI
In 2025, SEMIC explored how artificial intelligence (AI) techniques could help public sector organisations find, understand, and reuse semantic data models more easily. Previous SEMIC work had already shown the value of adapting Large Language Models (LLMs) to specific domains. Building on this foundation, the new effort examined how far those improvements could go when domain adaptation is combined with knowledge-graph retrieval-augmented generation (GraphRAG). AI can support the general goal of semantic interoperability by making semantic assets more accessible and easier to search, understand, and reuse.
The study looked at how a set of typical questions, similar to those data modellers and ontology engineers might ask, were answered. It evaluated three setups for answering questions about semantic data. Two simpler approaches, a basic LLM and a regular RAG-based LLM, exposed recurring issues such as a lack of specificity or relevance. While these two approaches served primarily as reference points to highlight common limitations, the most valuable insights came from GraphRAG. The GraphRAG approach connected an LLM to a knowledge graph and enabled it to retrieve information through structured queries, which significantly improved the quality of chatbot responses.
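To make the idea of retrieving through structured queries more concrete, here is a minimal sketch in Python. It is not the study's implementation: the toy triples, the `ex:` and `dct:` names, and the helper functions are all hypothetical, and a real setup would send a SPARQL query to a knowledge graph and pass the resulting prompt to an LLM.

```python
from typing import Optional

# Hypothetical toy knowledge graph of (subject, predicate, object) triples.
# All names and values are illustrative assumptions, not data from the study.
Triple = tuple[str, str, str]

GRAPH: list[Triple] = [
    ("ex:VocabA", "rdf:type", "ex:Vocabulary"),
    ("ex:VocabA", "dct:title", "Vocabulary A"),
    ("ex:VocabA", "dct:language", "en"),
    ("ex:VocabB", "rdf:type", "ex:Vocabulary"),
    ("ex:VocabB", "dct:title", "Vocabulary B"),
    ("ex:VocabB", "dct:language", "fr"),
]


def match(graph: list[Triple], s: Optional[str] = None,
          p: Optional[str] = None, o: Optional[str] = None) -> list[Triple]:
    """Return the triples that fit the pattern; None acts as a wildcard."""
    return [
        t for t in graph
        if (s is None or t[0] == s)
        and (p is None or t[1] == p)
        and (o is None or t[2] == o)
    ]


def titles_by_language(graph: list[Triple], language: str) -> list[str]:
    """Structured retrieval: join two triple patterns, as a SPARQL query would."""
    subjects = [s for s, _, _ in match(graph, p="dct:language", o=language)]
    return [o for s in subjects for _, _, o in match(graph, s=s, p="dct:title")]


def build_grounded_prompt(question: str, facts: list[str]) -> str:
    """Put only the retrieved facts in the prompt so the answer rests on data."""
    lines = "\n".join(f"- {fact}" for fact in facts) or "- (no matching data)"
    return (
        "Answer using only the facts below. Say so if they are insufficient.\n"
        f"Facts:\n{lines}\n"
        f"Question: {question}"
    )


if __name__ == "__main__":
    facts = titles_by_language(GRAPH, "en")
    print(build_grounded_prompt("Which vocabularies are in English?", facts))
```

The point of the sketch is the order of operations: the exact graph pattern decides which facts reach the model, rather than textual similarity to the question.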
The lessons presented in this article focus on the GraphRAG results and their contribution to interoperability. Because semantic interoperability ensures that information exchanged across borders and systems keeps its meaning, AI can help by making semantic assets more accessible and easier to search, understand, and reuse.
Lessons learned
These insights are meant for anyone considering using GraphRAG or similar techniques.
- Don’t rely solely on keyword-based retrieval for querying semantic assets. Systems that match text by similarity often return results that seem relevant at first glance but miss the real intent of the question.
- Replace assumptions with data-driven results. GraphRAG succeeded because it relied on actual results from a knowledge graph, not general assumptions.
- Transforming natural language questions into structured queries remains a challenge. As questions become more complex, small errors in query construction can lead to empty or irrelevant answers. We highly recommend a validation mechanism that checks each query's structure against the schema, to help ensure that queries are syntactically correct.
- Consider a GraphRAG approach when accuracy matters. LLMs often make assumptions, but this can be reduced by refining the context and restricting where the LLM searches for answers. GraphRAG reduces ambiguity by grounding responses in precise, retrieved data rather than relying on semantic similarity across broad document collections.
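As an illustration of the validation mechanism recommended above, the following Python sketch checks a generated SPARQL string before it is executed. It is a rough approximation under stated assumptions: `KNOWN_TERMS` is a hypothetical schema, and the regular expressions are cheap heuristics, not a full SPARQL parser.

```python
import re

# Hypothetical schema: the classes and properties the knowledge graph is
# assumed to expose. In practice this set would be extracted from the graph.
KNOWN_TERMS = {"rdf:type", "dct:title", "dct:language", "ex:Vocabulary"}

IRI = re.compile(r"<[^<>\s]*>")                      # e.g. <http://example.org/x>
STRING_LITERAL = re.compile(r'"(?:[^"\\]|\\.)*"')    # e.g. "some text"
PREFIXED_NAME = re.compile(r"\b[A-Za-z][\w-]*:[A-Za-z_][\w-]*")
QUERY_FORM = re.compile(r"\b(SELECT|ASK|CONSTRUCT|DESCRIBE)\b", re.IGNORECASE)


def validate_query(query: str, known_terms: set[str]) -> list[str]:
    """Return a list of problems; an empty list means no problem was detected."""
    # Drop IRIs and string literals so their contents are not mistaken
    # for prefixed names or braces.
    stripped = STRING_LITERAL.sub(" ", IRI.sub(" ", query))

    problems = []
    if not QUERY_FORM.search(stripped):
        problems.append("missing query form (SELECT, ASK, CONSTRUCT or DESCRIBE)")
    if stripped.count("{") != stripped.count("}"):
        problems.append("unbalanced braces")
    # Schema check: every prefixed name used must be a term the schema defines.
    for term in sorted(set(PREFIXED_NAME.findall(stripped))):
        if term not in known_terms:
            problems.append(f"unknown term: {term}")
    return problems


if __name__ == "__main__":
    query = 'SELECT ?t WHERE { ?v rdf:type ex:Vocabulary ; dct:titel ?t . }'
    for problem in validate_query(query, KNOWN_TERMS):
        print(problem)
```

A misspelled property such as `dct:titel` would otherwise run without error and silently return nothing, which is the kind of empty answer described above. Reported problems could be fed back to the LLM as a hint for regenerating the query.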
In the end, GraphRAG delivered the most correct and complete answers across all questions, with an overall score of 66.7%, nearly double the 34% achieved by the basic LLM and standard RAG. It used attributes specific to the Linked Open Vocabularies dataset, showing real schema awareness; it generated syntactically valid SPARQL queries even for complex tasks; and it supported clearer reasoning and disambiguation between concepts.
Performance also varied depending on how questions were phrased. As questions grew more complex, the system sometimes over- or under-constrained its filters, or missed parts of the user’s intent.
Why does this matter?
If you work in public services, research, or policy, these advances mean you will soon be able to find, share, and reuse data more easily, saving time, reducing errors, and helping Europe work better together.
Conclusion
The experience from this study suggests the following:
- Prepare example questions and verified SPARQL templates to guide query generation.
- Build a growing library of resolved queries and reuse them whenever similar patterns appear.
- Expect to refine SPARQL queries iteratively.
- Use a knowledge graph when your domain involves complex, interconnected relationships, as this improves query accuracy and reasoning capabilities.
- Implement a schema-aware validation step to help ensure syntactically correct queries, and so move from generic, irrelevant answers to more accurate, data-driven results.
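One possible shape for such a library of verified templates is sketched below in Python, using only the standard library. The question patterns, the templates, and the `find_template` and `remember` helpers are hypothetical examples, not artefacts from the study, and fuzzy string matching stands in for whatever matching strategy a real system would use.

```python
from difflib import get_close_matches
from string import Template
from typing import Optional

# Hypothetical library: question patterns mapped to verified SPARQL templates.
TEMPLATES: dict[str, Template] = {
    "which vocabularies are available in language x": Template(
        'SELECT ?v WHERE { ?v dct:language "$language" . }'
    ),
    "what is the title of vocabulary x": Template(
        "SELECT ?title WHERE { $vocabulary dct:title ?title . }"
    ),
}


def find_template(question: str, library: dict[str, Template],
                  cutoff: float = 0.5) -> Optional[Template]:
    """Return the template whose stored pattern is closest to the question."""
    hits = get_close_matches(question.lower(), list(library), n=1, cutoff=cutoff)
    return library[hits[0]] if hits else None


def remember(question: str, sparql: str, library: dict[str, Template]) -> None:
    """Add a newly resolved and verified query so later questions can reuse it."""
    library[question.lower()] = Template(sparql)


if __name__ == "__main__":
    template = find_template(
        "Which vocabularies are available in language fr?", TEMPLATES
    )
    if template is not None:
        print(template.substitute(language="fr"))
```

Values substituted into a template should themselves be validated, since pasting unchecked user input into a query string can change the query's meaning. When no template is close enough, the system can fall back to free query generation and store the result with `remember` once it has been verified.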
The main takeaway of the study is that grounding an LLM’s answers in structured graph data can significantly improve response accuracy and usefulness. GraphRAG demonstrated this potential convincingly. With further refinement, such as a user feedback loop and a richer repository of query patterns, it could enable even more reliable data-driven public services across Europe.
Curious to learn more? The full SEMIC report provides detailed methodology, evaluation results, and practical recommendations for harnessing AI to achieve true semantic interoperability in the European public sector.