Toolkit

SEMIC has compiled a list of third-party tools to help public administrations get started on becoming more interoperable in the EU. Each tool in this list supports a practice that increases semantic interoperability, such as modelling your data according to a specification, managing controlled vocabularies, validating compliance with those specifications, and harvesting and publishing your data.

Additionally, SEMIC offers the SEMIC Tooling Assistant. This interactive wizard helps users find a suitable tool by asking a small set of questions about the task to be performed.

Modelling data specifications

A data specification is one way for organisations to agree on what the exchanged data are. For producers, a specification prescribes the form in which data are generated for a consumer audience. For consumers, it contains information on how to process the retrieved data. A data specification is therefore a form of social contract between data producers and consumers on how best to exchange data, and in this sense it is a core element in enabling interoperability among all parties.

SEMIC proposes a basic tooling architecture that can be found on this page. In addition to this reference architecture, the page contains a recommended tool for each architectural component.

In addition to the various semantic specifications, SEMIC offers the Style Guide Validator. The Style Guide is a document that defines stylistic rules that are applied to the SEMIC specifications. These rules and guidelines cover naming conventions, syntax, and artefact management and organisation, and can be applied to any semantic specification. To automate checking against these rules, SEMIC has developed the Style Guide Validator, which is available as a service on the Interoperability Test Bed. This service validates models in UML, SHACL and OWL against the SEMIC Style Guide. Additional documentation on the Style Guide Validator can be found here.

The Style Guide Validator comes in two variants: the UML Style Guide Validator and the SHACL/OWL Style Guide Validator. Both validators are also available as a REST API or SOAP API.

Managing controlled vocabularies

Successful interoperability also depends on eliminating ambiguity when referring to an entity in a data space. This is normally achieved by maintaining authoritative naming lists or controlled lists of terms, and by sharing glossaries or thesauri. It is highly beneficial when all parties can refer to a term with an unambiguous definition and a shared understanding, as this supports interoperability between organisations. The Publications Office, for instance, has a long-standing effort of maintaining EuroVoc together with many other terminologies, to keep all the units of the EC that publish institutional content aligned around the same critical terms.

Authoritative naming lists, controlled vocabularies and thesauri need dedicated governance of their editorial lifecycle in order to retain long-term value. The first table below lists tools that help with editing SKOS code lists; the second lists tools that help with validating them.

Name	Description	Input	Licence	Skills	Interface
Atramhasis	Atramhasis is an online SKOS editor. This web application enables users to create SKOS vocabularies consisting of Concepts and Collections.	SKOS	GNU GPL 3		GUI
GINCO	GINCO is free software developed by the Ministry of Culture and Communication (France) and dedicated to the management of vocabularies.	SKOS	CeCILL 2		GUI
iQvoc	Can import, display, publish and manage vocabularies and ontologies that are in SKOS format.	SKOS	Apache License 2		GUI
Re3gistry	The Re3gistry software is a reusable open-source solution for managing and sharing ‘reference codes’ through persistent URIs. It can export in HTML, ISO 19135 XML, JSON, RDF/XML, ATOM, Re3gistry XML, CSV, ROR.	SKOS	EUPL 1.2		GUI
TemaTres	TemaTres is an open source vocabulary server: a web application to manage and exploit vocabularies, thesauri, taxonomies and formal representations of knowledge.	SKOS	GNU GPL 2		GUI
VocBench	VocBench is a web-based, multilingual, collaborative development platform for managing OWL ontologies, SKOS(/XL) thesauri, Ontolex-lemon lexicons and generic RDF datasets.	SKOS	BSD-3 Clause	Linked data	GUI

Name	Description	Input	Licence	Skills	Interface
qSKOS	qSKOS is a tool for finding quality issues in SKOS vocabularies.	SKOS	GNU GPL 3	Linked data	CLI
SKOS Testing Tool	The SKOS Testing Tool is a web frontend for qSKOS. It allows users to assess the quality of SKOS or SKOS-XL vocabularies, either by submitting a file to be validated or by validating a SKOS file published at a given URL.	SKOS	GNU LGPL 3	Linked data	GUI, API
Skosify	Skosify is a program that accepts a thesaurus-like vocabulary expressed as RDFS, OWL or SKOS as input and produces a clean SKOS representation, which attempts to represent the input data losslessly using SKOS best practices.	SKOS	MIT	Linked data	CLI, API
VocBench	VocBench is a web-based, multilingual, collaborative development platform for managing OWL ontologies, SKOS(/XL) thesauri, Ontolex-lemon lexicons and generic RDF datasets.	SKOS	BSD-3 Clause	Linked data	GUI
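
To give a feel for the kind of quality issue that validation tools for SKOS code lists look for, the following minimal Python sketch checks one SKOS integrity condition: a concept should carry at most one skos:prefLabel per language. It is not taken from any of the tools listed above. The in-memory triple representation, the function name and the example.org IRIs are assumptions made purely for illustration.

```python
from collections import Counter

SKOS_PREF_LABEL = "http://www.w3.org/2004/02/skos/core#prefLabel"


def duplicate_pref_labels(triples):
    """Return sorted (concept, language) pairs with more than one skos:prefLabel.

    Assumption: each triple is (subject, predicate, object), and the object of
    a prefLabel triple is a (value, language) tuple.
    """
    counts = Counter()
    for subject, predicate, obj in triples:
        if predicate == SKOS_PREF_LABEL:
            _value, language = obj
            counts[(subject, language)] += 1
    return sorted(pair for pair, n in counts.items() if n > 1)


# Hypothetical vocabulary: "cat" has two English preferred labels.
EXAMPLE_TRIPLES = [
    ("http://example.org/voc/cat", SKOS_PREF_LABEL, ("cat", "en")),
    ("http://example.org/voc/cat", SKOS_PREF_LABEL, ("feline", "en")),
    ("http://example.org/voc/cat", SKOS_PREF_LABEL, ("chat", "fr")),
    ("http://example.org/voc/dog", SKOS_PREF_LABEL, ("dog", "en")),
]

if __name__ == "__main__":
    for concept, language in duplicate_pref_labels(EXAMPLE_TRIPLES):
        print(f"{concept} has several preferred labels for language '{language}'")
```

In practice a dedicated tool would parse a real SKOS file and run many such checks; the sketch only shows the shape of a single one.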

Modelling data according to SEMIC specifications

Semantic interoperability is largely based on agreeing on the meaning of exchanged data, and data specifications are the instrument that formalises these agreements. Specifications naturally lead to a defined structure for data to be published.

Because your data normally originate in a format other than the one prescribed by a specification, they undergo a set of intermediate transformations before they comply with it.

Learn about available data specifications published by SEMIC.

In the spirit of providing reusable solutions to common problems, SEMIC has outlined a workflow that can be followed to produce data compliant with CPSV-AP. The workflow for CPSV-AP can be found here.

Harvesting data from online catalogues

SEMIC is committed to increasing interoperability through adherence to data specifications. This is most evident in scenarios where administrations expose compliant data from online catalogues. Base registries and public service descriptions are one instance where administrations enable data consumers to plug into their online catalogues via SEMIC connectors. On one end, an administration provides the latest version of the data; on the other end, a second administration consumes it. The first administration publishes the data and connectors only once, while the second administration only needs to adopt the connectors to become interoperable.

The tools listed below facilitate harvesting and can be used to build pipelines.

Name	Description	Input	Licence	Skills	Interface
LinkedPipes ETL	Low code tool that can cover the entire harvesting process (ETL).	Local files, FTP, HTTP Get, SPARQL Endpoint	MIT	SPARQL	GUI
SPARQL Anything	SPARQL Anything is a system for Semantic Web re-engineering that allows users to query anything with SPARQL.	JSON, HTML, CSV, XML, Binary, TXT, Markdown, File system and archives (ZIP, Tar), XLS, XLSx, DOCx, EXIF Metadata, Bibtex, YAML	Apache License 2	SPARQL	GUI, CLI
SPARQL-Generate	SPARQL-Generate is an expressive template-based language to generate RDF streams or text streams from RDF datasets and document streams in arbitrary formats.	RDF, SQL, XML, JSON, CSV, GeoJSON, HTML, CBOR, plain text with regular expressions, MQTT or WebSocket streams, repeated HTTP GET operations	Apache License 2	SPARQL	GUI, CLI
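
The core transformation step in such a pipeline, turning tabular source data into RDF, can be sketched in a few lines of Python. This is a deliberately simplified illustration using only the standard library, not the way any of the tools above work internally. The example.org namespace, the `id` and `title` column names, and the choice of the Dublin Core title property are all assumptions made for the example.

```python
import csv
import io
from urllib.parse import quote

BASE = "http://example.org/service/"      # hypothetical namespace
TITLE = "http://purl.org/dc/terms/title"  # property chosen for illustration


def escape_literal(text):
    """Escape characters that may not appear raw in an N-Triples string literal."""
    return (
        text.replace("\\", "\\\\")
        .replace('"', '\\"')
        .replace("\n", "\\n")
        .replace("\r", "\\r")
    )


def csv_to_ntriples(csv_text):
    """Turn rows with 'id' and 'title' columns into one N-Triples line each."""
    lines = []
    for row in csv.DictReader(io.StringIO(csv_text)):
        # Percent-encode the identifier so that the resulting IRI stays valid.
        subject = f"<{BASE}{quote(row['id'], safe='')}>"
        literal = f'"{escape_literal(row["title"])}"'
        lines.append(f"{subject} <{TITLE}> {literal} .")
    return lines


# Hypothetical harvested CSV content.
SAMPLE_CSV = "id,title\n1,Passport renewal\n2,Birth certificate\n"

if __name__ == "__main__":
    print("\n".join(csv_to_ntriples(SAMPLE_CSV)))
```

A real pipeline would add the extraction and loading steps around this function and map many more columns, which is where the low-code tools in the table are likely to save effort.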

Validating compliance with a specification

Interoperability requires rigorous checks to ensure faithful adherence to the data specifications, so that administrations can keep exchanging data smoothly. SEMIC supports verifying that data about to be exchanged comply with the specifications. It offers its community of adopters a set of validators for the CPSV-AP, DCAT-AP and BRegDCAT-AP data specifications on the Interoperability Test Bed.

Additionally, the third-party tools listed below can validate data against a data model.

Name	Description	Input	Licence	Skills	Interface
ITB-shacl-validator	The SHACL validator is a web application to validate RDF data against SHACL shapes.	RDF	EUPL 1.2	Linked data	GUI, API
Jena-shacl (as part of Jena)	Jena-shacl is an implementation of the W3C Shapes Constraint Language (SHACL). It implements SHACL Core and SHACL SPARQL Constraints.	RDF	Apache License 2	Linked data	CLI
LinkedPipes ETL	Low code tool that can cover the entire harvesting process (ETL).	RDF	MIT	Linked data	GUI
LinkedDataHub	LinkedDataHub is open source software you can use to manage data, create visualizations and build apps on RDF Knowledge Graphs.	Import from CSV, RDF	Apache License 2	Linked data	GUI, API
OntoSeer	OntoSeer is a tool that monitors the ontology development process and provides suggestions in real time to improve the quality of the ontology under development.	RDF	Apache License 2	Linked data	GUI
pySHACL	This is a pure Python module which allows for the validation of RDF graphs against Shapes Constraint Language (SHACL) graphs, including SHACL Advanced Features and SHACL-JS Features.	RDF	Apache License 2	Linked data	CLI, API
RDF Playground	RDF Playground allows web users to write RDF as Turtle, check its syntax, visualize the data as a graph, and use SPARQL, RDFS, OWL, SHACL and ShEx.	Turtle	Apache License 2	Linked data	GUI, CLI
RDFUnit	RDFUnit is a test-driven data-debugging framework that can run automatically generated (schema-based) and manually written test cases against an endpoint. It can validate against OWL and SHACL.	RDF	Apache License 2	Linked data	CLI
ROBOT-validate	ROBOT is a command-line tool and library for automating ontology development tasks, with a focus on Open Biological and Biomedical Ontologies (OBO).	OWL	BSD-3 Clause	Linked data	CLI
ROBOT-verify	ROBOT is a command-line tool and library for automating ontology development tasks, with a focus on Open Biological and Biomedical Ontologies (OBO).	OWL, SKOS	BSD-3 Clause	Linked data	CLI
SHACL Play!	SHACL Play! can validate the conformity of an RDF dataset against a SHACL specification.	RDF	GNU LGPL 3	Linked data	GUI
SHACL Playground	Based on its underlying library, it can validate against SHACL but not against SHACL-SPARQL constraints.	RDF	GNU AGPL 3	Linked data	GUI
Shaclex	Can validate against SHACL and ShEx.	RDF	MIT	Linked data	CLI
Turtle Validator	Syntax validation.	Turtle and N-Triples documents	MIT	Linked data	GUI, CLI
VocBench	VocBench is a web-based, multilingual, collaborative development platform for managing OWL ontologies, SKOS(/XL) thesauri, Ontolex-lemon lexicons and generic RDF datasets.	OWL	BSD-3 Clause	Linked data	GUI
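
What "validating data against a data model" means can be illustrated with a single rule. The Python sketch below checks that every instance of a given class has at least one value for a required property, which is roughly what a SHACL minimum-cardinality constraint expresses. It is a simplified stand-in written for this page, not an implementation of SHACL and not the API of any tool above. The function name, the triple representation, the example.org IRIs and the rule itself (every dataset needs a title) are assumptions.

```python
RDF_TYPE = "http://www.w3.org/1999/02/22-rdf-syntax-ns#type"
DATASET = "http://www.w3.org/ns/dcat#Dataset"
TITLE = "http://purl.org/dc/terms/title"


def check_required(triples, target_class, required_property):
    """Return one violation message per instance lacking the required property.

    Assumption: triples are plain (subject, predicate, object) string tuples.
    """
    instances = sorted(
        s for s, p, o in set(triples) if p == RDF_TYPE and o == target_class
    )
    described = {s for s, p, _o in triples if p == required_property}
    return [
        f"{node} is missing {required_property}"
        for node in instances
        if node not in described
    ]


# Hypothetical data: dataset "b" has no title.
EXAMPLE_TRIPLES = [
    ("http://example.org/dataset/a", RDF_TYPE, DATASET),
    ("http://example.org/dataset/a", TITLE, "Air quality measurements"),
    ("http://example.org/dataset/b", RDF_TYPE, DATASET),
]

if __name__ == "__main__":
    for message in check_required(EXAMPLE_TRIPLES, DATASET, TITLE):
        print(message)
```

The validators listed above evaluate whole sets of such constraints, declared in SHACL, ShEx or OWL, and produce structured reports rather than plain messages.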

Publishing interoperable data

Publishing documentation on a data model or linked data is crucial to enhance interoperability. It provides a shared understanding of data structure, format, and semantics, enabling different systems and applications to interact cohesively. This not only enhances data integration and reuse but also promotes transparency and citizen engagement by making public data more accessible and understandable.

The third-party tools listed below facilitate the publication of documentation on data and data models.

Name	Description	Input	Licence	Skills	Interface
Jekyll RDF	A Jekyll plugin for including RDF data in your static site.	SPARQL endpoint	MIT	Jekyll for the setup	CLI
JOD	Generates documentation web pages from Ontology turtle documents. Based on jekyll and jekyll-rdf plugin.	Ontology as .ttl	MIT	Jekyll for the setup	CLI
LODE	Tomcat server application that can be used to create HTML documentation for OWL ontologies.	OWL	ISC license	Linked data	API
OnToology	OnToology will survey OWL files and produce diagrams, complete documentation and validation based on common pitfalls. It also offers seamless publication of user ontologies with w3id using GitHub Pages.	OWL	Apache License 2	Linked data	CLI

Click here to access the SEMIC GitHub.
