
MI3: Provenance

Anonymous (not verified)
Published on: 01/02/2016 Discussion

The DCAT model treats the descriptions of datasets in a catalogue as entities that only exist in the context of the catalogue, and does not consider situations where these descriptions are imported from and exported to other catalogues.

In an environment where descriptions of datasets are exchanged among data portals, the situation that DCAT-AP is designed for, it may be important for users to understand where data comes from and how it may have been modified along the way. For example, knowing which organisation created the metadata for a dataset, and how the description was modified along a chain of exchanges, could support the credibility of that dataset.

DCAT-AP specifies an optional property dct:provenance for Dataset but does not provide any guidance on how to describe instances of the class dct:ProvenanceStatement.

A common approach to the expression of provenance would improve interoperability among catalogues.

 

This issue has been reported by Sadia Vancauwenbergh:

http://joinup.ec.europa.eu/mailman/archives/dcat_application_profile/2016-January/000364.html

Component

Documentation

Category

improvement

Comments

Makx DEKKERS Thu, 11/02/2016 - 14:08

Provenance is a concept that dictionaries commonly define as ‘place or source of origin’ and, more specifically, ‘the history of the ownership of an object’. See for example: http://www.thefreedictionary.com/provenance

DCAT-AP v1.1 has several properties that are related to provenance:

  • Property source metadata (dct:source), optional, non-repeatable property for Catalogue Record, that refers to the original metadata that was used in creating metadata for the Dataset
  • Property source (dct:source), optional, repeatable property for Dataset, that refers to a related Dataset from which the described Dataset is derived.
  • Property provenance (dct:provenance), optional, repeatable property for Dataset, that contains a statement about the lineage of a Dataset.

It would be useful to know how implementations use these properties and whether additional properties are used for provenance-related information, for example from the W3C PROV-O provenance ontology.

Anonymous (not verified) Wed, 17/02/2016 - 07:50

In our European Data Portal Geo-Harvesters we follow the approach defined in GeoDCAT-AP to map the ISO19115 attribute "lineage". The attribute can be found in ISO19115/ISO19139 following this XPath: /gmi:MI_Metadata/gmd:dataQualityInfo/gmd:DQ_DataQuality/gmd:lineage/gmd:LI_Lineage/gmd:statement/gco:CharacterString

The proposed candidate from DCAT-AP and GeoDCAT-AP is dct:provenance. Since the range of dct:provenance is not a literal but the class dct:ProvenanceStatement, the free-text content is given as the rdfs:label of the provenance statement.
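
The mapping described above could be sketched roughly as follows. This is only an illustration using the Python standard library, not the harvester's actual code: the function names, the sample record and the example.org dataset URI are all made up for the example, and the path is matched relative to whatever root element the record has (gmi:MI_Metadata or gmd:MD_Metadata) rather than as the absolute XPath quoted above.

```python
import xml.etree.ElementTree as ET

# Namespace URIs of the ISO 19139 XML encoding.
NS = {
    "gmd": "http://www.isotc211.org/2005/gmd",
    "gco": "http://www.isotc211.org/2005/gco",
}

# Lineage statement path, searched below any root element.
LINEAGE_PATH = (
    ".//gmd:dataQualityInfo/gmd:DQ_DataQuality/gmd:lineage"
    "/gmd:LI_Lineage/gmd:statement/gco:CharacterString"
)

# Hypothetical, heavily trimmed record used only for this example.
SAMPLE = """<gmd:MD_Metadata
    xmlns:gmd="http://www.isotc211.org/2005/gmd"
    xmlns:gco="http://www.isotc211.org/2005/gco">
  <gmd:dataQualityInfo><gmd:DQ_DataQuality><gmd:lineage><gmd:LI_Lineage>
    <gmd:statement>
      <gco:CharacterString>Digitised from 1:50k paper maps.</gco:CharacterString>
    </gmd:statement>
  </gmd:LI_Lineage></gmd:lineage></gmd:DQ_DataQuality></gmd:dataQualityInfo>
</gmd:MD_Metadata>"""


def extract_lineage(iso_xml):
    """Return all non-empty lineage statements found in an ISO 19139 record."""
    root = ET.fromstring(iso_xml)
    return [
        el.text.strip()
        for el in root.iterfind(LINEAGE_PATH, NS)
        if el.text and el.text.strip()
    ]


def turtle_literal(text):
    """Quote a string as a short Turtle literal, escaping special characters."""
    escaped = (
        text.replace("\\", "\\\\")
        .replace('"', '\\"')
        .replace("\n", "\\n")
        .replace("\r", "\\r")
    )
    return f'"{escaped}"'


def provenance_turtle(dataset_uri, statements):
    """Emit one dct:provenance blank node per statement, labelled with rdfs:label."""
    lines = [
        "@prefix dct: <http://purl.org/dc/terms/> .",
        "@prefix rdfs: <http://www.w3.org/2000/01/rdf-schema#> .",
        "",
    ]
    for statement in statements:
        lines.append(f"<{dataset_uri}> dct:provenance [")
        lines.append("    a dct:ProvenanceStatement ;")
        lines.append(f"    rdfs:label {turtle_literal(statement)}")
        lines.append("] .")
    return "\n".join(lines)


if __name__ == "__main__":
    # The dataset URI is a placeholder, not a real identifier.
    print(provenance_turtle("http://example.org/dataset/1", extract_lineage(SAMPLE)))
```

The provenance statement is written as a blank node here because the free text is the only information available; an implementation that has a stable identifier for the statement could use a URI instead.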

Anonymous (not verified) Wed, 16/03/2016 - 16:07

Proposed resolution:

  • dct:provenance is used in few cases for local purposes, e.g. with free text.
  • Usefulness in (international) harvesting is questionable and the information may be ignored.
  • Detailed provenance requirements may be satisfied with PROV-O (out of scope for DCAT-AP). 
Anonymous (not verified) Tue, 06/09/2016 - 18:19