What should the CPOV say about change events? What are the essential characteristics we should capture? Do we need to define subclasses of change event? How can we offer something useful without defining an ontology of events (which would be a separate and large project)?
Component: Documentation
Category: feature
Comments
There are a few events that come to mind as being important:
For example, when tracking the budgets of departments over time, it's important to be able to track these changes in order to produce a rigorous analysis.
Tracing the temporal evolution of a concept (including an organization) is neither a trivial nor a complicated task.
Making the following distinctions and answering the following questions should help clarify the approach:
It gets more complicated than that. You intrinsically have two ways to identify a public organization:
In the first case, you see a public organization and define it, with lifecycle management, official naming, etc. You come across questions such as: do we want to trace name changes? (For the record, I'm for deltas, not snapshots.)
The second case gives a meaning to the public organization (this is also an extra concept (skos:Concept?) in the vocabulary, cf. the Registered Organizations vocabulary). The occurrence of that same public organization in the legal framework has its own lifecycle that should be captured. E.g. Organization Y (in essence private) comes under the control of central government due to a sudden change in financial ownership (stock being bought by public organizations). So according to ESR2010, it becomes a public organisation. Two years later, policy changes and Org Y is privatized. Its life as a public organization stops at that moment.
So we have to capture change events in the organization itself as well as in its occurrence in the legal framework.
This also applies to the issue about 'what is the core of an organization that we want to capture'. Location, for me, isn't what defines an organization. It is an important aspect, but it relates to other cores (e.g. INSPIRE models). Changes over time should be captured on that relation too.
If we take a working definition of a change event as 'something occurring at a certain timestamp, or between a start timestamp and an end timestamp, which affects a property or properties of an entity', then the basic properties of an event start to become clear: it needs to refer to an entity, it needs to refer to the property or properties that were affected, and it needs to have start and end timestamps (which may have the same value).
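To make the working definition concrete, here is a minimal Python sketch of such a record. The class name `ChangeEvent`, its field names and the example identifier are assumptions made for illustration only; they are not CPOV or ORG terms.

```python
from dataclasses import dataclass
from datetime import datetime
from typing import Optional, Tuple


@dataclass(frozen=True)
class ChangeEvent:
    """Hypothetical record following the working definition above."""
    entity: str                      # identifier of the affected entity
    properties: Tuple[str, ...]      # names of the affected properties
    start: datetime                  # when the change began
    end: Optional[datetime] = None   # None means "same as start"

    def __post_init__(self):
        # The dataclass is frozen, so the default end is set via object.__setattr__.
        if self.end is None:
            object.__setattr__(self, "end", self.start)
        if self.end < self.start:
            raise ValueError("end must not precede start")
        if not self.properties:
            raise ValueError("a change event must affect at least one property")

    @property
    def is_instantaneous(self) -> bool:
        # Start and end may have the same value.
        return self.start == self.end


# Illustrative data: a name change that happens at a single timestamp.
rename = ChangeEvent(
    entity="http://example.org/id/foo",
    properties=("skos:prefLabel",),
    start=datetime(2016, 2, 3),
)
```

Keeping `end` mandatory after construction means that instantaneous and extended changes can be queried in the same way.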
Given that a single change event on one entity can map to multiple change events on other entities (e.g. a change at EU level would map to change events on individual member states which may differ in the altered properties), then a hierarchy is also required. A change event therefore needs references to its parent change event and any child change events. This structure also allows sibling change events to be followed, via the parent.
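The parent/child structure can be sketched as a simple tree. This is only an illustration under assumed names (`EventNode`, `add_child`, `siblings` are hypothetical), not a proposal for the vocabulary itself.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass(eq=False)  # identity comparison; avoids recursing through parent/children
class EventNode:
    """Hypothetical node in a tree of change events."""
    label: str
    parent: Optional["EventNode"] = None
    children: List["EventNode"] = field(default_factory=list)

    def add_child(self, label: str) -> "EventNode":
        # Create a child event and link it in both directions.
        child = EventNode(label, parent=self)
        self.children.append(child)
        return child

    def siblings(self) -> List["EventNode"]:
        # Siblings are reached via the parent, as described above.
        if self.parent is None:
            return []
        return [c for c in self.parent.children if c is not self]


# Illustrative data: one EU-level change mapping to two member-state changes.
eu = EventNode("EU-level change")
state_a = eu.add_child("change in member state A")
state_b = eu.add_child("change in member state B")
sibling_labels = [s.label for s in state_a.siblings()]
```

With this shape, a consumer starting from one member state's event can walk up to the originating event and across to the other affected organisations.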
Applying a restriction on the values of a change event then becomes related to the set of changeable properties on the object upon which the change occurs. It is therefore probably not a good idea to try to constrain those properties, other than in a low-depth abstract way - but as Phil points out, that needs to be carefully approached.
The question of delta versus snapshot is largely a matter of storage rather than of model or vocabulary. The most useful representation is the current snapshot. A point-in-time representation can either be engineered from the original state plus the changes (which still requires a snapshot of the original state), or, on each change, the previous state can be end-timestamped and stored. The only effect on the model/vocabulary is to require a start and end timestamp on each applicable entity, where the end can be blank (or null).
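The second option (end-timestamping the previous state on each change) can be sketched as follows. `ValuePeriod` and `PropertyHistory` are hypothetical names, and the sketch assumes that changes are recorded in chronological order.

```python
from dataclasses import dataclass
from datetime import date
from typing import Dict, List, Optional


@dataclass
class ValuePeriod:
    """Hypothetical record: one value of one property over a time interval."""
    value: str
    start: date
    end: Optional[date] = None   # None (blank) means "still current"


class PropertyHistory:
    def __init__(self) -> None:
        self._periods: Dict[str, List[ValuePeriod]] = {}

    def set(self, prop: str, value: str, on: date) -> None:
        periods = self._periods.setdefault(prop, [])
        if periods:
            periods[-1].end = on              # end-timestamp the previous state
        periods.append(ValuePeriod(value, on))  # open a new, unended state

    def periods(self, prop: str) -> List[ValuePeriod]:
        return list(self._periods.get(prop, []))

    def current(self) -> Dict[str, str]:
        # The current snapshot is the last (open-ended) value of each property.
        return {prop: ps[-1].value for prop, ps in self._periods.items()}


# Illustrative data only.
history = PropertyHistory()
history.set("skos:prefLabel", "DG Foo", date(2013, 7, 23))
history.set("skos:prefLabel", "The New DG Foo", date(2016, 2, 3))
```

The current snapshot stays cheap to read, while every earlier state remains available with its own start and end.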
Completely agree with the part about the working definition.
With regard to the second part above, I'm not sure I'm understanding this right (please correct me if I'm wrong).
You want to capture change events on some attributes of an organization. Given that the events are triggered by some other event at a supranational level, do you want to register the trigger event in your regional system? Or do you want to register the regional change in your EU system? So you need to import the 'other' event into your own system to do the mapping? Suppose you do so (which would prove the success of the overall use of a core vocabulary, by the way): you end up with a 'derivative' chain of change events.
On this point, you're creating a hierarchy - even ontology - of change events, rather than describing public organizations. Now I'm just trying to find the practical use for this.
The main reason for change events, for me, is to be able to walk through the lifespan of an organization and, e.g., to correlate events that occurred at different moments in its lifespan.
The delta approach is suitable for modelling the state transition for a set of properties and the action/process that led to the new state.
The snapshot approach is suitable for modelling a state defined as a set of properties (and values) that hold in a given time interval.
We can say that, in a way:
In both cases, the representation requires two components:
Notes:
[1] The endurant vs. perdurant distinction is well documented in the DOLCE foundational ontology.
http://www.loa.istc.cnr.it/old/DOLCE.html;
http://wonderweb.semanticweb.org/deliverables/documents/D18.pdf
I was taking an end user, data analysis view of the concept, so I'm not sure 'register the trigger event' is what I was aiming for - I was more just pointing out that the chain of change events does exist already and so the question we need to consider is something like:
"does the CPOV need to be able to describe the source of change events across public organisations, and if so to what level of detail and in what way"
For example, if the answer to the first part is 'yes', then a simple textual reference to an EU or national directive might be sufficient (although it is not a very structured approach).
The use case would be being able to walk the lifespan correlating events on one organisation, but then also being able to determine where that change originated and how it affected other organisations which may not appear to be related to each other - but clearly whether that use case has any value or relevance in the CPOV context is open to question, as I do agree that this may be straying from a description of a public organisation.
Some thoughts via the example of ORG-AP-OP (EU Whoiswho):
The approach allows two ways:
1- "snapshot" properties (e.g. prefLabel, for simplicity of access and for compatibility with ORG)
2- for some properties (the ones whose history we want to keep), eXtension for Labels (XL) values with a "start timestamp" and an "end timestamp" for the property (the granularity is a "day")
Therefore, usually historical data (an RDF file) represents a "period" between two dates (e.g. Commission organization chart during the whole year 2015). From this data, it is easy to "reconstruct" any state of the organization chart for all days of 2015. From this example, the concept of "event" is in fact the change of a state of a property. (E.g. a change of the prefLabel of an organization, the change of the "parent of the organization", the change of the definition of the organization)
Examples of properties whose history we keep by defining an "xl" version:
-skos:prefLabel (and also altLabel, hiddenLabel)
-org:hasSubOrganization (and also org:subOrganizationOf)
-skos:definition
Examples of properties whose history we don't keep (because we don't consider there is any added value):
-dcat:contactPoint
-org:hasSite
In theory at least, those are readily handled by the existing concept of a change event and the organisation(s) that preceded and resulted from that change.
The Web is good at this :-)
I would define two URIs, one that identifies the current organisation, and one that identifies a dated snapshot. cf. how we do our specs at W3C. http://www.w3.org/TR/dwbp is the URI for the latest version of the Data on the Web Best Practices document. If you want the snapshot then use the dated URI (http://www.w3.org/TR/2016/WD-dwbp-20160112/) which is different from the previous version (http://www.w3.org/TR/2015/WD-dwbp-20151217/). We generally auto-generate a diff for the docs as well, so there's your delta *and* your snapshots ;-)
We can apply similar logic here. Writing in Turtle, the data might say:
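@prefix org: <http://www.w3.org/ns/org#> .
@prefix skos: <http://www.w3.org/2004/02/skos/core#> .
@prefix prov: <http://www.w3.org/ns/prov#> .
@prefix dcterms: <http://purl.org/dc/terms/> .
@prefix ex: <http://example.org/id/> .  # assumed namespace for ex:ev1
# The organisation after the change; it resulted from the event.
<http://example.org/id/foo-2016-02-03> a org:Organization ;
    skos:prefLabel "The New DG Foo"@en ;
    org:resultedFrom ex:ev1 .
    # ... plus the stuff that doesn't change
# The organisation before the change; org:changedBy points from it to the event.
<http://example.org/id/foo-2013-07-23> a org:Organization ;
    skos:prefLabel "DG Foo"@en ;
    org:changedBy ex:ev1 .
    # ... plus the stuff that doesn't change
ex:ev1 a org:ChangeEvent ;
    prov:startedAtTime "2016-02-03" ;  # PROV expects xsd:dateTime; only the day is given here
    prov:endedAtTime "2016-02-03" ;
    dcterms:description "Change of name"@en .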
Then you have a triple that says
<http://example.org/id/foo> owl:sameAs <http://example.org/id/foo-2016-02-03> .
and an HTTP redirect from the short URI to the dated one.
(these need updating when a new version is created of course).
If we store the deltas individually then to get the current information, which is what you probably want most of the time, you'd have to start with the original version and then process each change one by one.
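A minimal Python sketch of that replay, assuming a hypothetical delta format of (effective date, property, new value) tuples. Nothing here is prescribed by ORG or the CPOV; it only illustrates the cost of rebuilding the current state from deltas.

```python
from datetime import date
from typing import Dict, Iterable, Optional, Tuple

# Hypothetical delta format: (effective date, property, new value or None to remove).
Delta = Tuple[date, str, Optional[str]]


def replay(original: Dict[str, str],
           deltas: Iterable[Delta],
           as_of: Optional[date] = None) -> Dict[str, str]:
    """Rebuild a state by applying deltas, in date order, to the original version."""
    state = dict(original)                      # never mutate the original snapshot
    for when, prop, new_value in sorted(deltas, key=lambda d: d[0]):
        if as_of is not None and when > as_of:
            break                               # later changes are not yet in effect
        if new_value is None:
            state.pop(prop, None)               # the property was removed
        else:
            state[prop] = new_value
    return state


# Illustrative data only.
original = {"skos:prefLabel": "DG Foo"}
deltas = [(date(2016, 2, 3), "skos:prefLabel", "The New DG Foo")]

current = replay(original, deltas)
earlier = replay(original, deltas, as_of=date(2015, 1, 1))
```

Every read of the current state walks the whole list of deltas, which is the overhead described above; keeping a current snapshot alongside the deltas avoids it.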
Trying to think along here (formerly active in BI-modelling with slowly changing dimensions etc... so another kind of logic...). Also relatively new in semantic or web-technologies, so still thinking primarily in db-terms (but busy finding my way in turtle, rdf, triples, ...)
This would presume storing all changes over time, each relative to the previous change, and then indeed processing every change until you reach the point your query asks about.
This relates to 'what is core' in that you'll have to define what you want to capture (see prior discussions here).
My concern is: won't this, in time, lead to chaos? Let's say you have five attributes you want to capture change events on. For each of them, you register the prior state and generate a triple that links two states in time (either from the original to each 'state', or incrementally). How do you cope with:
Noting that several uses of change event are discussed in https://joinup.ec.europa.eu/asset/cpov/issue/what-core#comment-17599
As I understand it, there are 3 possibilities for storing/keeping states/events over time:
1- Periodically record a snapshot of the "version of the data" (e.g. every day)
2- Infrequently record a snapshot of the "version of the data" (e.g. every year) + a list of events (each change in the data with the corresponding date) + keep the latest version of the data
3- Record data states with a start date and an end date: for every change, you close the time interval of the previous value and open a new interval for the new value. A simple inference then gives the latest version of the data (usually the value whose time interval has no end date).
Observations:
Solution 1 is easy to implement but would cost a lot in terms of storage and a lot of computational resources for extracting any events (versions of the data need to be compared).
Solution 2 is the one described by #10 Posted by Phil. It is conceptually harder for humans to use because it mixes the concepts of "states" and "events" within the data, so jumping from an "event view" to a "state view" and vice versa always adds complexity.
Solution 3 (post #8) allows an easy reconstruction of the "version of the data" at any time (simply check which interval contains the requested time). Furthermore, this solution stores the data in the most "compact" way. The extraction of events is also easy (e.g. based on a date, or on a value or a change of value).
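The read side of solution 3 can be sketched in a few lines of Python. `Period`, `state_at` and `changes_on` are hypothetical names, the data is illustrative, and the sketch assumes day granularity with intervals that include the start date and exclude the end date.

```python
from datetime import date
from typing import Dict, List, NamedTuple, Optional


class Period(NamedTuple):
    """Hypothetical record: a property value valid from start up to (not including) end."""
    prop: str
    value: str
    start: date
    end: Optional[date]   # None = no end date, i.e. the latest value


def state_at(periods: List[Period], when: date) -> Dict[str, str]:
    # Reconstruct the "version of the data": keep values whose interval contains `when`.
    return {p.prop: p.value for p in periods
            if p.start <= when and (p.end is None or when < p.end)}


def changes_on(periods: List[Period], when: date) -> List[Period]:
    # Extract events by date: values that came into effect on that day.
    return [p for p in periods if p.start == when]


# Illustrative data only.
periods = [
    Period("skos:prefLabel", "DG Foo", date(2013, 7, 23), date(2016, 2, 3)),
    Period("skos:prefLabel", "The New DG Foo", date(2016, 2, 3), None),
]

mid_2015 = state_at(periods, date(2015, 6, 1))
latest = {p.prop: p.value for p in periods if p.end is None}
renamed = changes_on(periods, date(2016, 2, 3))
```

Both the state view and the event view come from the same records with a single filter, which is the property this solution is being credited with.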
My two cents on this,
I am not sure we are interested in a versioning framework, as this is probably application-specific, and, from what I gather, out of the scope of CPOV (defined as a collection of semantic assets). It is in fact a matter of storage, rather than conceptual modeling, as others have pointed out in this thread.
If this is indeed the case, then the differentiation between delta/diff-based and snapshot- (i.e. version) based approaches becomes irrelevant in how the vocabulary chooses to represent change events. Furthermore, defining a corpus of potential changes, be it a vocabulary of terms or a full-blown ontology, seems dangerous because these might also be context-specific and dynamic in nature (e.g. cross-country differences could lead to inconsistencies in the overall structure). Unless a few high-level abstract concepts are defined, that are ensured to be universal. But again, wouldn't that eventually be too restricting for any pragmatic usage of CPOV?
This issue was resolved in the meeting on 2016-02-15 https://joinup.ec.europa.eu/asset/cpov/document/cpov-wg-virtual-meeting…
The CPOV will include a generic class of Event that has two sub classes: the change Event class as now and an additional class of Foundation Event. The meeting did not discuss the wider issue of versioning.