RDF* & SPARQL*: Difference between revisions

From artserver wiki

Revision as of 16:02, 6 January 2021

RDF* and SPARQL* provide a very compact way to annotate triples, allowing qualifiers, sources and time ranges to be associated with a triple without having to create a new class instance (the n-ary relation pattern, https://www.w3.org/TR/swbp-n-aryRelations/#pattern1) to represent the relation between two properties.


Example: Population of Spain

The Turtle (ttl) data below is an extract of Wikidata Q29 (Spain), focusing on its P1082 (population) property statements. The statements include not only the population value but also the P585 (point_in_time) property, since the population of a location changes over time.

To express the n-ary relation "Spain's population was 30455000 at point_in_time 1960", the common approach, also taken in the Wikidata example, is to create a class instance that annotates the triple "Spain's population 30455000" with the statement "point_in_time 1960". Broken down, it reads:

  • wd:Q29 p:P1082 s:Q29-47E327E5-127D-4DC3-8C3F-9B2C7D5A0D62 .  (Spain population s:Q29-47E...)
  • s:Q29-47E327E5-127D-4DC3-8C3F-9B2C7D5A0D62 a wikibase:Statement ;  (s:Q29-47E... is a Statement that states:)
  • ps:P1082 "+30455000"^^xsd:decimal ;  (population 30455000)
  • pq:P585 "1960-01-01T00:00:00Z"^^xsd:dateTime .  (at point_in_time 1960)

This is rather verbose and not very readable.

Data:

@prefix xsd: <http://www.w3.org/2001/XMLSchema#> .
@prefix wikibase: <http://wikiba.se/ontology#> .
@prefix wd: <http://www.wikidata.org/entity/> .
@prefix s: <http://www.wikidata.org/entity/statement/> .
@prefix p: <http://www.wikidata.org/prop/> .
@prefix ps: <http://www.wikidata.org/prop/statement/> .
@prefix pq: <http://www.wikidata.org/prop/qualifier/> .

wd:Q29 a wikibase:Item .

wd:Q29 p:P1082 s:Q29-47E327E5-127D-4DC3-8C3F-9B2C7D5A0D62 .

s:Q29-47E327E5-127D-4DC3-8C3F-9B2C7D5A0D62 a wikibase:Statement ;
	ps:P1082 "+30455000"^^xsd:decimal ;
	pq:P585 "1960-01-01T00:00:00Z"^^xsd:dateTime .

wd:Q29 p:P1082 s:Q29-38AA233B-6CFF-4F9C-A73C-D0B23AC44E74 .

s:Q29-38AA233B-6CFF-4F9C-A73C-D0B23AC44E74 a wikibase:Statement ;
	ps:P1082 "+33814531"^^xsd:decimal ;
	pq:P585 "1970-01-01T00:00:00Z"^^xsd:dateTime .

wd:Q29 p:P1082 s:Q29-7BD19893-2B47-4028-956B-329344307600 .

s:Q29-7BD19893-2B47-4028-956B-329344307600 a wikibase:Statement ;
	ps:P1082 "+37439035"^^xsd:decimal ;
	pq:P585 "1980-01-01T00:00:00Z"^^xsd:dateTime .

wd:Q29 p:P1082 s:Q29-D0602463-6F4B-40BC-833D-45B216E354BE .

s:Q29-D0602463-6F4B-40BC-833D-45B216E354BE a wikibase:Statement ;
	ps:P1082 "+38850435"^^xsd:decimal ;
	pq:P585 "1990-01-01T00:00:00Z"^^xsd:dateTime .

wd:Q29 p:P1082 s:Q29-65A1C6CA-806A-49F4-9FA5-6BF600B82970 .

s:Q29-65A1C6CA-806A-49F4-9FA5-6BF600B82970 a wikibase:Statement ;
	ps:P1082 "+40263216"^^xsd:decimal ;
	pq:P585 "2000-01-01T00:00:00Z"^^xsd:dateTime .

Query:

PREFIX xsd: <http://www.w3.org/2001/XMLSchema#>
PREFIX wd: <http://www.wikidata.org/entity/>
PREFIX p: <http://www.wikidata.org/prop/> 
PREFIX s: <http://www.wikidata.org/entity/statement/>
PREFIX ps: <http://www.wikidata.org/prop/statement/>
PREFIX pq: <http://www.wikidata.org/prop/qualifier/> 

# get the population and year of the statement
SELECT * 
WHERE {
  ?subject p:P1082 ?statementobject .
  ?statementobject ps:P1082 ?population ;
                   pq:P585 ?statementDate .
}
ORDER BY ?statementDate

Results:

arq --data=Q29_population.ttl --query=population_query.rq
--------------------------------------------------------------------------------------------------------------------------
| subject | statementobject                            | population               | statementDate                        |
==========================================================================================================================
| wd:Q29  | s:Q29-47E327E5-127D-4DC3-8C3F-9B2C7D5A0D62 | "+30455000"^^xsd:decimal | "1960-01-01T00:00:00Z"^^xsd:dateTime |
| wd:Q29  | s:Q29-38AA233B-6CFF-4F9C-A73C-D0B23AC44E74 | "+33814531"^^xsd:decimal | "1970-01-01T00:00:00Z"^^xsd:dateTime |
| wd:Q29  | s:Q29-7BD19893-2B47-4028-956B-329344307600 | "+37439035"^^xsd:decimal | "1980-01-01T00:00:00Z"^^xsd:dateTime |
| wd:Q29  | s:Q29-D0602463-6F4B-40BC-833D-45B216E354BE | "+38850435"^^xsd:decimal | "1990-01-01T00:00:00Z"^^xsd:dateTime |
| wd:Q29  | s:Q29-65A1C6CA-806A-49F4-9FA5-6BF600B82970 | "+40263216"^^xsd:decimal | "2000-01-01T00:00:00Z"^^xsd:dateTime |
--------------------------------------------------------------------------------------------------------------------------


With RDF*, annotating the triple for Spain's population with the point_in_time property becomes far more compact and readable, because no additional class instance has to be created to carry the annotation.

It is enough to state:

  • wd:Q29 ps:P1082 "+30455000"^^xsd:decimal .  (Spain's population 30455000)
  • << wd:Q29 ps:P1082 "+30455000"^^xsd:decimal >> pq:P585 "1960-01-01T00:00:00Z"^^xsd:dateTime .  (Spain's population 30455000 at point_in_time 1960)

Data (using RDF*):

@prefix xsd: <http://www.w3.org/2001/XMLSchema#> .
@prefix wikibase: <http://wikiba.se/ontology#> .
@prefix wd: <http://www.wikidata.org/entity/> .
@prefix ps: <http://www.wikidata.org/prop/statement/> .
@prefix pq: <http://www.wikidata.org/prop/qualifier/> .

wd:Q29 a wikibase:Item;
	ps:P1082 "+30455000"^^xsd:decimal ;
	ps:P1082 "+33814531"^^xsd:decimal ;
	ps:P1082 "+37439035"^^xsd:decimal ;
	ps:P1082 "+38850435"^^xsd:decimal ;
	ps:P1082 "+40263216"^^xsd:decimal .

<< wd:Q29 ps:P1082 "+30455000"^^xsd:decimal >> pq:P585 "1960-01-01T00:00:00Z"^^xsd:dateTime  .
<< wd:Q29 ps:P1082 "+33814531"^^xsd:decimal >> pq:P585 "1970-01-01T00:00:00Z"^^xsd:dateTime .
<< wd:Q29 ps:P1082 "+37439035"^^xsd:decimal >> pq:P585 "1980-01-01T00:00:00Z"^^xsd:dateTime .
<< wd:Q29 ps:P1082 "+38850435"^^xsd:decimal >> pq:P585 "1990-01-01T00:00:00Z"^^xsd:dateTime .
<< wd:Q29 ps:P1082 "+40263216"^^xsd:decimal >> pq:P585 "2000-01-01T00:00:00Z"^^xsd:dateTime .


Query (using SPARQL*):

PREFIX xsd: <http://www.w3.org/2001/XMLSchema#>
PREFIX wd: <http://www.wikidata.org/entity/>
PREFIX ps: <http://www.wikidata.org/prop/statement/>
PREFIX pq: <http://www.wikidata.org/prop/qualifier/> 

# get the population and year of the statement
SELECT * 
WHERE {
 <<?subject ps:P1082 ?population>> pq:P585 ?statementDate .
}
ORDER BY ?statementDate

Results:

arq --data=Q29_population_star.ttl --query=population_query_star.rq
-----------------------------------------------------------------------------
| subject | population               | statementDate                        |
=============================================================================
| wd:Q29  | "+30455000"^^xsd:decimal | "1960-01-01T00:00:00Z"^^xsd:dateTime |
| wd:Q29  | "+33814531"^^xsd:decimal | "1970-01-01T00:00:00Z"^^xsd:dateTime |
| wd:Q29  | "+37439035"^^xsd:decimal | "1980-01-01T00:00:00Z"^^xsd:dateTime |
| wd:Q29  | "+38850435"^^xsd:decimal | "1990-01-01T00:00:00Z"^^xsd:dateTime |
| wd:Q29  | "+40263216"^^xsd:decimal | "2000-01-01T00:00:00Z"^^xsd:dateTime |
-----------------------------------------------------------------------------
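
The same pattern can be narrowed to a single annotation value. The following is a minimal sketch, not part of the original example: it assumes the RDF* data above and uses the standard SPARQL YEAR() function to keep only the population annotated with a point_in_time in 1980.

```sparql
PREFIX wd: <http://www.wikidata.org/entity/>
PREFIX ps: <http://www.wikidata.org/prop/statement/>
PREFIX pq: <http://www.wikidata.org/prop/qualifier/>

# population of Spain annotated with a point_in_time in 1980
SELECT ?population ?statementDate
WHERE {
  << wd:Q29 ps:P1082 ?population >> pq:P585 ?statementDate .
  FILTER (YEAR(?statementDate) = 1980)
}
```

Against the data above, this should match only the triple with the value "+37439035".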


Challenges

Same value for more than one annotation

Problem: suppose the population of Spain was the same in 1960 and 1970. This could be an issue, since we would be annotating two triples that say exactly the same thing. An RDF graph is a set of triples, so the two are in fact one and the same triple.

Solution: state that the population was 30455000 in both 1960 and 1970.

@prefix xsd: <http://www.w3.org/2001/XMLSchema#> .
@prefix wikibase: <http://wikiba.se/ontology#> .
@prefix wd: <http://www.wikidata.org/entity/> .
@prefix ps: <http://www.wikidata.org/prop/statement/> .
@prefix pq: <http://www.wikidata.org/prop/qualifier/> .

wd:Q29 a wikibase:Item;
	# the 1960 and 1970 values are identical, so the triple is asserted only once
	ps:P1082 "+30455000"^^xsd:decimal ;
	ps:P1082 "+37439035"^^xsd:decimal .

<< wd:Q29 ps:P1082 "+30455000"^^xsd:decimal >> pq:P585 "1960-01-01T00:00:00Z"^^xsd:dateTime ; 
                                               pq:P585 "1970-01-01T00:00:00Z"^^xsd:dateTime .
<< wd:Q29 ps:P1082 "+37439035"^^xsd:decimal >> pq:P585 "1980-01-01T00:00:00Z"^^xsd:dateTime .
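
One way to check that both annotations ended up on the same triple is to query it with SPARQL*. This is a minimal sketch that assumes the data above is loaded in an RDF*-aware engine; the variable name is illustrative.

```sparql
PREFIX xsd: <http://www.w3.org/2001/XMLSchema#>
PREFIX wd: <http://www.wikidata.org/entity/>
PREFIX ps: <http://www.wikidata.org/prop/statement/>
PREFIX pq: <http://www.wikidata.org/prop/qualifier/>

# list every point_in_time annotation attached to the 30455000 triple
SELECT ?statementDate
WHERE {
  << wd:Q29 ps:P1082 "+30455000"^^xsd:decimal >> pq:P585 ?statementDate .
}
ORDER BY ?statementDate
```

With the data above, two rows would be expected, one for 1960 and one for 1970.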

Annotation between boundaries (from - to)

Suppose we want to state that the population of Spain did not change between 1960 and 1970 (unlikely, but then anyone can say anything, at any time).
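
One possible way to express such a range, offered here as an assumption rather than as an approach this page settles on, is to annotate the triple with a start and an end instead of a single point_in_time. The sketch borrows Wikidata's P580 (start time) and P582 (end time) as qualifiers. The assumed data is shown as a comment, and the query asks which population value covers a hypothetical date in 1965.

```sparql
PREFIX xsd: <http://www.w3.org/2001/XMLSchema#>
PREFIX wd: <http://www.wikidata.org/entity/>
PREFIX ps: <http://www.wikidata.org/prop/statement/>
PREFIX pq: <http://www.wikidata.org/prop/qualifier/>

# Assumed data (Turtle*), not part of the original example:
#   wd:Q29 ps:P1082 "+30455000"^^xsd:decimal .
#   << wd:Q29 ps:P1082 "+30455000"^^xsd:decimal >>
#       pq:P580 "1960-01-01T00:00:00Z"^^xsd:dateTime ;
#       pq:P582 "1970-01-01T00:00:00Z"^^xsd:dateTime .

# which population value covers the (hypothetical) date 1965-01-01?
SELECT ?population ?start ?end
WHERE {
  << wd:Q29 ps:P1082 ?population >> pq:P580 ?start ;
                                     pq:P582 ?end .
  FILTER (?start <= "1965-01-01T00:00:00Z"^^xsd:dateTime &&
          "1965-01-01T00:00:00Z"^^xsd:dateTime <= ?end)
}
```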


