Appreciating the SPARQL property path slash character more
Querying for labels and more.
I’ve understood SPARQL’s property path features well enough to demo them in the “Searching Further in the Data” section of my book Learning SPARQL. (See example files ex074 - ex085.) To be honest, I have very rarely used them in actual queries that I’ve written. I’ve only just realized how the property path slash operator can help with a pattern that I have used in a large percentage of my queries. It makes these queries more concise and removes at least one variable that would not have been in my SELECT statement anyway.
As an example, here is some very simple data about three people and who follows who on social media:
@prefix d: <http://learningsparql.com/ns/data#> .
@prefix schema: <http://schema.org/> .
d:i0432 d:name "Richard Mutt" .
d:i9771 d:name "Cindy Marshall" .
d:i8301 d:name "Craig Ellis" .
d:i0432 schema:follows d:i9771, d:i8301 .
If I want to list who Richard follows, I want to list their actual names, not their URIs. This would be an obvious query to do that:
PREFIX d: <http://learningsparql.com/ns/data#>
PREFIX schema: <http://schema.org/>
SELECT ?name WHERE {
  ?follower d:name "Richard Mutt" ;
            schema:follows ?person .
  ?person d:name ?name .
}
It finds the URIs of the people that Richard follows, stores them in the ?person variable, and then finds the d:name value of each of those people. Having a query find resources that meet a certain condition and then using another triple pattern to get the human-readable names of those resources (and then using those names in the SELECT statement) is extremely common in SPARQL.
The property path slash character lets me do the same thing with no need for the ?person variable in the previous query. This next query asks, for each resource that Richard follows, what their name is:
PREFIX d: <http://learningsparql.com/ns/data#>
PREFIX schema: <http://schema.org/>
SELECT ?name WHERE {
  ?follower d:name "Richard Mutt" ;
            # For each followed resource, what is its name?
            schema:follows/d:name ?name .
}
In graph terms, we store the URI of Richard Mutt’s node in the ?follower variable, then traverse schema:follows graph edges to any nodes that then have a d:name edge, and then we store each value that the d:name edge leads to in the ?name variable.
I don’t think that it’s intuitively very readable, which is why I added the comment in the query, but perhaps as I use this more I will get used to it. (Note also that the comment doesn’t ask “What is the name of each followed resource?”; I wanted it to reflect the syntax it describes a little more closely.)
This is such a common pattern that I wanted to show some examples from more real-life contexts. The following query asks Wikidata for the names of the members of Daft Punk. It does this by storing the URI representing each member of the group in the ?member variable, and it then asks for the rdfs:label value of each, filtered to only show the English representation. (You can execute this query with the Wikidata Query Service yourself.)
PREFIX wd: <http://www.wikidata.org/entity/>
SELECT ?name WHERE {
  wd:Q185828 wdt:P527 ?member .
  ?member rdfs:label ?name .
  FILTER(lang(?name) = "en")
}
But we don’t need that ?member variable and second triple pattern! We can just do this:
PREFIX wd: <http://www.wikidata.org/entity/>
SELECT ?name WHERE {
  # For each member of Daft Punk, what is their name?
  wd:Q185828 wdt:P527/rdfs:label ?name .
  FILTER(lang(?name) = "en")
}
Run this second query and you will see the same results as the query before it.
I could do this with something besides names, such as their birth dates, but a list of dates with no context about what resources they describe isn’t very helpful. (Using it for names also just happens to build on a theme of recent entries in my blog, Human-readable names in RDF and Querying for labels.)
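For completeness, here is a rough sketch of what the birth date version might look like. It assumes that wdt:P569 is Wikidata's "date of birth" property and that each member's entry has a value for it. The wdt: prefix is declared explicitly here, although the Wikidata Query Service predefines it.

```sparql
PREFIX wd:  <http://www.wikidata.org/entity/>
PREFIX wdt: <http://www.wikidata.org/prop/direct/>
SELECT ?birthDate WHERE {
  # For each member of Daft Punk, what is their date of birth?
  # (wdt:P569 as "date of birth" is an assumption of this sketch.)
  wd:Q185828 wdt:P527/wdt:P569 ?birthDate .
}
```

Because the path leaves no variable bound to the member, the dates come back with nothing to say whose they are, which is why names make a better demo.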
As another example, I was going to create a query for the Rhizome Artbase SPARQL endpoint that I wrote about in Generating websites with SPARQL and Snowman, part 1. Then, I realized that I could use a query that was already in that blog entry, which you can run yourself:
PREFIX rt: <https://artbase.rhizome.org/prop/direct/>
SELECT DISTINCT ?artistName WHERE {
  ?artwork rt:P29 ?artist .
  ?artist rdfs:label ?artistName .
}
ORDER BY (?artistName)
LIMIT 250
This time, we’ll remove the ?artist variable from the end of the first triple pattern and the beginning of the second and create a property path out of rt:P29 and rdfs:label:
PREFIX rt: <https://artbase.rhizome.org/prop/direct/>
SELECT DISTINCT ?artistName WHERE {
  ?artwork rt:P29/rdfs:label ?artistName .
}
ORDER BY (?artistName)
LIMIT 250
Run this one and you’ll see the same result as the previous query.
PREFIX rt: <https://artbase.rhizome.org/prop/direct/>
SELECT * WHERE {
  ?artist rdfs:label "Jessica Gomula"@en .
  ?artwork rt:P29 ?artist .
  ?artwork rdfs:label ?name .
}
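The query above, which finds the artworks whose rt:P29 value has a given label, has the same shape, with the literal at the end of the chain instead of a variable. The following is only a sketch of how the slash might collapse it. It assumes the endpoint treats the path as equivalent to the two triple patterns it replaces, and it declares rdfs: explicitly even though the earlier queries rely on the endpoint predefining it.

```sparql
PREFIX rt:   <https://artbase.rhizome.org/prop/direct/>
PREFIX rdfs: <http://www.w3.org/2000/01/rdf-schema#>
SELECT ?artwork ?name WHERE {
  # For each artwork whose artist is named "Jessica Gomula", what is its name?
  ?artwork rt:P29/rdfs:label "Jessica Gomula"@en ;
           rdfs:label ?name .
}
```

Unlike the SELECT * version, this one has no ?artist variable to return, so it only makes sense when the artist's URI isn't needed in the results.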
Has anyone else found a particular property path pattern to be worth using in a high percentage of their SPARQL queries?
Comments? Reply to my tweet (or even better, my Mastodon message) announcing this blog entry.