I am quite comfortable using SQL but having an impossible time understanding SPARQL. For starters, I don’t even understand how to look at the structure of the data (in MySQL I would just do describe <table name>) so I can query the appropriate fields.
Is there a way for me to import an entire RDF dataset into respective tables in a MySQL database?
Barring that, is there a way to SELECT * from all the tables (or whatever the equivalent descriptor is) so that I can get all the output data into CSV and take it from there?
The RDF dataset I am trying to query has a SPARQL endpoint and even a guide on How to SPARQL but I am having a hard time understanding it.
For example:
PREFIX meannot: <http://rdf.myexperiment.org/ontologies/annotations/>
PREFIX sioc: <http://rdfs.org/sioc/ns#>
PREFIX rdf: <http://www.w3.org/1999/02/22-rdf-syntax-ns#>
PREFIX mebase: <http://rdf.myexperiment.org/ontologies/base/>
SELECT DISTINCT ?annotator_name
WHERE {
?comment mebase:annotates <http://www.myexperiment.org/workflows/52> .
?comment rdf:type meannot:Comment .
?comment mebase:has-annotator ?annotator
?annotator sioc:name ?annotator_name
}
makes little sense to me. Why is there a period at the end of some of the WHERE statements but not others? And what does ?comment mebase:has-annotator ?annotator mean in plain English? Select the annotator’s name where the annotator’s name is the annotator’s name? Huh?
I would be grateful for any resources that you could point me to.
Although SPARQL looks SQL-like in its syntax, how it functions is actually quite different, which is the problem you and many others run into when trying to learn it.
Pattern Matching
SPARQL is about triple pattern matching rather than selecting from tables as in SQL. Each set of three items in your example represents a triple pattern. So for example:
?comment rdf:type meannot:Comment .
This tells the SPARQL processor to find anything which has an rdf:type of meannot:Comment, i.e. things which are of type comment. In this pattern ?comment is a variable which acts like a wildcard; think of it as a field in SQL that you can select.
If we add an additional triple pattern that uses the same variable, then we are asking the SPARQL processor to find all things which match all of the triple patterns, so:
?comment rdf:type meannot:Comment .
?comment mebase:annotates <http://www.myexperiment.org/workflows/52> .
This finds things which are comments on a specific item.
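In SQL terms this would be like writing
SELECT commentID FROM COMMENTS WHERE itemID=1234
if that helps you understand it.
As we start adding in additional variables, you can think of that as doing joins with other tables:
?comment rdf:type meannot:Comment .
?comment mebase:annotates <http://www.myexperiment.org/workflows/52> .
?comment mebase:has-annotator ?annotator .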
This finds the things which are comments on a specific item, together with the users that made them.
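To see why a shared variable behaves like a join, here is a minimal sketch in plain Python of what a SPARQL processor conceptually does with a set of triple patterns. The toy triples, the short names such as workflow52 and user7, and the helper functions match_pattern and query are all made up for illustration; a real engine is far more sophisticated than this.

```python
# Toy data: (subject, predicate, object) triples. All names are invented.
TRIPLES = [
    ("comment1", "rdf:type", "meannot:Comment"),
    ("comment1", "mebase:annotates", "workflow52"),
    ("comment1", "mebase:has-annotator", "user7"),
    ("comment2", "rdf:type", "meannot:Comment"),
    ("comment2", "mebase:annotates", "workflow99"),
    ("comment2", "mebase:has-annotator", "user8"),
    ("rating1", "rdf:type", "meannot:Rating"),
    ("rating1", "mebase:annotates", "workflow52"),
]


def match_pattern(pattern, triple, binding):
    """Match one triple against one pattern, given the bindings so far.

    Returns the extended binding, or None if the triple does not fit.
    """
    new_binding = dict(binding)
    for term, value in zip(pattern, triple):
        if term.startswith("?"):
            # Variable: bind it, or check it agrees with an earlier binding.
            if new_binding.setdefault(term, value) != value:
                return None
        elif term != value:
            # Constant: must match exactly.
            return None
    return new_binding


def query(patterns, triples):
    """Return every variable binding that satisfies all of the patterns."""
    bindings = [{}]
    for pattern in patterns:
        bindings = [
            extended
            for binding in bindings
            for triple in triples
            if (extended := match_pattern(pattern, triple, binding)) is not None
        ]
    return bindings


results = query(
    [
        ("?comment", "rdf:type", "meannot:Comment"),
        ("?comment", "mebase:annotates", "workflow52"),
        ("?comment", "mebase:has-annotator", "?annotator"),
    ],
    TRIPLES,
)
print(results)  # [{'?comment': 'comment1', '?annotator': 'user7'}]
```

Because ?comment has to take the same value in every pattern, each extra pattern narrows the earlier matches, which is the same job a join condition does in SQL.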
In SQL, finding comments together with the users that made them would be roughly equivalent to
SELECT commentID, userID FROM COMMENTS C INNER JOIN USERS U ON C.userID=U.userID WHERE itemID=1234
Syntax Notes
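As far as the syntax goes, the . denotes the end of a triple pattern.
The fact that it is omitted in your example is actually an error on the part of the people publishing that how-to guide. Strictly, the . after the last pattern before the closing } is optional, so it is the one missing after ?comment mebase:has-annotator ?annotator that breaks the query. I happen to work at one of the universities involved in the project, so I have dropped a colleague a note asking them to fix this.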
What you may also see in examples is the use of ; at the end of a triple pattern. This is shorthand for repeating the subject, e.g.
?comment rdf:type meannot:Comment ;
         mebase:annotates <http://www.myexperiment.org/workflows/52> .
means that you don’t have to type out ?comment again for the subsequent pattern.
Similarly, , is used to repeat both the subject and the predicate: following rdf:type with two comma-separated classes would mean that ?comment and rdf:type are repeated for the second class. In plain English, that would be things which are of type comment and of type annotation.
Discovering the data structure
RDF is not stored in tables, since it is a schemaless data model. The closest thing to tables are named graphs, which are just a way to logically group sets of triples together.
Take a look at this question on exploratory SPARQL queries for some suggestions on queries to try.
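Since there is no table to describe, the nearest equivalent is asking which types and predicates actually occur in the data. The sketch below does this in plain Python over a handful of invented triples, with the roughly corresponding SPARQL query noted in a comment above each function. The data, the sioc:User class and the function names are assumptions made for illustration, not something taken from the dataset in the question.

```python
from collections import Counter

# Toy (subject, predicate, object) triples; every name here is invented.
TRIPLES = [
    ("comment1", "rdf:type", "meannot:Comment"),
    ("comment1", "mebase:annotates", "workflow52"),
    ("comment1", "mebase:has-annotator", "user7"),
    ("user7", "rdf:type", "sioc:User"),
    ("user7", "sioc:name", "Alice"),
]


# Roughly: SELECT ?p (COUNT(*) AS ?n) WHERE { ?s ?p ?o } GROUP BY ?p
def predicate_counts(triples):
    """Count how often each predicate is used."""
    return Counter(p for _, p, _ in triples)


# Roughly: SELECT DISTINCT ?type WHERE { ?s rdf:type ?type }
def types_in_use(triples):
    """List the classes that things are declared to be instances of."""
    return sorted({o for _, p, o in triples if p == "rdf:type"})


# Roughly: SELECT DISTINCT ?p WHERE { ?s rdf:type <SomeType> . ?s ?p ?o }
def predicates_for_type(triples, rdf_type):
    """List the predicates used on instances of one class."""
    subjects = {s for s, p, o in triples if p == "rdf:type" and o == rdf_type}
    return sorted({p for s, p, _ in triples if s in subjects})


print(types_in_use(TRIPLES))
print(predicates_for_type(TRIPLES, "meannot:Comment"))
```

The last function is the closest thing to describe <table name>: it shows which "columns" exist for one kind of thing, with the caveat that nothing forces every instance to have all of them.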
If you just want to select everything you can do
SELECT * WHERE { ?s ?p ?o }
but beware that many endpoints impose a limit on the number of results for one query, so even if the endpoint has millions of triples behind it you may get only a few thousand back. You can page through results using LIMIT and OFFSET, e.g.
SELECT * WHERE { ?s ?p ?o } LIMIT 1000 OFFSET 1000
If you just want to get all the data to trawl through, try looking around on the site to see if they offer an RDF dump, which will typically be a zipped archive with a bunch of RDF files in it. This will let you look at the data locally.
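If you do want to pull everything into CSV through the endpoint, paging can be automated. The following is a rough sketch using only the Python standard library. It assumes the endpoint accepts the query as a query parameter on a GET request and can return results as text/csv; not every endpoint does, so check its documentation. The function names, the page size and the example.org URL are placeholders of mine.

```python
import csv
import io
import urllib.parse
import urllib.request


def paged_query(base_query, page, page_size=1000):
    """Append LIMIT/OFFSET for the given zero-based page number."""
    return f"{base_query} LIMIT {page_size} OFFSET {page * page_size}"


def fetch_csv(endpoint, query, timeout=30):
    """Send one query to a SPARQL endpoint, asking for CSV results."""
    url = endpoint + "?" + urllib.parse.urlencode({"query": query})
    request = urllib.request.Request(url, headers={"Accept": "text/csv"})
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return response.read().decode("utf-8")


def dump_to_csv(endpoint, base_query, out_path,
                page_size=1000, max_pages=100, fetch=fetch_csv):
    """Page through the results and write them all to one CSV file.

    Returns the number of data rows written. Stops at the first empty page
    or after max_pages, whichever comes first.
    """
    total = 0
    with open(out_path, "w", encoding="utf-8", newline="") as out:
        writer = csv.writer(out)
        for page in range(max_pages):
            text = fetch(endpoint, paged_query(base_query, page, page_size))
            rows = list(csv.reader(io.StringIO(text)))
            header, data = rows[:1], rows[1:]
            if not data:
                break  # Only a header (or nothing) came back: no more results.
            if page == 0:
                writer.writerows(header)  # Keep the header from the first page only.
            writer.writerows(data)
            total += len(data)
    return total


# Hypothetical usage; replace the URL with the real endpoint address:
# dump_to_csv("http://example.org/sparql",
#             "SELECT * WHERE { ?s ?p ?o }", "triples.csv")
```

One caveat: without an ORDER BY the order of results is not guaranteed to be stable between requests, so pages may overlap or miss rows on some stores. Adding an ORDER BY to the base query fixes that but can be slow on large datasets.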
Putting RDF into SQL tables
There are systems that will let you store RDF in SQL-based databases, but take it from someone who’s worked with a large variety of triple stores: this is nowhere near as performant as using a native triple store.
You may be interested in R2RML, which is a new W3C standard (currently in early working draft) that defines a standard way to map relational data to RDF. Some of their documentation may help you better understand the relationship between RDF/SPARQL and SQL.
Tutorials
For a fuller tutorial I’d check out SPARQL by Example, which is by one of the authors of the SPARQL specification and is highly recommended.