I’m starting to work with the Jena Engine and I think I got a grasp of what semantics are.
However I’m having a hard time understanding the different ways to represent a bunch of triples in Jena and ARQ:
- The first thing you stumble upon when starting is `Model`, and the documentation says it’s Jena’s name for RDF graphs.
- However, there is also `Graph`, which seemed to be the necessary tool when I want to query a union of models. It does not seem to share a common interface with `Model`, although one can get the `Graph` out of a `Model`.
- Then there is `Dataset` in ARQ, which also seems to be a collection of triples of some sort.
Sure, after some looking around in the API, I found ways to somehow convert from one into another. However, I suspect there is more to it than three different interfaces for the same thing.
So, the question is: What are the key design differences between these three? When should I use which one? Especially: when I want to hold individual bunches of triples but query them as one big bunch (union), which of these data structures should I use (and why)?
Also, do I “lose” anything when “converting” from one into another (e.g. does `model.getGraph()` contain less information in some way than `model`)?
Jena is divided into an API, for application developers, and an SPI for systems developers, such as people making storage engines, reasoners etc.
`Dataset`, `Model`, `Statement`, `Resource` and `Literal` are API interfaces and provide many conveniences for application developers. `DatasetGraph`, `Graph`, `Triple` and `Node` are SPI interfaces. They’re pretty spartan and simple to implement (as you’d hope if you’ve got to implement the things).

The wide variety of API operations all resolve down to SPI calls. To give an example, the `Model` interface has four different `contains` methods. Internally each results in a `Graph#contains` call.
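To make the API/SPI split a bit more concrete, here is a minimal sketch that asks the same question at both levels. It assumes a recent Jena release with the `org.apache.jena` package names (older releases used `com.hp.hpl.jena`); the `http://example.org/` namespace and the class and method names are made up for illustration. It only shows that both levels give the same answer, not the exact internal call path.

```java
import org.apache.jena.graph.Graph;
import org.apache.jena.graph.Node;
import org.apache.jena.graph.NodeFactory;
import org.apache.jena.rdf.model.Model;
import org.apache.jena.rdf.model.ModelFactory;
import org.apache.jena.rdf.model.Property;
import org.apache.jena.rdf.model.Resource;

class ContainsDemo {
    // Hypothetical namespace, used only for this illustration.
    static final String NS = "http://example.org/";

    // Build a model holding the single triple: alice knows bob.
    static Model sampleModel() {
        Model model = ModelFactory.createDefaultModel();
        Resource alice = model.createResource(NS + "alice");
        Property knows = model.createProperty(NS + "knows");
        Resource bob = model.createResource(NS + "bob");
        alice.addProperty(knows, bob);
        return model;
    }

    // API level: Resource and Property objects, convenient overloads.
    static boolean apiContains(Model model) {
        Resource alice = model.createResource(NS + "alice");
        Property knows = model.createProperty(NS + "knows");
        return model.contains(alice, knows);
    }

    // SPI level: plain Nodes, with Node.ANY as the wildcard for the object.
    static boolean spiContains(Model model) {
        Graph graph = model.getGraph();
        Node s = NodeFactory.createURI(NS + "alice");
        Node p = NodeFactory.createURI(NS + "knows");
        return graph.contains(s, p, Node.ANY);
    }
}
```

The SPI version has a single three-argument shape and uses `Node.ANY` where the API offers a shorter overload, which is roughly what "spartan" means here.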
Concerning your question about losing information: with `Model` and `Graph` you don’t (as far as I recall). The more interesting case is `Resource` versus `Node`. `Resource`s know which model they belong to, so you can (in the API) write `resource.addProperty(...)`, which eventually becomes a `Graph#add`. `Node` has no such convenience and is not associated with a particular `Graph`. Hence `Resource#asNode` is lossy.
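A small sketch of the two conversions may help. It assumes the `org.apache.jena` package names of recent Jena releases; the namespace and the class and method names are hypothetical. The first method wraps a model's graph in a fresh `Model` and compares the two; the second shows that a `Node` only becomes a model-aware `RDFNode` again once you tell it which model to use.

```java
import org.apache.jena.graph.Graph;
import org.apache.jena.graph.Node;
import org.apache.jena.rdf.model.Model;
import org.apache.jena.rdf.model.ModelFactory;
import org.apache.jena.rdf.model.Property;
import org.apache.jena.rdf.model.RDFNode;
import org.apache.jena.rdf.model.Resource;

class ResourceVsNodeDemo {
    // Hypothetical namespace, used only for this illustration.
    static final String NS = "http://example.org/";

    // Model -> Graph -> Model: the wrapper sees the same triples.
    static boolean graphRoundTripKeepsTriples() {
        Model model = ModelFactory.createDefaultModel();
        Resource alice = model.createResource(NS + "alice");
        Property name = model.createProperty(NS + "name");
        alice.addProperty(name, "Alice"); // ends up as an add on the underlying graph

        Graph graph = model.getGraph();
        Model wrapped = ModelFactory.createModelForGraph(graph);
        return wrapped.isIsomorphicWith(model);
    }

    // Resource -> Node drops the model; getting back requires naming a model.
    static boolean nodeNeedsModelToComeBack() {
        Model model = ModelFactory.createDefaultModel();
        Resource alice = model.createResource(NS + "alice");

        Node node = alice.asNode();           // a bare Node carries no model reference
        RDFNode back = model.asRDFNode(node); // we have to say which model to attach it to
        return alice.getModel() == model && back.getModel() == model;
    }
}
```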
You’re clearly a normal user, so you want the API. You want to store triples, so use `Model`. Now you want to query the models as one union. You could:
- `Model#union()` everything, which will copy all the triples into a new model.
- `ModelFactory.createUnion()` everything, which will create a dynamic union (i.e. no copying).
- Use a dataset with the `unionDefaultGraph` option.

The last of these works best for large numbers of models, and for large models, but is a little more involved to set up.
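The difference between the first two options can be sketched like this. It assumes the `org.apache.jena` package names of recent Jena releases; the namespace, class name and helper methods are invented for the example. Each method adds a triple to one source model after the union has been built and then checks whether the union sees it.

```java
import org.apache.jena.rdf.model.Model;
import org.apache.jena.rdf.model.ModelFactory;

class UnionDemo {
    // Hypothetical namespace, used only for this illustration.
    static final String NS = "http://example.org/";

    // Hypothetical helper: add the triple (s, p, o) to the model.
    static void addTriple(Model m, String s, String p, String o) {
        m.add(m.createResource(NS + s), m.createProperty(NS + p), m.createResource(NS + o));
    }

    // Hypothetical helper: check whether the model contains the triple (s, p, o).
    static boolean hasTriple(Model m, String s, String p, String o) {
        return m.contains(m.createResource(NS + s), m.createProperty(NS + p), m.createResource(NS + o));
    }

    // Model#union copies: triples added to a source afterwards do not show up.
    static boolean copySeesLaterChange() {
        Model a = ModelFactory.createDefaultModel();
        Model b = ModelFactory.createDefaultModel();
        addTriple(a, "alice", "knows", "bob");
        Model copy = a.union(b);
        addTriple(b, "bob", "knows", "carol");
        return hasTriple(copy, "bob", "knows", "carol");
    }

    // ModelFactory.createUnion is a live view over both source models.
    static boolean dynamicSeesLaterChange() {
        Model a = ModelFactory.createDefaultModel();
        Model b = ModelFactory.createDefaultModel();
        addTriple(a, "alice", "knows", "bob");
        Model dynamic = ModelFactory.createUnion(a, b);
        addTriple(b, "bob", "knows", "carol");
        return hasTriple(dynamic, "bob", "knows", "carol");
    }
}
```

So if the individual models keep changing and you want queries to reflect that, the dynamic union is the one to reach for; the copy is a snapshot.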