I’m generating some RDF files with Jena. The whole application works with utf-8

Asked: June 12, 2026 (2026-06-12T04:43:53+00:00)

I’m generating some RDF files with Jena. The whole application works with UTF-8 text, and the source code is stored in UTF-8 as well.

When I print a string containing non-English characters on the console, I get the right output, e.g. Est un lieu généralement officielle assis....

Then, I use the RDF writer to output the file:

Model m = loadMyModelWithMultipleLanguages();
log.info( getSomeStringFromModel(m) ); // log4j, correct output
RDFWriter w = m.getWriter( "RDF/XML" ); // default enc: utf-8
w.setProperty( "showXmlDeclaration", "true" ); // optional
OutputStream out = new FileOutputStream(pathToFile);
w.write( m, out, "http://someurl.org/base/" );
// file contains garbled text

The RDF file starts with: <?xml version="1.0"?>. If I add encoding="utf-8" to the declaration, nothing changes.

By default the text should be encoded to utf-8.
The resulting RDF file validates ok, but when I open it with any editor/visualiser (vim, Firefox, etc.), non-English text is all messed up: Est un lieu g√©n√©ralement officielle assis ... or Est un lieu g\u221A\u00A9n\u221A\u00A9ralement officielle assis....
(Either way, this is obviously not acceptable from the user’s viewpoint).
The same issue happens with any output format supported by Jena (RDF, NT, etc.).

I can’t find a logical explanation for this.
The official documentation doesn’t seem to address the issue.

Any hint or tests I can run to figure it out?

1 Answer

Editorial Team · Answer 1 · 2026-06-12T04:43:54+00:00

My guess would be that your strings are messed up, and your printStringFromModel() method just happens to output them in a way that accidentally makes them display correctly, but it’s rather hard to say without more information.

You’re instructing Jena to include an XML declaration in the RDF/XML file, but don’t say what encoding (if any) Jena declares in the XML declaration. This would be helpful to know.
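One rough way to check what the declaration says, sketched in plain Java without Jena. The class and method names here (XmlDeclEncoding, declaredEncoding) are made up for illustration, and the pattern is only a quick probe, not a real XML parser: it assumes the file starts directly with the declaration (no byte order mark) and is stored in an ASCII-compatible encoding.

```java
import java.nio.charset.StandardCharsets;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class XmlDeclEncoding {
    // Matches an encoding pseudo-attribute inside a leading <?xml ... ?> declaration.
    private static final Pattern ENCODING =
        Pattern.compile("^<\\?xml[^>]*\\bencoding\\s*=\\s*[\"']([^\"']+)[\"']");

    // Returns the encoding named in the XML declaration, or null if none is declared.
    public static String declaredEncoding(byte[] fileStart) {
        // The declaration itself is ASCII, so ISO-8859-1 decodes it byte for byte.
        String head = new String(fileStart, StandardCharsets.ISO_8859_1);
        Matcher m = ENCODING.matcher(head);
        return m.find() ? m.group(1) : null;
    }

    public static void main(String[] args) {
        byte[] noEnc = "<?xml version=\"1.0\"?>".getBytes(StandardCharsets.US_ASCII);
        byte[] withEnc = "<?xml version=\"1.0\" encoding=\"utf-8\"?>"
            .getBytes(StandardCharsets.US_ASCII);
        System.out.println(declaredEncoding(noEnc));   // null
        System.out.println(declaredEncoding(withEnc)); // utf-8
    }
}
```

For a real file you would pass in the first few hundred bytes of the output. A declaration without an encoding, like the one quoted in the question, comes back as null, in which case an XML parser falls back to UTF-8 (or UTF-16 if it sees a byte order mark).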

You’re also not showing how you’re printing the strings in the printStringFromModel() method.

Also, in Firefox, go to the View menu and then to Character Encoding. What encoding is selected? If it’s not UTF-8, then what happens when you select UTF-8? Do you get it to show things correctly when selecting some other encoding?
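If you would rather not trust any viewer, you can look at the raw bytes yourself. Below is a minimal sketch of that idea in plain Java. The names EncodingProbe, isValidUtf8 and toHex are hypothetical helpers for this post, not part of Jena. The demo uses in-memory bytes, so substitute the bytes of your generated file.

```java
import java.nio.ByteBuffer;
import java.nio.charset.CharacterCodingException;
import java.nio.charset.CodingErrorAction;
import java.nio.charset.StandardCharsets;

class EncodingProbe {
    // True if the bytes decode as strict UTF-8 with no malformed sequences.
    public static boolean isValidUtf8(byte[] bytes) {
        try {
            StandardCharsets.UTF_8.newDecoder()
                .onMalformedInput(CodingErrorAction.REPORT)
                .onUnmappableCharacter(CodingErrorAction.REPORT)
                .decode(ByteBuffer.wrap(bytes));
            return true;
        } catch (CharacterCodingException e) {
            return false;
        }
    }

    // Space-separated hex view of the raw bytes.
    public static String toHex(byte[] bytes) {
        StringBuilder sb = new StringBuilder();
        for (byte b : bytes) {
            sb.append(String.format("%02X ", b));
        }
        return sb.toString().trim();
    }

    public static void main(String[] args) {
        // "\u00e9" is é, written as an escape so the source file encoding does not matter.
        byte[] utf8 = "\u00e9".getBytes(StandardCharsets.UTF_8);
        byte[] latin1 = "\u00e9".getBytes(StandardCharsets.ISO_8859_1);
        System.out.println(toHex(utf8) + " valid=" + isValidUtf8(utf8));     // C3 A9 valid=true
        System.out.println(toHex(latin1) + " valid=" + isValidUtf8(latin1)); // E9 valid=false
    }
}
```

To run this against the actual output, read it with Files.readAllBytes and find the bytes for one é. C3 A9 is what correct UTF-8 looks like, so the file is fine and the viewer is decoding it wrongly. A longer sequence such as C3 83 C2 A9 is still valid UTF-8, but it suggests the text was already mis-decoded before it was written.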

Edit: The snippet you show in your post looks fine and should work. My best guess is that the code that reads your source strings into a Jena model is broken and reads the UTF-8 source as ISO-8859-1 or something similar. In that case each byte of a multi-byte character ends up as a separate character in the string. You should be able to confirm or rule that out by checking the length() of one of the offending strings: if each troublesome character like é is counted as two, the error is on reading; if it is correctly counted as one, the error is on writing.
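To illustrate the length() check, here is a small sketch that simulates the suspected bug by decoding UTF-8 bytes as ISO-8859-1. LengthCheck and misread are names invented for this example. It only reproduces the symptom I'm guessing at and says nothing about how your loader actually reads its input.

```java
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

class LengthCheck {
    // Simulates the suspected bug: UTF-8 bytes decoded with the wrong charset.
    public static String misread(String text, Charset wrongCharset) {
        byte[] utf8Bytes = text.getBytes(StandardCharsets.UTF_8);
        return new String(utf8Bytes, wrongCharset);
    }

    public static void main(String[] args) {
        // "généralement", with é written as \u00e9 so the source encoding is irrelevant.
        String correct = "g\u00e9n\u00e9ralement";
        String broken = misread(correct, StandardCharsets.ISO_8859_1);
        System.out.println(correct.length()); // 12
        System.out.println(broken.length());  // 14: each é turned into two chars
    }
}
```

In your application, call length() on a string taken straight from the model, before any writer is involved. If a word you know to be 12 characters long reports 14, the damage happened on reading.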
