I’m a beginner in xquery and I hope you can help me with an

Editorial Team

Asked: June 12, 2026 (2026-06-12T05:41:32+00:00)

I’m a beginner in XQuery and I hope you can help me with an easy explanation. I’m using BaseX 7.0.1.

I have a dictionary.xml file that looks like this:

<doc>
    <entry>
        <vedette>je</vedette>
        <variante>je</variante>
        <variante>j'</variante>
        <partiedudiscours>pronom</partiedudiscours>
    </entry>
</doc>

I also have a malone_fr.xml file containing the text that I’d like to annotate. It looks like this:

<doc>
    L’Opportunité 
    Par : Walter Malone (1866-1915)
    Ils ont mal conclu ceux qui disent que je ne reviendrai plus
    Quand une fois j’ai frappé à ta porte et ne t’ai pas rencontré,
</doc>

I’d like to compare the content of the <variante> elements in dictionary.xml with the words of my text, and mark up each matching word with the content of <partiedudiscours>.
So far, I’ve been able to do that with this code:

let $comp := data(for $j in tokenize(for $i in db:open('malone_fr')/doc return $i,"\n") 
return tokenize($j," "))
for $aa in $comp
return
for $lemme in db:open('dictionnaire')/doc/entry
return
let $oldName :=$aa
return
if ($oldName= $lemme/variante)
then 
let $newName := element  {$lemme/partiedudiscours}  {$aa}
return
for $bb in $comp
return
if ($bb=$oldName)
then $newName 
else ($bb)
else ()

That gives me the following result:
[first iteration]

L’Opportunité  Par : Walter Malone (1866-1915) Ils<verbe>ont</verbe> mal conclu ceux qui disent que je ne reviendrai plus

[second iteration]

L’Opportunité  Par : Walter Malone (1866-1915) <pronom>Ils</pronom>ont mal conclu ceux qui disent que je ne reviendrai plus

As you can see, each iteration annotates only one word, whereas I need a single result with the whole text annotated, like this:

L’Opportunité  Par : Walter Malone (1866-1915) <pronom>Ils</pronom><verbe>ont</verbe> <adverbe>mal</adverbe> <verbe>conclu</verbe>

Etc.
I don’t know how to restructure the for-loop to do that.

Thanks in advance.

1 Answer

Editorial Team · Answer 1 · 2026-06-12T05:41:33+00:00

I think your solution is a little more complicated than it needs to be. You should be able to do this in one loop. Using XPath to perform the lookup – instead of explicitly looping over all the values in your dictionary – will allow your database to optimize for faster retrieval of the dictionary data.

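(: Bind the dictionary root once; the database name is the one used in the question. :)
let $dict := db:open('dictionnaire')/doc
(: normalize-space() collapses runs of whitespace, so no empty tokens are produced. :)
let $toks :=
    for $i in db:open('malone_fr')/doc
    return tokenize(normalize-space($i), " ")
for $t in $toks
return
    (: Predicate lookup; [1] keeps the first entry if several match. :)
    let $e := $dict/entry[variante = $t][1]
    return
        if ($e)
        then (element { $e/partiedudiscours } { $t }, text{" "})
        else ($t, text{" "})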

Also, tokenize() discards the separators, so your output sequence contains no spaces. It only appears spaced because serializers typically put a space between adjacent atomic values; as you can see from your test output, no space is rendered next to elements. In the above solution I added very basic space handling so that elements are also correctly spaced. You can remove the text{" "} nodes if they are not needed.
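If you want to try the idea without touching your databases, here is a minimal, self-contained sketch of the same single loop. The inline $dict and $text values are stand-ins I made up from the samples in the question (they are not your real data), and it assumes each entry has exactly one <partiedudiscours>:

```xquery
(: Inline stand-ins for the two databases, so this runs without db:open(). :)
let $dict :=
  <doc>
    <entry>
      <vedette>je</vedette>
      <variante>je</variante>
      <variante>j'</variante>
      <partiedudiscours>pronom</partiedudiscours>
    </entry>
  </doc>
let $text := <doc>Ils disent que je ne reviendrai plus</doc>
(: normalize-space() trims the text and collapses whitespace runs,
   so splitting on a single space yields no empty tokens. :)
for $t in tokenize(normalize-space($text), " ")
(: Look the token up with a predicate; [1] keeps the first matching entry. :)
let $e := $dict/entry[variante = $t][1]
return
  if ($e)
  (: Wrap the token in an element named after its part of speech. :)
  then (element { string($e/partiedudiscours) } { $t }, text { " " })
  (: No match: emit the token unchanged, followed by a space. :)
  else text { concat($t, " ") }
```

With this sample data only the token "je" has a dictionary entry, so it should be the only word wrapped in <pronom>; every other word passes through as plain text. Emitting text nodes in both branches keeps the spacing under your control rather than leaving it to the serializer.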

Update: added @DennisKnochenwefel’s suggestion
