I have a large XML database (30,000 files, 1.3 GB). One file in this database lists all the other files present in the database. My aim is “simply” to check whether all the listed files are present in the database. However, I must not rely on the file names, only on the XML content inside the documents.
It looks something like this:
declare variable $root := fn:collection();

declare function local:isValid($fileCode) {
  let $fileSearchedIdentCode := $root/dmodule/identity/dmCode
  return
    $fileCode/@attribute1 = $fileSearchedIdentCode/@attribute1 and
    $fileCode/@attribute2 = $fileSearchedIdentCode/@attribute2 and
    $fileCode/@attribute3 = $fileSearchedIdentCode/@attribute3
};

<result>
{
  for $fileCode in $root/file[identity/@fileType eq 'listOfFiles']/fileContent/fileEntry/fileCode
  return
    if (local:isValid($fileCode))
    then <filePresent>1</filePresent>
    else <fileNonPresent>2</fileNonPresent>
}
</result>
The code above runs fine on a small database, but on mine it takes an incredible amount of time.
So I wonder if someone can help me improve the code so that it executes in a reasonable time 😉
(My database is indexed)
Thanks for your help !!
Johann
It seems that the attribute index isn’t applied to the attribute checks in the
local:isValid function. You can achieve that by rewriting them as XPath predicates. After these changes, the Query Info view in BaseX tells me that the index is used,
and the evaluation time drops from 4,500 ms to ~20 ms for my test data.
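A minimal sketch of what such a predicate-based rewrite might look like, assuming the same element and attribute names as in the question. It is an illustration only and not necessarily the exact code behind the timings above. The type annotations (`element()`, `xs:boolean`) are additions for clarity.

```xquery
declare variable $root := fn:collection();

declare function local:isValid($fileCode as element()) as xs:boolean {
  (: Filter the dmCode elements with predicates instead of comparing
     against the whole sequence; this form gives the optimizer a chance
     to rewrite the attribute comparisons to index lookups. :)
  exists(
    $root/dmodule/identity/dmCode
      [@attribute1 = $fileCode/@attribute1]
      [@attribute2 = $fileCode/@attribute2]
      [@attribute3 = $fileCode/@attribute3]
  )
};

<result>
{
  for $fileCode in $root/file[identity/@fileType eq 'listOfFiles']/fileContent/fileEntry/fileCode
  return
    if (local:isValid($fileCode))
    then <filePresent>1</filePresent>
    else <fileNonPresent>2</fileNonPresent>
}
</result>
```

Note that this is slightly stricter than the original function: there, each `=` is a general comparison against all dmCode elements, so the three attributes could match three different documents. With predicates, all three attributes must match on the same dmCode element, which is probably what is intended. Whether the index is actually applied may depend on the BaseX version and on how the query is optimized, so it is worth checking the Query Info output.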