I have an XML document collection, an inverted file indexer, and a command-line tool

Question

Asked: May 27, 2026

I have an XML document collection, an inverted file indexer, and a command-line tool for searching the index (or indices) produced by the indexer. Note that the search tool returns a list of document IDs and various statistics about each document (rankings according to various functions, term hits, etc.) rather than the actual document text. Both programs were written in plain C (by me).

The collection is not huge (~1GB).
The index is about 10-20% of the collection size.
This is not, and never will be, intended for public use (using it will require logging in).
It needs to run with client-side scripting totally disabled.

I’d like to whip up a simple web frontend that would allow me to query the index with a search term or terms and present the results appropriately, but it’s been a while since I touched any web stuff.

I want to see more or less the same info a query returns at the moment, but I’m not sure whether to write something (e.g. PHP, Ruby – alternative suggestions are welcome) that calls my command-line query program and processes the output, or whether re-implementing the query program would be more appropriate.

Are there any distinct advantages one has over the other? Security risks?
And can anyone recommend me a lightweight framework or library appropriate for any of this? (Like I said, haven’t touched web stuff in a while.)

Should I call the CLI query program? Why or why not?

(=/ I hope I’m not being too vague… do tell me if I should be asking this in a different manner.)

1 Answer

Editorial Team · Answer 1 · 2026-05-27T09:16:54+00:00

For something simple like this, I would use PHP and an Apache server. Why?

It doesn’t require a web framework to interface with Apache; less complexity means less time spent configuring. You could just install Apache and the PHP module, drop the following file into your web root, and point an HTML form (submitted with POST) at http://127.0.0.1/indexer.php with two text fields named "name" and "author":

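<?php
// The two POST fields the form must supply.
$required_terms = array("name", "author");

foreach ($required_terms as $value) {
    if (!isset($_POST[$value]) || !is_string($_POST[$value])) {
        printf("The search term \"%s\" was missing", $value);
        exit;
    }
}

// escapeshellarg() quotes each value so it reaches the program as a single
// argument and cannot inject extra shell commands.
$terminal_command = sprintf(
    "/usr/bin/indexer -n %s -a %s",
    escapeshellarg($_POST["name"]),
    escapeshellarg($_POST["author"])
);
// shell_exec() returns all of stdout; exec() would return only its last line.
echo "<pre>", htmlspecialchars((string) shell_exec($terminal_command)), "</pre>";

(Note: this is only meant to show how little code is needed. The values are shell-escaped here, but you should still validate the POST values you receive, e.g. reject empty or overlong terms, before running the query.)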

This runs your application with the two values as arguments and prints what it wrote to stdout. There is nothing else to set up, so you could be up and running in a couple of minutes.
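If you want to process the output rather than print it as is, something along these lines might do. It is only a sketch: `run_query` and `render_lines` are hypothetical helper names, and the binary path and flags you pass in are whatever your own tool expects. It relies on the fact that PHP's `exec()` only returns the last line of output unless you hand it an array to fill.

```php
<?php
// Hypothetical helper: run a command, return its exit status and output lines.
// $binary and $args are placeholders for your own program and its arguments.
function run_query(string $binary, array $args): array
{
    // Quote the binary and every argument so each one reaches the program intact.
    $command = escapeshellarg($binary);
    foreach ($args as $arg) {
        $command .= ' ' . escapeshellarg($arg);
    }

    $lines = [];
    $status = 0;
    // With the extra parameters, exec() appends every line of stdout to $lines
    // and stores the program's exit code in $status.
    exec($command, $lines, $status);

    return ['status' => $status, 'lines' => $lines];
}

// Hypothetical helper: turn the captured lines into an escaped HTML list.
function render_lines(array $lines): string
{
    $html = "<ul>\n";
    foreach ($lines as $line) {
        $html .= '  <li>' . htmlspecialchars($line) . "</li>\n";
    }
    return $html . "</ul>\n";
}
```

You would then call something like `run_query('/usr/bin/indexer', ['-n', $name, '-a', $author])`, check `status` for a non-zero exit code, and split each line into columns according to whatever format your query tool prints.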

So the main reason is that it is simple and fast to set up, which suits something small and internal like this.
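The validation mentioned in the note above could start as small as the following. This is a sketch under assumptions: `clean_term` is a made-up name, and the 200-character limit is arbitrary, so adjust both to what your query tool actually accepts.

```php
<?php
// Hypothetical validator: returns the cleaned term, or null if it is unusable.
// The default length limit is an arbitrary assumption.
function clean_term($raw, int $max_length = 200): ?string
{
    // A missing field arrives as null; "name[]=x" arrives as an array.
    if (!is_string($raw)) {
        return null;
    }
    $term = trim($raw);
    if ($term === '' || strlen($term) > $max_length) {
        return null;
    }
    // Reject control characters such as newlines or NUL bytes.
    if (preg_match('/[\x00-\x1F\x7F]/', $term) === 1) {
        return null;
    }
    return $term;
}
```

In the script you might write `$name = clean_term($_POST['name'] ?? null);` and show an error page when the result is `null`. This complements shell escaping; it does not replace it.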
