Fooling around more with the Perl Plucene module and, having created my index, I

Asked: June 6, 2026 (2026-06-06T12:41:38+00:00)

Fooling around more with the Perl Plucene module and, having created my index, I am now trying to search it and return results.

My code to create the index is here…chances are you can skip this and read on:

#!/usr/bin/perl
use Plucene::Document;
use Plucene::Document::Field;
use Plucene::Index::Writer;
use Plucene::Analysis::SimpleAnalyzer;
use Plucene::Search::HitCollector;
use Plucene::Search::IndexSearcher;
use Plucene::QueryParser;
use Try::Tiny;
my $content = $ARGV[0];
my $doc = Plucene::Document->new;
my $i=0;
$doc->add(Plucene::Document::Field->Text(content => $content));
my $analyzer = Plucene::Analysis::SimpleAnalyzer->new();

if (!(-d "solutions" )) {
        $i = 1;
}

if ($i)
{
    my $writer = Plucene::Index::Writer->new("solutions", $analyzer, 1); #Third param is 1 if creating new index, 0 if adding to existing
    $writer->add_document($doc);
    my $doc_count = $writer->doc_count;
    undef $writer; # close
}
else
{
    my $writer = Plucene::Index::Writer->new("solutions", $analyzer, 0);
    $writer->add_document($doc);
    my $doc_count = $writer->doc_count;
    undef $writer; # close
}

It creates a folder called “solutions” and writes various files to it; I’m assuming these are the index files for the doc I created. Now I’d like to search my index, but I’m not coming up with anything. Here is my attempt, guided by the Plucene::Simple examples on CPAN. This is after I ran the above with the param “lol” from the command line.

#!/usr/bin/perl

  use Plucene::Simple;

  my $plucy = Plucene::Simple->open("solutions");
  my @ids = $plucy->search("content : lol"); 
  foreach(@ids)
  {
    print $_;
  }

Nothing is printed, sadly )-=. I feel like querying the index should be simple, but perhaps my own stupidity is limiting my ability to do this.

1 Answer

Editorial Team · Answer 1 · 2026-06-06T12:41:39+00:00

Three things I discovered in time:

1. Plucene is a grossly inefficient proof-of-concept and the Java implementation of Lucene is BY FAR the way to go if you are going to use this tool. Here is some proof: http://www.kinosearch.com/kinosearch/benchmarks.html
2. Lucy is a superior choice that does the same thing and has more documentation and community (as per the comment on the question).
3. How to do what I asked in this problem.

I will share two scripts: one to import a file into a new Plucene index and one to search through that index and retrieve it. A truly working example of Plucene is hard to find on the Internet. I also had tremendous trouble installing these modules through CPAN, so I ended up going to the CPAN site (just Google), downloading the tarballs, and putting them in my Perl lib myself (I’m on Strawberry Perl, Windows 7), however haphazard. Then I would try to run them and install from CPAN whatever dependencies they cried for. This is a sloppy way to do things, but it’s how I did it and now it works.

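#!/usr/bin/perl
use strict;
use warnings;
use Plucene::Simple;

# Usage: perl <this script> <content> <id>
my ($content_1, $content_2) = @ARGV;
die "Usage: $0 <content> <id>\n" unless defined $content_2;

# One document: the id (second argument) maps to a hash of
# field name => field text (first argument).
my %documents = (
    $content_2 => { content => $content_1 },
);

print "$content_1\n";

# Open the index in the "solutions" folder and add every document to it.
my $index = Plucene::Simple->open("solutions");
for my $id (keys %documents) {
    $index->add($id => $documents{$id});
}
$index->optimize;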

So what does this do? You call the script with two command-line arguments of your choosing, and it creates one document of the form “second argument” => { content => “first argument” }. Think of this like the XMLs in the tutorial at the Apache site (http://lucene.apache.org/solr/api/doc-files/tutorial.html). The second argument is the document id, and the first argument is the text stored under the field name content.
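
To make that shape concrete, here is a minimal sketch in plain Perl with no Plucene calls at all. The build_documents helper and the ticket ids are made up for illustration; the only thing taken from the script above is the id => { content => text } layout that gets handed to add.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Hypothetical helper (not part of Plucene): turn a flat list of
# (id, content, id, content, ...) into the id => { content => text }
# shape that the indexing script passes to add().
sub build_documents {
    my @pairs = @_;
    die "Expected an even number of arguments\n" if @pairs % 2;

    my %documents;
    while (my ($id, $content) = splice(@pairs, 0, 2)) {
        $documents{$id} = { content => $content };
    }
    return %documents;
}

# Made-up sample data, purely for illustration.
my %documents = build_documents(
    'ticket-1' => 'restart the print spooler',
    'ticket-2' => 'clear the browser cache',
);

# Show each document the way the indexing loop would see it.
for my $id (sort keys %documents) {
    print "$id => { content => '$documents{$id}{content}' }\n";
}
```

Each key of the hash would become one call to add in the loop of the indexing script, so indexing more documents is just a matter of putting more ids in the hash.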

Anywho, this will make a folder in the directory the script was run in. In that folder will be the files made by Plucene: THIS IS YOUR INDEX!! All we need to do now is search that index using the power of Lucene, something made easy by Plucene. The script is the following:

#!/usr/bin/perl
use strict;
use warnings;
use Plucene::Simple;

# Usage: perl <this script> <query>
my $query = $ARGV[0];
die "Usage: $0 <query>\n" unless defined $query;

my $index = Plucene::Simple->open("solutions");

# search() returns the ids of the documents that match the query.
my @ids = $index->search($query);
foreach my $id (@ids) {
    print $id . "---seperator---";
}

You run this script from the command line with ONE argument; for example’s sake, let it be the same first argument you passed to the previous script. If you do that, you will see that it prints the second argument from the earlier example. So you have retrieved that value! And if other documents in the index have the same content, it will print their ids too, each followed by “---seperator---”.
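
For anyone who wants to try both steps in one go, here is a rough sketch that indexes three made-up documents and searches them in a single file. It assumes Plucene::Simple is installed and that open, add, optimize and search behave as in the two scripts above; round_trip and the sample ids are hypothetical names of my own. The index is built in a scratch directory so the real “solutions” folder is left alone.

```perl
#!/usr/bin/perl
use strict;
use warnings;
use File::Spec;
use File::Temp qw(tempdir);

# Hypothetical helper: index %documents in a scratch directory, then
# search it for $query and return whatever ids Plucene::Simple reports.
sub round_trip {
    my ($query, %documents) = @_;

    # Loaded at run time so a missing module is reported by the eval
    # below instead of stopping the script at compile time.
    require Plucene::Simple;

    # Use a not-yet-existing subdirectory of a temp dir that is removed
    # at exit, so the real "solutions" index is untouched.
    my $dir = File::Spec->catdir(tempdir(CLEANUP => 1), 'solutions');

    my $index = Plucene::Simple->open($dir);
    for my $id (sort keys %documents) {
        $index->add($id => $documents{$id});
    }
    $index->optimize;

    return Plucene::Simple->open($dir)->search($query);
}

# Two documents share the content "lol"; the third does not.
my @ids = eval {
    round_trip(
        'lol',
        first  => { content => 'lol' },
        second => { content => 'lol' },
        third  => { content => 'rofl' },
    );
};

if ($@) {
    print "Could not run the round trip: $@";
}
else {
    print $_, "---seperator---" for @ids;
    print "\n";
}
```

If the description above holds, the ids of the documents whose content matches the query should come back, but I have not pinned down the exact output here, so treat it as something to experiment with.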
