I am using the following example from Lingua::StopWords : use Lingua::StopWords qw( getStopWords );

Asked: June 3, 2026

I am using the following example from Lingua::StopWords:

use Lingua::StopWords qw( getStopWords );
my $stopwords = getStopWords('en');

my @words = qw( i am the walrus goo goo g'joob );

# prints "walrus goo goo g'joob"
print join ' ', grep { !$stopwords->{$_} } @words;

How do I get it to use my $document, remove stopwords and print the results to a file? See my code here:

open(FILESOURCE, "sample.txt") or die("Unable to open requested file.");
my $document = <FILESOURCE>;
close (FILESOURCE);

open(TEST, "results_stopwords.txt") or die("Unable to open requested file.");

use Lingua::StopWords qw( getStopWords );
my $stopwords = getStopWords('en');

print join ' ', grep { !$stopwords->{$_} } $document;

I tried these variations:

print join ' ', grep { !$stopwords->{$_} } TEST;


print TEST join ' ', grep { !$stopwords->{$_} } @words;

Basically, how do I read in a document, remove the stop words and then write the result to a new file?

1 Answer

Editorial Team · Answer 1 · 2026-06-03T11:12:43+00:00

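In your program, you forgot to tokenise the input text into words: grep receives $document as a single element, so the whole line is looked up in the stopword hash at once instead of word by word. Note also that my $document = <FILESOURCE>; reads only the first line of the file. The program below tokenises with the words function from Lingua::EN::Splitter. A simplistic alternative is to split each line on whitespace, which gives approximately the same list of words.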

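Taking tchrist's comment into account, this program is written as a Unix filter: it reads from standard input (or from files named on the command line) and writes to standard output, so you can redirect the result into a new file.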

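use strictures;
use Lingua::StopWords qw(getStopWords);
use Lingua::EN::Splitter qw(words);
my $stopwords = getStopWords('en');
# Read every input line, tokenise it, and print the words that are not stopwords.
while (defined(my $line = <>)) {
    # The trailing "\n" keeps the output lines from running together.
    print join(' ', grep { !$stopwords->{$_} } @{ words($line) }), "\n";
}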
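
If you would rather open the files yourself, as in the question, here is a minimal sketch of the whitespace-splitting alternative. The helper names remove_stopwords and filter_file are made up for this example and are not part of Lingua::StopWords. The small hand-written stopword hash is only a stand-in so the sketch runs without CPAN modules; in the real program you would pass the hash reference from getStopWords('en') instead. It assumes the stopword keys are lowercase, as in the module's own example.

```perl
use strict;
use warnings;

# Hypothetical helper: drop every token that is a key in the stopword hash.
# $stopwords is a hash reference like the one getStopWords('en') returns.
sub remove_stopwords {
    my ($text, $stopwords) = @_;
    # split ' ' splits on runs of whitespace and skips leading whitespace.
    my @words = split ' ', $text;
    # lc() so that "The" is looked up as "the" (assumes lowercase keys).
    return join ' ', grep { !$stopwords->{ lc $_ } } @words;
}

# Hypothetical helper: filter $in_path line by line into $out_path.
sub filter_file {
    my ($in_path, $out_path, $stopwords) = @_;
    open my $in, '<', $in_path or die "Cannot read $in_path: $!";
    # '>' opens the file for writing; open with a bare filename is read-only.
    open my $out, '>', $out_path or die "Cannot write $out_path: $!";
    while (defined(my $line = <$in>)) {
        print {$out} remove_stopwords($line, $stopwords), "\n";
    }
    close $in;
    close $out or die "Cannot close $out_path: $!";
    return;
}

# Stand-in stopword list so the sketch runs without CPAN modules.
# In the real program: my $stopwords = getStopWords('en');
my $stopwords = { map { $_ => 1 } qw( i am the ) };
print remove_stopwords("I am the walrus goo goo g'joob", $stopwords), "\n";
```

With the real stopword list, a call such as filter_file('sample.txt', 'results_stopwords.txt', getStopWords('en')) should cover the read, filter, and write steps from the question. Splitting on whitespace leaves punctuation attached to words, so "the," would not be recognised as a stopword; that is the sense in which this alternative is only approximate.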
