I have 2 files, a small one and a big one. The small file

Editorial Team

Asked: June 5, 2026 (2026-06-05T04:32:31+00:00)

I have 2 files, a small one and a big one. The small file is a subset of the big one.

For instance:

Small file:

solar:1000
alexey:2000

Big File:

andrey:1001
solar:1000
alexander:1003
alexey:2000

I want to delete every line from Big.txt that is also present in Small.txt.

So, I wrote a Perl Script as shown below:

#! /usr/bin/perl

use strict;
use warnings;

my ($small, $big, $output) = @ARGV;

open(BIG, "<$big") || die("Couldn't read from the file: $big\n");
my @contents = <BIG>;
close (BIG);

open(SMALL, "<$small") || die ("Couldn't read from the file: $small\n");

while(<SMALL>)
{
    chomp $_;
    @contents = grep !/^\Q$_/, @contents;
}

close(SMALL);

open(OUTPUT, ">>$output") || die ("Couldn't open the file: $output\n");

print OUTPUT @contents;
close(OUTPUT);

However, this Perl script does not delete the lines in Big.txt that are common to Small.txt.

In this script, I first open the big file and read its entire contents into the array @contents. Then I iterate over each line of the small file, filter the matching lines out of the big file's contents, and save the result back into the array.

I am not sure why this script does not work. Thanks.

Editorial Team · Answer 1 · 2026-06-05T04:32:34+00:00

Your script does NOT work because grep also uses $_. For the duration of the grep, $_ is aliased to each element of @contents in turn, so the $_ inside the regex is the current line of the big file, not the line that the while loop read from the small file. Each line is therefore matched against itself, the match always succeeds, and the negation discards the line.
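The aliasing can be seen in isolation with a minimal sketch. The arrays below are assumptions modeled on the sample data in the question, and a for loop stands in for the while loop over the small file; the behavior of $_ inside grep is the same.

```perl
use strict;
use warnings;

# Hypothetical in-memory stand-ins for the two files in the question.
my @contents = ("andrey:1001\n", "solar:1000\n", "alexander:1003\n", "alexey:2000\n");
my @small    = ("solar:1000", "alexey:2000");    # already chomped

# Buggy form: inside grep, $_ is the current element of @buggy,
# so every line is matched against itself and then dropped.
my @buggy = @contents;
for (@small) {
    @buggy = grep !/^\Q$_/, @buggy;
}

# Fixed form: grep never touches the named lexical $line.
my @fixed = @contents;
for my $line (@small) {
    @fixed = grep !/^\Q$line/, @fixed;
}

print "buggy kept: ", scalar(@buggy), " lines\n";   # buggy kept: 0 lines
print "fixed kept: ", scalar(@fixed), " lines\n";   # fixed kept: 2 lines
print @fixed;                                       # andrey:1001, alexander:1003
```

With this data the buggy loop empties the array on its first pass, while the named-variable loop keeps only the two lines that are absent from the small list.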

Use a named variable instead (as a rule, NEVER use $_ in any code longer than one line, precisely to avoid this type of bug):

while (my $line=<SMALL>) {
    chomp $line;
    @contents = grep !/^\Q$line/, @contents;
}

However, as Oleg pointed out, a more efficient solution is to read the small file's lines into a hash and then process the big file ONCE, checking each line against the hash. Note that a hash lookup compares whole lines, whereas the regex above matches on a prefix; for the sample data the result is the same. I also improved the style a bit (feel free to study it and use it in the future): lexical filehandle variables, the 3-arg form of open, and IO error reporting via $!:

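#!/usr/bin/perl

use strict;
use warnings;

use File::Slurp;

my ($small_file, $big_file, $output_file) = @ARGV;

# Load the small file's lines as hash keys. chomp strips the newlines
# so the keys match the chomped lines read from the big file below.
chomp(my @small = read_file($small_file));
my %small = map { ($_ => 1) } @small;

# Filehandles get their own names so they do not mask the filename variables.
open(my $big_fh, "<", $big_file) or die "Cannot read $big_file: $!\n";
open(my $out_fh, ">", $output_file) or die "Cannot write to $output_file: $!\n";

while (my $line = <$big_fh>) {
    chomp $line;
    next if $small{$line}; # Skip lines also present in the small file
    print $out_fh "$line\n";
}

close($big_fh);
close($out_fh) or die "Cannot close $output_file: $!\n";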
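
If File::Slurp is not installed, the same hash-lookup idea can be sketched with core Perl only. The helper name remove_common and the in-memory sample arrays are assumptions made for illustration; they are not part of the original answer. The sketch compares whole lines after stripping the line ending, so a last line without a trailing newline still matches.

```perl
use strict;
use warnings;

# remove_common is a hypothetical helper: it returns the lines of
# @$big_lines that do not occur in @$small_lines (whole-line comparison).
sub remove_common {
    my ($small_lines, $big_lines) = @_;

    # Build the lookup table from the small list, minus line endings.
    my %seen;
    for my $line (@$small_lines) {
        (my $key = $line) =~ s/\r?\n\z//;
        $seen{$key} = 1;
    }

    # One pass over the big list: keep only the lines not seen above.
    my @kept;
    for my $line (@$big_lines) {
        (my $key = $line) =~ s/\r?\n\z//;
        push @kept, "$key\n" unless $seen{$key};
    }
    return @kept;
}

# Sample data from the question; the last small line has no newline.
my @small = ("solar:1000\n", "alexey:2000");
my @big   = ("andrey:1001\n", "solar:1000\n", "alexander:1003\n", "alexey:2000\n");

my @kept = remove_common(\@small, \@big);
print @kept;    # prints andrey:1001 and alexander:1003, one per line
```

To use it on real files, the two arrays could be filled by reading each filehandle in list context, for example my @big = <$fh>;.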
