Is there a fast and memory-efficient way to read specific lines of a large file without loading the whole file into memory?
I wrote a Perl script that runs many forks, and I would like each of them to read specific lines from a file.
At the moment I’m using an external command:
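sub getFileLine {
    my ( $filePath, $lineWanted ) = @_;
    # Ignore SIGPIPE, which is raised when head exits before tail has finished writing
    $SIG{PIPE} = 'IGNORE';
    # tail skips to the wanted line, head keeps only that one line
    open( my $fh, '-|:utf8', "tail -q -n +$lineWanted \"$filePath\" | head -n 1" )
        or die "Cannot run tail: $!";
    my $line = <$fh>;
    close $fh;
    chomp( $line );
    return $line;
}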
It’s fast and it works, but maybe there’s a more “Perl-ish” way that is just as fast and as memory efficient?
As you know, forking a process in Perl duplicates the main process’s memory, so if the main process is using 10 MB, the fork will use at least that much.
My goal is to keep the memory use of the forked processes (and therefore of the main process, up to the point where it forks) as low as possible. That’s why I don’t want to load the whole file into memory.
Before you go further, it’s important to understand how fork works. When you fork a process, the OS uses copy-on-write semantics to share the bulk of the parent and child processes’ memory; only the memory that differs between the parent and child needs to be separately allocated.
For reading a single line of a file in Perl, here’s a simple way:
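A minimal sketch of such a line-by-line reader follows. The function name get_file_line and the choice to return undef when the file has fewer lines than requested are assumptions made for illustration. Because it reads one line at a time and stops at the wanted line, memory use stays at roughly one line regardless of file size.

```perl
use strict;
use warnings;

# Hypothetical helper: return line number $line_wanted (1-based) of
# $file_path without its trailing newline, or undef if the file is shorter.
sub get_file_line {
    my ( $file_path, $line_wanted ) = @_;

    open( my $fh, '<:encoding(UTF-8)', $file_path )
        or die "Cannot open $file_path: $!";

    my $found;
    while ( my $line = <$fh> ) {
        # $. is the current line number of the most recently read filehandle
        if ( $. == $line_wanted ) {
            chomp $line;
            $found = $line;
            last;    # stop early; the rest of the file is never read
        }
    }
    close $fh;

    return $found;
}
```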
This uses the special $. variable, which holds the current line number of the most recently read filehandle.