Have few thousand reports that have consistently formatted tabular data embedded within them that

Asked: May 22, 2026

I have a few thousand reports with consistently formatted tabular data embedded in them, and I need to extract that data.

I have a few ideas, but thought I'd post to see if there's a better way than what I have in mind, which is to extract the tabular data, write it to a new file, and then parse that file as a regular tabular file.

Here's a sample input and the desired output, where the output is read from the text file and written to a database row by row.

INPUT_FILE

MiscText MiscText MiscText
MiscText MiscText MiscText
MiscText MiscText MiscText
SubHeader
PASS    1283019238  alksdjalskdjl
FAIL    102310928301    kajdlkajsldkaj
PASS    102930192830    aoisdajsdoiaj
PASS    192830192301    jiasdojoasi
MiscText MiscText MiscText
MiscText MiscText MiscText
MiscText MiscText MiscText

OUTPUT (read/write row-by-row from text-file to DB)

ROW-01{column01,column02,column03}
...
ROW-nth{column01,column02,column03}


1 Answer

Editorial Team · Answer 1 · 2026-05-22T03:03:35+00:00

Recognizing when to start processing tabular data is easy: you've got the marker line. The difficulty is recognizing when to stop. One heuristic is to stop as soon as splitting a line no longer yields the expected number of columns. The script below assumes the columns are tab-separated and uses the column count of the first table row as the expected count.

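use strict;
use warnings;

my $tab_data;    # true once the marker line has been seen
my $num_cols;    # column count of the first table row

while ( <> ) {
    # Start collecting on the line after the "SubHeader" marker.
    $tab_data = 1, next if $_ eq "SubHeader\n";
    next unless $tab_data;
    chomp;
    my @cols = split /\t/;
    # The first table row fixes the expected number of columns.
    $num_cols ||= scalar @cols;
    # A row with a different column count marks the end of the table.
    last if $num_cols and $num_cols != scalar @cols;
    print join( "\t", @cols ), "\n";
}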

Save as etd.pl (etd = extract tabular data, what did you think?), and call it like this from the command line:

perl etd.pl < your-mixed-input.txt
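
If you'd rather hand each row straight to the database than print it, the same heuristic can be wrapped in a subroutine that calls a callback per row. This is only a sketch under a few assumptions: `extract_rows` is a name made up for illustration, the columns are tab-separated, and the column count (3, going by the sample) is known up front, which also covers the case of an empty table. The callback here just collects rows in memory; in practice it could be something like a prepared DBI statement's `execute`, which I haven't wired in.

```perl
use strict;
use warnings;

# Hypothetical helper: reads $fh and calls $on_row->(@cols) for every table
# row that follows the $marker line. Stops at the first line that does not
# split into exactly $num_cols tab-separated fields.
sub extract_rows {
    my ( $fh, $marker, $num_cols, $on_row ) = @_;
    my $in_table = 0;
    while ( my $line = <$fh> ) {
        $line =~ s/\r?\n\z//;                 # strip LF or CRLF line endings
        if ( !$in_table ) {
            $in_table = 1 if $line eq $marker;
            next;
        }
        my @cols = split /\t/, $line, -1;     # -1 keeps trailing empty fields
        last if @cols != $num_cols;           # wrong column count: table ended
        $on_row->(@cols);
    }
    return;
}

# Usage on an in-memory sample (real tabs between the columns).
my $sample = join '',
    "MiscText MiscText MiscText\n",
    "SubHeader\n",
    "PASS\t1283019238\talksdjalskdjl\n",
    "FAIL\t102310928301\tkajdlkajsldkaj\n",
    "MiscText MiscText MiscText\n";

open my $fh, '<', \$sample or die "Cannot open in-memory sample: $!";
my @rows;
# Assumption: a real callback would insert into the DB instead of collecting.
extract_rows( $fh, 'SubHeader', 3, sub { push @rows, [@_] } );
close $fh;

print join( ',', @$_ ), "\n" for @rows;
```

Note that splitting on arbitrary whitespace instead of tabs would defeat the heuristic on the sample, since the surrounding "MiscText MiscText MiscText" lines also have three whitespace-separated words.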
