Asked by Editorial Team on May 16, 2026

I have a working Perl script that grabs the data I need and displays it on STDOUT, but now I need to change it to generate a data file (CSV, tab-delimited, or any other delimited format).
The regular expression is filtering the data that I need, but I don’t want the entire string, just snippets of the output. I’m assuming I would need to store this in another variable to create my output file.

I need a good example of this or suggestions to alter this code. Thank you in advance. 🙂

Here’s my code:

#!/usr/bin/perl -w
# Usage: ./bakstatinfo.pl Jul 28 2010 /var/log/mybackup.log <server1> <server2>

use strict;
use warnings;

#This piece added to view the arguments passed in
$" = "][";
print "===================================================================================\n";
print "[@ARGV]\n";

#Declare Variables
my($mon,$day,$year,$file,$server) = @ARGV;
my $regex_flag = 0;                 

splice(@ARGV, 0, 4, ());            

foreach my $server ( @ARGV ) {      #foreach will take Xn of server entries and add to the loop
    print "===================================================================================\n";
    print "REPORTING SUMMARY for SERVER : $server\n";
    open(my $fh,"ssh $server cat $file |") or die "can't open log $server:$file: $!\n";
    while (my $line = <$fh>) {
        if ($line =~ m/.* $mon $day \d{2}:\d{2}:\d{2} $year:.*(ERROR:|backup-date=|backup-size=|backup-time=|backup-status)/) {
            print $line;
            $regex_flag=1; #Set to true
        }
    }
        if ($regex_flag==0) { 
           print "NOTHING TO REPORT FOR $server: $mon $day $year \n";
        }
    $regex_flag=0; 
    close($fh);
}

Sample raw log file I am using: (recently added to provide better representation of log)

Tue Jul 27 23:00:06 2010: test202.bak_lvm:backup:ERROR: mybak-abc appears to be already running for this backupset
Tue Jul 27 23:00:06 2010: test202.bak_lvm:backup:ERROR: If you are sure mybak-abc is not running, please remove the file /etc/mybak-abc/test202.bak_lvm/.mybak-abc.pid and restart mybak-abc
Tue Jul 27 23:00:06 2010: test202.bak_lvm:backup:INFO: PHASE START: Cleanup
Tue Jul 27 23:00:06 2010: test202.bak_lvm:backup:INFO: PHASE END: Cleanup
Tue Jul 27 23:00:06 2010: test202.bak_lvm:backup:INFO: END OF BACKUP
Wed Jul 28 00:00:04 2010: db9.abc.bak:backup:INFO: START OF BACKUP
Wed Jul 28 00:00:04 2010: db9.abc.bak:backup:INFO: PHASE START: Initialization
Wed Jul 28 00:00:05 2010: db9.abc.bak:backup:WARNING: Binary logging is off.
Wed Jul 28 00:00:05 2010: db9.abc.bak:backup:INFO: License check successful
Wed Jul 28 00:00:05 2010: db9.abc.bak:backup:INFO: License check successful for lvm-snapshot.pl
Wed Jul 28 00:00:05 2010: db9.abc.bak:backup:INFO: backup-set=db9.abc.bak
Wed Jul 28 00:00:05 2010: db9.abc.bak:backup:INFO: backup-date=20100728000004
Wed Jul 28 00:00:05 2010: db9.abc.bak:backup:INFO: SQL-server-os=Linux/Unix
Wed Jul 28 00:00:05 2010: db9.abc.bak:backup:INFO: backup-type=regular
Wed Jul 28 00:00:05 2010: db9.abc.bak:backup:INFO: host=db9.abc.bak.test.com
Wed Jul 28 00:00:05 2010: db9.abc.bak:backup:INFO: backup-date-epoch=1280300404
Wed Jul 28 00:00:05 2010: db9.abc.bak:backup:INFO: retention-policy=3D
Wed Jul 28 00:00:05 2010: db9.abc.bak:backup:INFO: mybak-abc-version=ABC for SQL Enterprise Edition - version 3.1
Wed Jul 28 00:00:05 2010: db9.abc.bak:backup:INFO: SQL-version=5.1.32-test-SMP-log
Wed Jul 28 00:00:05 2010: db9.abc.bak:backup:INFO: backup-directory=/home/backups/db9.abc.bak/20100728000004
Wed Jul 28 00:00:05 2010: db9.abc.bak:backup:INFO: backup-level=0
Wed Jul 28 00:00:05 2010: db9.abc.bak:backup:INFO: backup-mode=raw
Wed Jul 28 00:00:05 2010: db9.abc.bak:backup:INFO: PHASE END: Initialization
Wed Jul 28 00:00:05 2010: db9.abc.bak:backup:INFO: PHASE START: Running pre backup plugin
Wed Jul 28 00:00:05 2010: db9.abc.bak:backup:INFO: PHASE START: Flushing logs
Wed Jul 28 00:00:05 2010: db9.abc.bak:backup:INFO: PHASE END: Flushing logs
Wed Jul 28 00:00:05 2010: db9.abc.bak:backup:INFO: PHASE START: Creating snapshot based backup
Wed Jul 28 00:00:11 2010: db9.abc.bak:backup:INFO: Wed Jul 28 00:49:53 2010: test203.bak_lvm:backup:INFO: raw-databases-snapshot=test SQL sgl 
Wed Jul 28 00:49:53 2010: test203.bak_lvm:backup:INFO: PHASE END: Creating snapshot based backup
Wed Jul 28 00:49:53 2010: test203.bak_lvm:backup:INFO: PHASE START: Calculating backup size & checksums 
Wed Jul 28 00:49:54 2010: test203.bak_lvm:backup:INFO: last-backup=/home/backups/test203.bak_lvm/20100726200004
Wed Jul 28 00:49:54 2010: test203.bak_lvm:backup:INFO: backup-size=417.32 GB
Wed Jul 28 00:49:54 2010: test203.bak_lvm:backup:INFO: PHASE END: Calculating backup size & checksums 
Wed Jul 28 00:49:54 2010: test203.bak_lvm:backup:INFO: read-locks-time=00:00:05
Wed Jul 28 00:49:54 2010: test203.bak_lvm:backup:INFO: flush-logs-time=00:00:00
Wed Jul 28 00:49:54 2010: test203.bak_lvm:backup:INFO: backup-time=04:49:51
Wed Jul 28 00:49:54 2010: test203.bak_lvm:backup:INFO: backup-status=Backup succeeded

My working output now:

===================================================================================
[Jul][28][2010][/var/log/mybackup.log][server1]
===================================================================================
REPORTING SUMMARY for SERVER : server1
Wed Jul 28 00:49:54 2010: test203.bak_lvm:backup:INFO: backup-size=417.32 GB
Wed Jul 28 00:49:54 2010: test203.bak_lvm:backup:INFO: backup-time=04:49:51
Wed Jul 28 00:49:54 2010: test203.bak_lvm:backup:INFO: backup-status=Backup succeeded

The output I need would look something like this (a data file with fields separated by ‘;’, for example):

MyDate=Wed Jul 28;MyBackupSet= test203.bak_lvm;MyBackupSize=187.24 GB;MyBackupTime=04:49:51;MyBackupStat=Backup succeeded
1 Answer

  1. Editorial Team, added an answer on May 16, 2026 at 5:02 am

    Use ‘capturing parentheses’ to identify the bits you want to deal with.

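       # Under /x, unescaped whitespace in the pattern is ignored,
       # so the literal spaces in the log line are written as \s.
       if ($line =~ m/(.*\s$mon\s$day)\s\d{2}:\d{2}:\d{2}\s$year:.*
                      (ERROR:|backup-date=|backup-size=|
                       backup-time=|backup-status)/x) {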
    

    You will need to do some surgery on the second set of parentheses – those surrounding the start of the various keywords. You may have to chop those out in bits and pieces inside the condition.

    When you have all the data extracted into variables, use Text::CSV to handle CSV output (and input).
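    If pulling in a CPAN module is not an option, the core-only sketch below shows the general shape of writing one delimited record to a file. The `delimited_line` helper and the temporary output file are illustrative assumptions, not part of the original script, and the quoting is deliberately simplistic; Text::CSV remains the more robust choice.

```perl
use strict;
use warnings;
use File::Temp qw(tempfile);

# Hypothetical helper: join fields with $sep, quoting any field that
# contains the separator, a double quote, or a newline.
sub delimited_line {
    my ($sep, @fields) = @_;
    return join $sep, map {
        my $field = defined $_ ? $_ : '';
        if ($field =~ /["\n]|\Q$sep\E/) {
            $field =~ s/"/""/g;        # double any embedded quotes
            $field = qq{"$field"};     # then wrap the field in quotes
        }
        $field;
    } @fields;
}

# A temporary file stands in for the real report file (assumption).
my ($tmp_fh, $path) = tempfile(UNLINK => 1);
close $tmp_fh;

# Three-argument open for writing, then print the record to the handle.
open(my $out, '>', $path) or die "can't write $path: $!\n";
print {$out} delimited_line(';', 'Wed Jul 28', 'test203.bak_lvm',
                            '417.32 GB', '04:49:51', 'Backup succeeded'), "\n";
close($out) or die "can't close $path: $!\n";
```

    In the real script the `print $line;` inside the loop would be replaced by a print to `$out` of the fields extracted from the line.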

    There are myriad modules to handle HTML or XML (over 2000, and I think over 3000, have HTML in their name; I happened to look yesterday). Many of those won’t be applicable, but CPAN is your friend.


    Answering questions posed by comments

    Would I split them off into separate variables as well? The first part gives me the date/time that I need. The next filter then gives me 1) Error: 2)backup-date= 3)backup-size= …etc.

    More or less. Unfortunately, you don’t show any representative input lines, which makes it hard to tell what would work best. However, a scheme such as this seems likely to work:

    while (my $line = <$fh>)
    {
        chomp $line;
        if ($line =~ m/(.* $mon $day) \d\d:\d\d:\d\d $year:/)
        {
            my $date = $1;
            my %items = ();
            $line =~ s/.* $mon $day \d\d:\d\d:\d\d $year://;
            while ($line =~ m/(ERROR|backup-date|backup-size|
                               backup-time|backup-status)
                              [:=]([^:]+)/x)
            {
                my $key = $1;
                my $val = $2;
                $items{$key} = $val;
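                # \Q...\E stops regex metacharacters in the captured text (e.g. the '.' in '417.32') from being interpreted
                $line =~ s/\Q$key\E[:=]\Q$val\E[:=]?//;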
            }
            # The %items hash contains the split out information.
            # Now write the data for this line of the log file.
        }
    }
    

    There might well be better ways to handle the trimming (but it is Perl so TMTOWTDI), but the basic idea here is to catch the lines that are interesting, then progressively chop the bits of interest out of the line, so the line grows shorter on each iteration (therefore, eventually terminating the inner while loop).

    Note the use of the /x modifier to allow for a more readable regex split over lines (I edited the original answer version to use that too). I’ve also allowed ‘ERROR’ to be followed by an ‘=’ or the other keywords to be followed by ‘:’; it seems unlikely that you’d get false matches that way, and it simplifies the regex substitute operations. The initial pattern match no longer requires one of the subsections to be present, either. You must judge for yourself whether those small changes (which might pick up non-conforming information) matter or not. For most of my purposes, the chance of a mismatch is small enough not to be an issue, but for legal reasons it might not be acceptable to you.


    Answering questions posed by ‘answer’

    I manufactured some data:

    Wed Jul 30 00:49:51 2010: test203.bak_lvm:backup:INFO: backup-size=417.32 GB
    Wed Jul 30 00:49:52 2010: test203.bak_lvm:backup:INFO: backup-time=04:49:51
    Wed Jul 30 00:49:53 2010: test203.bak_lvm:backup:INFO: backup-status=Backup succeeded
    Wed Jul 30 00:49:51 2010: backup-size=417.32 GB:backup-time=04:49:51:backup-status=Backup succeeded
    

    I took the script in the answer and hacked and instrumented it – making it standalone.
    I also removed the dependency on specific files – it reads standard input and writes to standard output. It makes my testing easier – and the code more flexible.

    use strict;
    use warnings;
    use constant debug => 0;
    my $mon = 'Jul';
    my $day = 30;
    my $year = 2010;
    
    while (my $line = <>)
    {
        chomp $line;
        print "Line: $line\n" if debug;
        if ($line =~ m/(.* $mon $day) \d\d:\d\d:\d\d $year:/) #Mon Jul 26 22:00:02 2010:
        {
            print "### Scan\n";
            my $date = $1;
            print "$date\n";
            my %items = ();
            $line =~ s/.* $mon $day \d\d:\d\d:\d\d $year://;
            print "Line: $line\n" if debug;
            while ($line =~ m/(ERROR|backup-date|backup-size|backup-time|backup-status)[:=]([^:]+)/)
            {
                my $key = $1;
                my $val = $2;
                $items{$key} = $val;
                # \Q...\E stops regex metacharacters in the captured text from being interpreted
                $line =~ s/\Q$key\E[:=]\Q$val\E[:=]?//;
                print "$key=$val\n";
                print "Line: $line\n" if debug;
            }
            print "### Verify\n";
            for my $key (sort keys %items)
            {
                print "$key = $items{$key}\n";
            }
        }
    }
    

    The output I get is:

    ### Scan
    Wed Jul 30
    backup-size=417.32 GB
    ### Verify
    backup-size = 417.32 GB
    ### Scan
    Wed Jul 30
    backup-time=04
    ### Verify
    backup-time = 04
    ### Scan
    Wed Jul 30
    backup-status=Backup succeeded
    ### Verify
    backup-status = Backup succeeded
    ### Scan
    Wed Jul 30
    backup-size=417.32 GB
    backup-time=04
    backup-status=Backup succeeded
    ### Verify
    backup-size = 417.32 GB
    backup-status = Backup succeeded
    backup-time = 04
    

    The verify loop prints out the data from the ‘%items’ hash quite happily. With the debug value set to 1 instead of 0, the output I get is:

    Line: Wed Jul 30 00:49:51 2010: test203.bak_lvm:backup:INFO: backup-size=417.32 GB
    ### Scan
    Wed Jul 30
    Line:  test203.bak_lvm:backup:INFO: backup-size=417.32 GB
    backup-size=417.32 GB
    Line:  test203.bak_lvm:backup:INFO: 
    ### Verify
    backup-size = 417.32 GB
    Line: Wed Jul 30 00:49:52 2010: test203.bak_lvm:backup:INFO: backup-time=04:49:51
    ### Scan
    Wed Jul 30
    Line:  test203.bak_lvm:backup:INFO: backup-time=04:49:51
    backup-time=04
    Line:  test203.bak_lvm:backup:INFO: 49:51
    ### Verify
    backup-time = 04
    Line: Wed Jul 30 00:49:53 2010: test203.bak_lvm:backup:INFO: backup-status=Backup succeeded
    ### Scan
    Wed Jul 30
    Line:  test203.bak_lvm:backup:INFO: backup-status=Backup succeeded
    backup-status=Backup succeeded
    Line:  test203.bak_lvm:backup:INFO: 
    ### Verify
    backup-status = Backup succeeded
    Line: Wed Jul 30 00:49:51 2010: backup-size=417.32 GB:backup-time=04:49:51:backup-status=Backup succeeded
    ### Scan
    Wed Jul 30
    Line:  backup-size=417.32 GB:backup-time=04:49:51:backup-status=Backup succeeded
    backup-size=417.32 GB
    Line:  backup-time=04:49:51:backup-status=Backup succeeded
    backup-time=04
    Line:  49:51:backup-status=Backup succeeded
    backup-status=Backup succeeded
    Line:  49:51:
    ### Verify
    backup-size = 417.32 GB
    backup-status = Backup succeeded
    backup-time = 04
    

    The substitute operations delete the previously matched part of the line. There are ways of continuing a match where you left off; see \G in the perlre documentation.
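    As a rough illustration of the \G approach, the sketch below scans a line with a /g match in scalar context instead of chopping the line with substitutions. The `scan_pairs` function name is an assumption made for this example; the keyword list and the value pattern are the same as in the script above, so the backup time is still cut at its first colon.

```perl
use strict;
use warnings;

# Hypothetical helper: return the key/value pairs found in $line
# as a list of [key, value] array references.
sub scan_pairs {
    my ($line) = @_;
    my @pairs;
    # With /g in scalar context Perl remembers where the last match ended
    # (see pos); \G anchors the next attempt at that position, and /c keeps
    # the position when a match finally fails. The line is never modified.
    while ($line =~ m/\G.*?(ERROR|backup-date|backup-size|backup-time|backup-status)[:=]([^:]+)/gc) {
        push @pairs, [ $1, $2 ];
    }
    return @pairs;
}

my $line = ' backup-size=417.32 GB:backup-time=04:49:51:backup-status=Backup succeeded';
print "$_->[0]=$_->[1]\n" for scan_pairs($line);
```

    Because the loop advances through the string rather than shortening it, the substitution and the quoting concerns that go with it disappear.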

    Note that the regex is crafted to stop at the first colon after the ‘colon or equals’ after the keyword. That means it truncates the backup time. One moral is “do not use a separator that can appear in the data”. Another is “provide sample data so people can help you more easily”. Another is “provide complete but minimal working scripts where possible”.
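    One possible way around the truncation, sketched under the assumption that a value ends either at the end of the line or just before a colon that introduces another known keyword, is to replace `[^:]+` with a lazy match and a lookahead. The `parse_pairs` name and the `$keys` variable are introduced here for illustration only.

```perl
use strict;
use warnings;

# The keywords of interest, compiled once so the pattern below can reuse them.
my $keys = qr/ERROR|backup-date|backup-size|backup-time|backup-status/;

# Hypothetical helper: return a hash of keyword => value for one line.
sub parse_pairs {
    my ($text) = @_;
    my %items;
    # (.+?) grows one character at a time until the lookahead succeeds:
    # either ":<keyword>" followed by ':' or '=', or the end of the string.
    # Colons inside a value (such as 04:49:51) therefore survive.
    while ($text =~ m/($keys)[:=](.+?)(?=:(?:$keys)[:=]|\z)/g) {
        $items{$1} = $2;
    }
    return %items;
}

my %items = parse_pairs('backup-size=417.32 GB:backup-time=04:49:51:backup-status=Backup succeeded');
print "$_=$items{$_}\n" for sort keys %items;
```

    A value that itself contains a colon followed by one of the keywords would still be split early, so the first moral above continues to apply.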


    Processing the sample data

    Now that we have the sample input data, we can see that you need slightly different processing. This script:

    use strict;
    use warnings;
    use constant debug => 0;
    my $mon = 'Jul';
    my $day = 28;
    my $year = 2010;
    my %items = ();
    
    while (my $line = <>)
    {
        chomp $line;
        print "Line: $line\n" if debug;
        if ($line =~ m/(.* $mon $day) \d\d:\d\d:\d\d $year: ([^:]+):backup:/) #Mon Jul 26 22:00:02 2010:
        {
            print "### Scan\n" if debug;
            my $date = $1;
            my $set = $2;
            print "$date ($set): " if debug;
            $items{$set}->{'a-logdate'} = $date;
            $items{$set}->{'a-dataset'} = $set;
            if ($line =~ m/(ERROR|backup-date|backup-size|backup-time|backup-status)[:=](.+)/)
            {
                my $key = $1;
                my $val = $2;
                $items{$set}->{$key} = $val;
                print "$key=$val\n" if debug;
            }
        }
    }
    
    print "### Verify\n";
    for my $set (sort keys %items)
    {
        print "Set: $set\n";
        my %info = %{$items{$set}};
        for my $key (sort keys %info)
        {
            printf "%s=%s;", $key, $info{$key};
        }
        print "\n";
    }
    

    produces this result on the sample data file.

    ### Verify
    Set: db9.abc.bak
    a-dataset=db9.abc.bak;a-logdate=Wed Jul 28;backup-date=20100728000004;
    Set: test203.bak_lvm
    a-dataset=test203.bak_lvm;a-logdate=Wed Jul 28;backup-size=417.32 GB;backup-status=Backup succeeded;backup-time=04:49:51;
    

    Note that now we have sample data, we can see that there is only one key/value pair per line, but there are multiple systems backed up per day. So, the inner while loop becomes a simple if. The printing out occurs at the end. And I’m using a ‘two-tier’ hash. The %items contains an entry for each data set; the entry, though, is a reference to a hash. Not necessarily something for novices to play with, but it fell into place very naturally with the previous code. Note, too, that this version doesn’t hack the line – there’s no need since there’s only one lot of data per line.
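    To get from the two-tier hash to the single-line record asked for in the question, something along these lines might do. It is only a sketch: the `format_record` function and the `@columns` table are assumptions made for this example, the `My...` labels are taken from the requested output, and the hash is filled in by hand rather than parsed from the log.

```perl
use strict;
use warnings;

# Output label => key in the per-set hash built by the script above.
my @columns = (
    [ MyDate       => 'a-logdate'     ],
    [ MyBackupSet  => 'a-dataset'     ],
    [ MyBackupSize => 'backup-size'   ],
    [ MyBackupTime => 'backup-time'   ],
    [ MyBackupStat => 'backup-status' ],
);

# Hypothetical helper: turn one inner hash (by reference) into a
# ';'-separated record. Missing values become empty fields.
sub format_record {
    my ($info) = @_;
    return join ';', map {
        my ($label, $key) = @$_;
        my $val = defined $info->{$key} ? $info->{$key} : '';
        "$label=$val";
    } @columns;
}

# Hand-built stand-in for the %items hash the script populates.
my %items = (
    'test203.bak_lvm' => {
        'a-logdate'     => 'Wed Jul 28',
        'a-dataset'     => 'test203.bak_lvm',
        'backup-size'   => '417.32 GB',
        'backup-time'   => '04:49:51',
        'backup-status' => 'Backup succeeded',
    },
);

for my $set (sort keys %items) {
    print format_record($items{$set}), "\n";
}
# prints: MyDate=Wed Jul 28;MyBackupSet=test203.bak_lvm;MyBackupSize=417.32 GB;MyBackupTime=04:49:51;MyBackupStat=Backup succeeded
```

    A plain join like this breaks if a value ever contains a ‘;’; that is the kind of case Text::CSV handles by quoting.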

    Can it be improved – yes, undoubtedly. Does it work? Yes, more or less… Can it be hacked into shape? Yes, it can be hacked to work as you need.
