The Archive Base

Editorial Team
Asked: June 3, 2026

I have HTTP header request and reply data in tab delimited form with each

I have HTTP header request and reply data in tab-delimited form, with each GET/POST and each reply on a separate line. There can be multiple GET, POST and REPLY lines for one TCP flow. I need to choose only the first valid GET – REPLY pair in these cases. A simplified example:

ID       Source    Dest    Bytes   Type   Content-Length  host               lines.... 
1         A         B       10     GET        NA          yahoo.com            2
1         A         B       10     REPLY      10          NA                   2 
2         C         D       40     GET        NA          google.com           4
2         C         D       40     REPLY      20          NA                   4
2         C         D       40     GET        NA          google.com           4
2         C         D       40     REPLY      30          NA                   4
3         A         B       250    POST       NA          mail.yahoo.com       5
3         A         B       250    REPLY      NA          NA                   5
3         A         B       250    REPLY      15          NA                   5
3         A         B       250    GET        NA          yimg.com             5
3         A         B       250    REPLY      35          NA                   5
4         G         H       415    REPLY      10          NA                   6
4         G         H       415    POST       NA          facebook.com         6
4         G         H       415    REPLY      NA          NA                   6
4         G         H       415    REPLY      NA          NA                   6
4         G         H       415    GET        NA          photos.facebook.com  6
4         G         H       415    REPLY      50          NA                   6

....

So, basically I need to get one request-reply pair for each ID and write them to a new file.

For ‘1’ it is just one pair, so it is easy. But there are also invalid cases where both lines are requests (GET or POST) or both are REPLY; such cases are ignored.

For ‘2’, I would choose the first GET – REPLY pair.

For ‘3’, I would choose the first request (the POST) but the second REPLY, as the Content-Length is absent in the first (making the subsequent REPLY a better candidate).

For ‘4’, I would choose the first POST (or GET), as the first header cannot be a REPLY. I would not choose the REPLY after the second request (the GET), even though the Content-Length is missing in the replies after the POST, because that REPLY belongs to a later request. So I would just choose the first REPLY after the POST.

So, after choosing the best request and reply pair, I need to pair them up in a single line. For the example, the output would be:

 ID       Source    Dest    Bytes   Type   Content-Length  host         .... 
   1         A         B       10     GET      10          yahoo.com
   2         C         D       40     GET      20          google.com
   3         A         B       250    POST     15          mail.yahoo.com
   4         G         H       415    POST     NA          facebook.com

There are a lot of other headers in the actual data but this example pretty much shows what I need. How would one do this in Perl? I pretty much am stuck in the beginning so I have only been able to read the file one line at a time.

open F, "<", "file.txt" or die "Cannot open file.txt: $!";

  while (<F>) {
    chomp;
    my @line = split /\t/;


      # get the valid pairs for cases with multiple request - replies


      # get the paired up data together

  }
  close (F);

*Edit: I have added an additional column giving the number of HTTP header lines for each ID. This may help to know how many subsequent lines to check. Also, I modified ID ‘4’ so that the first header line is a REPLY.*

1 Answer

  1. Editorial Team
    Added an answer on June 3, 2026 at 9:33 am

    The program below does what I think you need.

    It is commented and I think it is fairly legible. Please ask if anything is unclear.

    use strict;
    use warnings;
    
    use List::Util 'max';
    
    my $file = $ARGV[0] // 'file.txt';
    open my $fh, '<', $file or die qq(Unable to open "$file" for reading: $!);
    
    # Read the field names from the first line to index the hashes
    # Remember where the data in the file starts so we can get back here
    #
    my @fields = split ' ', <$fh>;
    my $start = tell $fh;
    
    # Build a format to print the accumulated data
    # Create a hash that relates column headers to their widths
    #
    my @headers = qw/ ID Source Dest Bytes Type Content-Length host /;
    my %len = map { $_ => length } @headers;
    
    # Read through the file to find the maximum data width for each column
    #
    while (<$fh>) {
      my %data;
      @data{@fields} = split;
      next unless defined $data{ID} and $data{ID} =~ /^\d/;
      $len{$_} = max($len{$_}, length $data{$_}) for @headers;
    }
    
    # Build a format string using the values calculated
    #
    my $format = join '   ', map sprintf('%%%ds', $_), @len{@headers};
    $format .= "\n";
    
    # Go back to the start of the data
    # Print the column headers
    #
    seek $fh, $start, 0;
    printf $format, @headers;
    
    # Build transaction data hashes into $record and print them
    # Ignore any events before the first request
    # Ignore the second request and anything after it
    # Update the stored Content-Length field if a value other than NA appears
    #
    my $record;
    my $nreq = 0;
    
    while (<$fh>) {
    
      my %data;
      @data{@fields} = split;
      my ($id, $type) = @data{ qw/ ID Type / };
      next unless defined $id and $id =~ /^\d/;
    
      if ($record and $id ne $record->{ID}) {
        printf $format, @{$record}{@headers};
        undef $record;
        $nreq = 0;
      }
    
      if ($type eq 'GET' or $type eq 'POST') {
        $record = \%data if $nreq == 0;
        $nreq++;
      }
      elsif ($nreq == 1) {
        if ($record->{'Content-Length'} eq 'NA' and $data{'Content-Length'} ne 'NA') {
          $record->{'Content-Length'} = $data{'Content-Length'};
        }
      }
    }
    
    printf $format, @{$record}{@headers} if $record;
    

    output

    With the data given in the question, this program produces

    ID   Source   Dest   Bytes    Type   Content-Length                  host
     1        A      B      10     GET               10             yahoo.com
     2        C      D      40     GET               20            google.com
     3        A      B     250    POST               15        mail.yahoo.com
     4        G      H     415    POST               NA          facebook.com
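
The selection rule used in the program above can also be looked at in isolation. The following is only a minimal sketch, assuming the rows for one ID have already been collected in file order as hash references; the helper names `pick_pair` and `row` are hypothetical and do not appear in the program above. It keeps the first GET or POST, ignores any REPLY seen before it, fills in the first Content-Length that is not NA, and stops at the second request.

```perl
use strict;
use warnings;

# Hypothetical helper: build one row, reduced to the columns that matter here.
sub row {
    my ($type, $len, $host) = @_;
    return { Type => $type, 'Content-Length' => $len, host => $host };
}

# Hypothetical helper: given the rows for ONE ID, in file order, return the
# merged request/reply record, or undef if the ID has no request at all.
sub pick_pair {
    my @rows = @_;
    my $record;
    for my $row (@rows) {
        if ($row->{Type} eq 'GET' or $row->{Type} eq 'POST') {
            last if $record;        # a second request ends the search
            $record = { %$row };    # copy, so the caller's row is untouched
        }
        elsif ($record
               and $record->{'Content-Length'} eq 'NA'
               and $row->{'Content-Length'} ne 'NA') {
            # First reply after the request that carries a real length
            $record->{'Content-Length'} = $row->{'Content-Length'};
        }
        # Any REPLY seen before the first request falls through and is ignored
    }
    return $record;
}

# Rows for ID 3 and ID 4, copied from the question's example data
my @id3 = (
    row('POST',  'NA', 'mail.yahoo.com'),
    row('REPLY', 'NA', 'NA'),
    row('REPLY', '15', 'NA'),
    row('GET',   'NA', 'yimg.com'),
    row('REPLY', '35', 'NA'),
);
my @id4 = (
    row('REPLY', '10', 'NA'),
    row('POST',  'NA', 'facebook.com'),
    row('REPLY', 'NA', 'NA'),
    row('REPLY', 'NA', 'NA'),
    row('GET',   'NA', 'photos.facebook.com'),
    row('REPLY', '50', 'NA'),
);

for my $rows (\@id3, \@id4) {
    my $r = pick_pair(@$rows);
    print join("\t", @{$r}{ qw/ Type Content-Length host / }), "\n";
}
# prints:
# POST    15    mail.yahoo.com
# POST    NA    facebook.com
```

Keeping the rule in a function that sees one ID at a time makes it easier to test against odd cases, such as an ID that contains only REPLY lines, for which this sketch returns undef.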
    