Editorial Team
Asked: May 27, 2026

I am a newbie to Perl. I need to parse a tab-separated text file.

I am new to Perl and need to parse a tab-separated text file. For example:

From name   To name      Timestamp                 Interaction
a             b        Dec  2 06:40:23 IST 2000        comment
c             d        Dec  1 10:40:23 IST 2001          like
e             a        Dec  1 16:03:01 IST 2000         follow
b             c        Dec  2 07:50:29 IST 2002         share
a             c        Dec  2 08:50:29 IST 2001        comment
c             a        Dec 11 12:40:23 IST 2008          like
e             c        Dec  2 07:50:29 IST 2000         like
c             b        Dec 11 12:40:23 IST 2008        follow
b             a        Dec  2 08:50:29 IST 2001        share

After parsing, I need to create groups based on the users' interactions. In this example:

a<->b
b<->a
c<->a
a<->c
b<->c
c<->b

For these we can create one group, and we need to display the list of groups.
I need some pointers on how to parse the file and form the groups.

Edit
Constraint: at least 3 users are required to create a group.
An interaction is simply some communication between two users. The type of communication does not matter.

My approach for solving this is:

  1. Remove repeated interactions between the same users. For example, if “a<>b like” is already present and “a<>b follow” appears again, we remove the later row.

  2. Create a two-dimensional array that stores the interactions between users, i.e.

              To name   a       b        c          d
    From name
       a                X       <>       <>         X
       b                <>      X        <>         X
       c                <>      <>       X          X
       d                X       <>       X          X

    X  = no interaction
    <> = interaction

In this approach we start from the first row, i.e. user “a”, and check it against “b”. If “a” interacts with “b”, we then check the reverse, i.e. whether “b” interacts with “a”. The same steps are performed for each column.

But this approach depends on the number of users. If 1000 users are present, we have to create a 1000 x 1000 matrix. Is there any alternative way to solve this?

I have added sample input:

a   c   Dec  2 06:40:23 IST 2000    comment
f   g   Dec  2 06:40:23 IST 2009    like
c   a   Dec  2 06:40:23 IST 2009    like
g   h   Dec  2 06:40:23 IST 2008    like
a   d   Dec  2 06:40:23 IST 2008    like
r   t   Dec  2 06:40:23 IST 2007    share
d   a   Dec  2 06:40:23 IST 2007    share
t   u   Dec  2 06:40:23 IST 2006    follow
a   e   Dec  2 06:40:23 IST 2006    follow
k   l   Dec  2 06:40:23 IST 2009    like
e   a   Dec  2 06:40:23 IST 2009    like
j   k   Dec  2 06:40:23 IST 2003    like
c   d   Dec  2 06:40:23 IST 2003    like
l   j   Dec  2 06:40:23 IST 2002    like
d   c   Dec  2 06:40:23 IST 2002    like
m   n   Dec  2 06:40:23 IST 2005    like
c   e   Dec  2 06:40:23 IST 2005    like
m   l   Dec  2 06:40:23 IST 2011    like
e   c   Dec  2 06:40:23 IST 2011    like
h   j   Dec  2 06:40:23 IST 2010    like
d   e   Dec  2 06:40:23 IST 2010    like
o   p   Dec  2 06:40:23 IST 2009    like
e   d   Dec  2 06:40:23 IST 2009    like
p   q   Dec  2 06:40:23 IST 2000    comment
q   p   Dec  2 06:40:23 IST 2009    like
a   p   Dec  2 06:40:23 IST 2008    like    
p   a   Dec  2 06:40:23 IST 2007    share
l   p   Dec  2 06:40:23 IST 2003    like
j   l   Dec  2 06:40:23 IST 2002    like
t   r   Dec  2 06:40:23 IST 2000    comment
r   h   Dec  2 06:40:23 IST 2009    like
j   f   Dec  2 06:40:23 IST 2008    like    
g   d   Dec  2 06:40:23 IST 2007    share
w   q   Dec  2 06:40:23 IST 2003    like
o   y   Dec  2 06:40:23 IST 2002    like
x   y   Dec  2 06:40:23 IST 2000    comment
y   x   Dec  2 06:40:23 IST 2009    like
x   z   Dec  2 06:40:23 IST 2008    like    
z   x   Dec  2 06:40:23 IST 2007    share
y   z   Dec  2 06:40:23 IST 2003    like
z   y   Dec  2 06:40:23 IST 2002    like

Output should be:

(a, c, d, e)
(x, y, z)
1 Answer

  • Voted
  • Oldest
  • Recent
  • Random
  1. Editorial Team
    Editorial Team
    2026-05-27T06:10:15+00:00Added an answer on May 27, 2026 at 6:10 am

    Parsing is easy. A plain split /\t/ on each line might be enough. However, Text::xSV or Text::CSV might be better if fields can contain quoted or embedded separators.
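As a minimal sketch of the split /\t/ route, assuming every record really has four tab-separated fields (from, to, timestamp, interaction) and that any header line is skipped by the caller: the sub name parse_interaction and the hash keys are made up for illustration and are not part of any module.

```perl
#!/usr/bin/env perl
use strict;
use warnings;

# Hypothetical helper: parse one tab-separated record into a hash reference.
# Returns nothing for blank lines or lines with fewer than four fields.
sub parse_interaction {
    my ($line) = @_;
    chomp $line;
    return unless $line =~ /\S/;

    my @fields = split /\t/, $line;
    s/^\s+|\s+$//g for @fields;    # trim padding around each field
    return unless @fields >= 4;

    my ($from, $to, $timestamp, $type) = @fields;
    return { from => $from, to => $to, timestamp => $timestamp, type => $type };
}

my @sample = (
    "a\tb\tDec  2 06:40:23 IST 2000\tcomment",
    "c\td\tDec  1 10:40:23 IST 2001\tlike",
);

for my $line (@sample) {
    my $rec = parse_interaction($line) or next;
    print "$rec->{from} -> $rec->{to} ($rec->{type})\n";
}
# prints:
# a -> b (comment)
# c -> d (like)
```

If the file turns out to be padded with spaces rather than tabs, the split ' ' form used in the script further down is the more forgiving choice, since only the first two fields matter for grouping.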

    For the connections, you can use the Graph module. To be able to use that module effectively, you need to understand at least the basics of graph theory.

    Note that a strongly connected component is defined as:

    A directed graph is called strongly connected if there is a path from each vertex in the graph to every other vertex. In particular, this means paths in each direction; a path from a to b and also a path from b to a.

    The strongly connected components of a directed graph G are its maximal strongly connected subgraphs.

    However, note that if you have a <-> b and b <-> c, then a, b, and c will form a strongly connected component. Strong connectivity is therefore a weaker requirement than having all members of a group interact with each other in both directions.
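To make the difference concrete, here is a small sketch in core Perl, assuming interactions are kept in a hash of hashes rather than a full matrix. The names build_edges, has_edge and all_mutual are invented for this example. Because only observed interactions are stored, memory grows with the number of interactions instead of the square of the number of users.

```perl
#!/usr/bin/env perl
use strict;
use warnings;

# Sparse adjacency: $edge->{$from}{$to} exists only for observed interactions.
sub build_edges {
    my (@pairs) = @_;
    my %edge;
    $edge{ $_->[0] }{ $_->[1] } = 1 for @pairs;
    return \%edge;
}

# True if $u interacted with $v (checked without autovivifying new keys).
sub has_edge {
    my ($edge, $u, $v) = @_;
    return exists $edge->{$u} && exists $edge->{$u}{$v} ? 1 : 0;
}

# True if every pair of members interacted in both directions.
sub all_mutual {
    my ($edge, @members) = @_;
    for my $i (0 .. $#members) {
        for my $j ($i + 1 .. $#members) {
            my ($u, $v) = @members[$i, $j];
            return 0 unless has_edge($edge, $u, $v) && has_edge($edge, $v, $u);
        }
    }
    return 1;
}

# a <-> b and b <-> c only: every user can reach every other, but a and c never interacted.
my $chain = build_edges([qw(a b)], [qw(b a)], [qw(b c)], [qw(c b)]);

# The same, plus a <-> c.
my $full = build_edges([qw(a b)], [qw(b a)], [qw(b c)], [qw(c b)], [qw(a c)], [qw(c a)]);

print "chain: ", (all_mutual($chain, qw(a b c)) ? "group" : "not a group"), "\n";
print "full:  ", (all_mutual($full,  qw(a b c)) ? "group" : "not a group"), "\n";
# prints:
# chain: not a group
# full:  group
```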

    We can still use this to reduce the search space. Once you have candidate groups, you can then check each to see if it fits your definition of a group. If a candidate group does not meet your requirements, then you can check all subsets with one fewer members. If you don’t find any groups among those, you can then look at all subsets with two fewer members and so on until you hit the minimum group size limit.

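    The script below uses this idea. However, it very likely won’t scale, because the number of subsets to check grows combinatorially with the size of a candidate component. I strongly suspect one might be able to put together some SQL magic, but my mind is far too limited for that.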

    #!/usr/bin/env perl
    
    use strict;
    use warnings;
    
    use Graph;
    use Algorithm::ChooseSubsets;
    
    use constant MIN_SIZE => 3;
    
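    # Directed graph: an edge $from -> $to means $from interacted with $to.
    my $interactions = Graph->new(
        directed => 1,
    );
    
    while (my $interaction = <DATA>) {
        last unless $interaction =~ /\S/;    # stop at the first blank line
        # Only the first two fields are needed; the rest of the line is ignored.
        my ($from, $to) = split ' ', $interaction, 3;
    
        $interactions->add_edge($from, $to);
    }
    
    # Each strongly connected component with enough members is a candidate.
    # Keep it if it is a group as-is; otherwise search its smaller subsets.
    my @groups = map {
        is_group($interactions, $_) ? $_
                                    : check_subsets($interactions, $_)
    } grep @$_ >= MIN_SIZE, $interactions->strongly_connected_components;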
    
    
    print "Groups: \n";
    print "[ @$_ ]\n" for @groups;
    
    # Look for groups among the proper subsets of a candidate, largest first.
    sub check_subsets {
        my ($graph, $candidate) = @_;
    
        my @groups;
        for my $size (reverse MIN_SIZE .. (@$candidate - 1)) {
            my $subsets = Algorithm::ChooseSubsets->new(
                set => $candidate,
                size => $size,
            );
    
            my $groups_found;
            while (my $subset = $subsets->next) {
                # Use the graph passed in rather than the file-level variable.
                if (is_group($graph, $subset)) {
                    ++$groups_found;
                    push @groups, $subset;
                }
            }
            # Stop at the largest size that yields at least one group.
            last if $groups_found;
        }
    
        return @groups;
    }
    
    # A candidate is a group only if every pair of members has edges
    # in both directions.
    sub is_group {
        my ($graph, $candidate) = @_;
    
        for my $member (@$candidate) {
            for my $other (@$candidate) {
                next if $member eq $other;
                return unless $graph->has_edge($member, $other);
                return unless $graph->has_edge($other, $member);
            }
        }
    
        return 1;
    }
    
    __DATA__
    a   c   Dec  2 06:40:23 IST 2000    comment
    f   g   Dec  2 06:40:23 IST 2009    like
    c   a   Dec  2 06:40:23 IST 2009    like
    g   h   Dec  2 06:40:23 IST 2008    like
    a   d   Dec  2 06:40:23 IST 2008    like
    r   t   Dec  2 06:40:23 IST 2007    share
    d   a   Dec  2 06:40:23 IST 2007    share
    t   u   Dec  2 06:40:23 IST 2006    follow
    a   e   Dec  2 06:40:23 IST 2006    follow
    k   l   Dec  2 06:40:23 IST 2009    like
    e   a   Dec  2 06:40:23 IST 2009    like
    j   k   Dec  2 06:40:23 IST 2003    like
    c   d   Dec  2 06:40:23 IST 2003    like
    l   j   Dec  2 06:40:23 IST 2002    like
    d   c   Dec  2 06:40:23 IST 2002    like
    m   n   Dec  2 06:40:23 IST 2005    like
    c   e   Dec  2 06:40:23 IST 2005    like
    m   l   Dec  2 06:40:23 IST 2011    like
    e   c   Dec  2 06:40:23 IST 2011    like
    h   j   Dec  2 06:40:23 IST 2010    like
    d   e   Dec  2 06:40:23 IST 2010    like
    o   p   Dec  2 06:40:23 IST 2009    like
    e   d   Dec  2 06:40:23 IST 2009    like
    p   q   Dec  2 06:40:23 IST 2000    comment
    q   p   Dec  2 06:40:23 IST 2009    like
    a   p   Dec  2 06:40:23 IST 2008    like
    p   a   Dec  2 06:40:23 IST 2007    share
    l   p   Dec  2 06:40:23 IST 2003    like
    j   l   Dec  2 06:40:23 IST 2002    like
    t   r   Dec  2 06:40:23 IST 2000    comment
    r   h   Dec  2 06:40:23 IST 2009    like
    j   f   Dec  2 06:40:23 IST 2008    like
    g   d   Dec  2 06:40:23 IST 2007    share
    w   q   Dec  2 06:40:23 IST 2003    like
    o   y   Dec  2 06:40:23 IST 2002    like
    x   y   Dec  2 06:40:23 IST 2000    comment
    y   x   Dec  2 06:40:23 IST 2009    like
    x   z   Dec  2 06:40:23 IST 2008    like
    z   x   Dec  2 06:40:23 IST 2007    share
    y   z   Dec  2 06:40:23 IST 2003    like
    z   y   Dec  2 06:40:23 IST 2002    like
    

    Output:

    Groups:
    [ y z x ]
    [ e d a c ]
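For comparison, here is a sketch of a different technique than the one used in the script above: treat two users as neighbours only when they interacted in both directions, then enumerate maximal cliques of that undirected graph with the basic Bron-Kerbosch recursion (no pivoting). It uses core Perl only. The sub names mutual_graph, bron_kerbosch and maximal_groups are invented for this example, and the worst case is still exponential, so this is an illustration rather than a claim about scalability.

```perl
#!/usr/bin/env perl
use strict;
use warnings;

use constant MIN_SIZE => 3;

# Undirected "mutual" graph: u and v are neighbours only if both
# u -> v and v -> u were seen. Self-interactions are ignored.
sub mutual_graph {
    my (@pairs) = @_;
    my (%seen, %mutual);
    for my $pair (@pairs) {
        my ($from, $to) = @$pair;
        next if $from eq $to;
        $seen{$from}{$to} = 1;
        if ($seen{$to} && $seen{$to}{$from}) {
            $mutual{$from}{$to} = $mutual{$to}{$from} = 1;
        }
    }
    return \%mutual;
}

# Basic Bron-Kerbosch: $r is the clique so far, $p the remaining candidates,
# $x the vertices already handled. Maximal cliques are collected in $out.
sub bron_kerbosch {
    my ($graph, $r, $p, $x, $out) = @_;
    if (!@$p && !@$x) {
        push @$out, [sort @$r] if @$r >= MIN_SIZE;
        return;
    }
    my @p = @$p;
    my @x = @$x;
    while (@p) {
        my $v   = $p[0];
        my $nbr = $graph->{$v};
        bron_kerbosch(
            $graph,
            [@$r, $v],
            [grep { $nbr->{$_} } @p],
            [grep { $nbr->{$_} } @x],
            $out,
        );
        shift @p;        # move $v from the candidates ...
        push @x, $v;     # ... to the already-handled set
    }
    return;
}

sub maximal_groups {
    my ($graph) = @_;
    my @out;
    bron_kerbosch($graph, [], [sort keys %$graph], [], \@out);
    my @sorted = sort { "@$a" cmp "@$b" } @out;
    return @sorted;
}

# From/to columns of the sample input, in file order.
my @flat = qw(
    a c  f g  c a  g h  a d  r t  d a  t u  a e  k l  e a  j k  c d  l j
    d c  m n  c e  m l  e c  h j  d e  o p  e d  p q  q p  a p  p a  l p
    j l  t r  r h  j f  g d  w q  o y  x y  y x  x z  z x  y z  z y
);
my @pairs;
push @pairs, [splice(@flat, 0, 2)] while @flat;

print "[ @$_ ]\n" for maximal_groups(mutual_graph(@pairs));
# prints:
# [ a c d e ]
# [ x y z ]
```

Unlike the subset search, this reports every maximal group of at least MIN_SIZE members, including overlapping ones, so its output can differ from the script above on other inputs.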