Editorial Team
Asked: May 21, 2026


I should explain as background to this question that I don’t know any Perl, and have a violent allergy to regular expressions (we all have our weaknesses). I’m trying to figure out why a Perl program won’t accept the data I’m feeding it. I don’t need to understand this program in any depth; I’m just doing a timing comparison.

Consider this assignment statement:

($sample_ls_id) = $sample_ls_id =~ /:\w\w(\d+):/;

If I understand this correctly, it is checking if sample_ls_id matches some regex, and if so, assigning the entire string, or something like that.

However, I don’t understand how this works.
According to the documentation, namely perldoc perlretut, which I looked at briefly

$sample_ls_id =~ /:\w\w(\d+):/

just returns true or false if there is a match.

The strings I’m trying to match look like

1000    10      0       0       1        urn:lsid:dcc.hapmap.org:Individual:CEPH1000.10:1        urn:lsid:dcc.hapmap.org:Sample:SAMPLE1:1

This fails with the error

Use of uninitialized value $sample_ls_id in concatenation (.) or string
at database/populate/family.pl line 38, <INPUT> line 1.

Line 38 is

print OUTPUT "$sample_ls_id\t$family_ped_id\t$individual_ped_id\t$father_ped_id\t$mother_ped_id\t$sex\t$created_by\t$population_code\n";

See the complete script below. However, the apparently very similar string

1420    9       0       0       1       urn:lsid:dcc.hapmap.org:Individual:CEPH1420.09:1  urn:lsid:dcc.hapmap.org:Sample:NA12003:1

seems to pass.

For context, the entire piece of code is:

use strict;
use warnings;
use Getopt::Long;

my $input_file = "data/family_ceu.txt";
my $output_file = "sql/family_ceu.sql";
my $population_code = "CEU";

GetOptions ('i=s' => \$input_file,
            'o=s' => \$output_file,
            'p=s' => \$population_code
            );

usagecheck();

my $created_by = 'gwas_analyzer';

print "Creating SQL file for inserting family data from $input_file\n";

open (INPUT, "< $input_file");
open (OUTPUT, "> $output_file");

print OUTPUT "INSERT INTO population (population_code, private) VALUES ('$population_code', 'f');\n";
print OUTPUT "COPY family (ls_id, family_ped_id, individual_ped_id, father_ped_id, mother_ped_id, sex, created_by, population_code) FROM stdin;                      
";

while (my $line = <INPUT>)
{
    chomp $line;

    #Skip any comment lines 
    next if($line =~ /^#/);

    my ($family_ped_id, $individual_ped_id, $father_ped_id, $mother_ped_id, $sex, $individual_ls_id, $sample_ls_id) = split (/\t/, $line);

    ($sample_ls_id) = $sample_ls_id =~ /:\w\w(\d+):/;

    print OUTPUT "$sample_ls_id\t$family_ped_id\t$individual_ped_id\t$father_ped_id\t$mother_ped_id\t$sex\t$created_by\t$population_code\n";
}

print OUTPUT "\\.\n";
close OUTPUT;

sub usagecheck
{
    if (!$input_file || !$output_file || !$population_code)
    {
        print "Missing argument (see required arguments below):\n";
        usage();
        exit;
    }
}

sub usage
{
    print "perl family.pl -i <input file> -o <output file> -p <population code>\n";
}

I’m sure this is a very simple question if you know regexes and Perl.

1 Answer

  1. Editorial Team
    Added an answer on May 21, 2026 at 7:57 pm

    When $sample_ls_id = 'urn:lsid:dcc.hapmap.org:Sample:SAMPLE1:1';

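    The regular expression ‘/:\w\w(\d+):/’ fails. It matches only when the string contains a colon ‘:’ followed by exactly two “word” characters ‘\w\w’, then one or more digits ‘\d+’, then another colon ‘:’. In ‘:SAMPLE1:’ the two word characters ‘SA’ are followed by ‘M’, which is not a digit, and no other colon in the string is followed by two word characters and then digits, so there is no match.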
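The description above can be written out as a commented pattern. This is only a sketch: the `/x` modifier and the `$pattern` variable are added here for illustration and are not in the original script, but the pattern itself is the same.

```perl
use strict;
use warnings;

# Same pattern as in the script, spelled out with the x modifier so that
# whitespace and comments inside the regex are ignored.
my $pattern = qr/
    :        # a literal colon
    \w\w     # exactly two word characters (letter, digit or underscore)
    (\d+)    # one or more digits, captured as group 1
    :        # another literal colon
/x;

# The two sample identifiers quoted in the question.
for my $id ('urn:lsid:dcc.hapmap.org:Sample:SAMPLE1:1',
            'urn:lsid:dcc.hapmap.org:Sample:NA12003:1') {
    print "${id}: ", ($id =~ $pattern ? "match" : "no match"), "\n";
}
```

Running it should report "no match" for the SAMPLE1 identifier and "match" for the NA12003 one.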

    When $sample_ls_id = 'urn:lsid:dcc.hapmap.org:Sample:NA12003:1';

    The regular expression ‘/:\w\w(\d+):/’ finds its match in ‘:NA12003:’ (a colon, two word characters, digits, and a colon).

    my $sample_ls_id = 'urn:lsid:dcc.hapmap.org:Sample:NA12003:1';
    ($sample_ls_id) = $sample_ls_id =~ /:\w\w(\d+):/;
    

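    The parentheses around ‘($sample_ls_id)’ put the match in list context, where it returns the list of captured groups rather than a true/false value. ‘$sample_ls_id’ therefore receives the ‘(\d+)’ portion of the match (also stored in $1), which in this case is 12003.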
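The difference between the two contexts can be seen side by side. A minimal sketch, with `$id`, `$matched` and `$digits` as illustrative names that do not appear in the original script:

```perl
use strict;
use warnings;

my $id = 'urn:lsid:dcc.hapmap.org:Sample:NA12003:1';

# Scalar context: the match yields a true/false value.
my $matched = $id =~ /:\w\w(\d+):/;

# List context (note the parentheses on the left): the match yields
# the list of captured groups, here a single element.
my ($digits) = $id =~ /:\w\w(\d+):/;

print "scalar context: ", ($matched ? "true" : "false"), "\n";
print "list context:   $digits\n";
```

This is why perlretut's "returns true or false" description and the assignment in the script are both correct: they describe the same operator in different contexts.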

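    You were getting the warning with the earlier example because the regular expression fails there: a failed match in list context returns an empty list, which leaves ‘$sample_ls_id’ undefined, and the print statement on line 38 then interpolates an undefined value.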
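One way to make the failure visible is to test the result before using it. This is a sketch under assumptions, not a change the original script requires: the `$number` variable and the wording of the message are hypothetical.

```perl
use strict;
use warnings;

my $sample_ls_id = 'urn:lsid:dcc.hapmap.org:Sample:SAMPLE1:1';

# The match fails, so the right-hand side is an empty list and
# $number is left undefined.
my ($number) = $sample_ls_id =~ /:\w\w(\d+):/;

if (defined $number) {
    print "sample number: $number\n";
} else {
    # Hypothetical handling: report the bad value instead of printing undef.
    warn "Unrecognised sample id: $sample_ls_id\n";
}
```

Assigning to a separate variable also keeps the original identifier available for the message, whereas the script overwrites ‘$sample_ls_id’ with the result of the match.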
