The Archive Base

Editorial Team
Asked: May 23, 2026

For an application I’m developing I need a Perl script which loops through a


For an application I’m developing I need a Perl script which loops through a massive CSV file and ensures that every single line contains a valid URI. I already asked a question earlier about parsing a CSV file and I have started using Text::CSV to make my life a lot easier. Now I have the issue of ensuring that the URI is valid.

Due to the nature of my application, URIs do not need to take the full form of

protocol://username:password@domain.extension/request?vars=values

Rather I am only interested in the request portion of this. For a general website, that would be anything after the .com, .edu, etc.

I currently have the following Perl script:

if($_ !~ /^(?:[a-z0-9-._~!$&'()*+,;=:/?@]|%[0-9A-F]{2})*$/i){
    print "Invalid URL format";
    exit;
} else {
    /* stuff */
}

The regex should be fairly straightforward. The request is allowed to contain either one of a small set of symbols ([a-z0-9-._~!$&'()*+,;=:/?@]) or a percent sign (%) followed by two hexadecimal digits. Either of these patterns may be repeated indefinitely.

When I run this script I get the following error:

Number found where operator expected at ./301rules.pl line 58, near "%[0"
        (Missing operator before 0?)
Bareword found where operator expected at ./301rules.pl line 58, near "9A"
        (Missing operator before A?)
Bareword found where operator expected at ./301rules.pl line 58, near "$/i"
        (Missing operator before i?)
syntax error at ./301rules.pl line 58, near "%[0"

It’s fairly obvious that something in my regex needs to be escaped, however I’m unsure of what. I tried escaping every possible symbol to create the following regex:

if($_ !~ /^(?:[a-z0-9\-\.\_\~\!\$\&\'\(\)\*\+\,\;\=\:\/\?\@]|%[0-9A-F]{2})*$/i){

However, when I did this it just allowed every string to pass the test, even strings which I know are invalid, such as te%st or é.

So does anyone have experience with Perl regex and know what I need to escape and what I should not escape? With 19 different symbols I don’t feel like trying all 2^19 = 524288 possibilities.

EDIT – voting to close. I found out that the issue actually existed immediately above this loop, although I don’t entirely understand why yet.

I had:

if( $_ == "" ){
    next;
}
/* regex conditional from above */

For whatever reason it kept evaluating to true and going to the next iteration despite there clearly being data stored in $_. I’ll figure out why this was, but for now the regex works fine with everything escaped.
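
A likely explanation for the behaviour described in the edit, offered as a sketch rather than a confirmed diagnosis of the original script: == is Perl's numeric comparison, so both sides are converted to numbers first. A non-numeric string such as a request path becomes 0, and so does "", which makes the test true and triggers next. The string operator eq compares text instead. The variable $line and its sample value below are hypothetical stand-ins for $_.

```perl
use strict;
use warnings;

my $line = 'some/request';    # hypothetical sample value standing in for $_

# Numeric comparison: both operands are converted to numbers. A non-numeric
# string converts to 0, and so does "", so this is effectively 0 == 0.
my $numeric_match = do {
    no warnings 'numeric';    # silence the "isn't numeric" warning for the demo
    $line == "" ? 1 : 0;
};

# String comparison: true only when $line really is the empty string.
my $string_match = $line eq "" ? 1 : 0;

print "==: $numeric_match, eq: $string_match\n";
```

Running with warnings enabled (and without the no warnings line) would have reported the non-numeric argument, which is one way to catch this kind of mistake early.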


1 Answer

  1. Editorial Team
    Added an answer on May 23, 2026 at 10:24 am

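    I don’t know how you arrived at your first regex, but I’ll try to help you fix it. What breaks it is the unescaped /: it is the pattern delimiter, so Perl ends the regex at the / inside the character class and tries to parse the rest as code, which is where the errors near "%[0" come from. Inside a character class you only have to escape the characters that are still special there: / (the delimiter), $ (otherwise $& is interpolated as a variable) and, to be safe, -. Escaping ., (, ) and * as well is harmless but not required, so the regex can look like: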

    if($_ !~ /^(?:[a-z0-9\-\._~!\$&'\(\)\*+,;=:\/?@]|%[0-9A-F]{2})*$/i){
    

    The (?: ... ) is a non-capturing group: it groups the two alternatives without saving what they match. The character class that follows it does not need a quantifier of its own, because the * after the closing parenthesis repeats the whole group. Likewise, the | alternates between the character class and the full %[0-9A-F]{2} sequence, not just the % sign, since alternation extends to the boundaries of the enclosing group. No extra parentheses around %[0-9A-F]{2} are needed.
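
    As a minimal sketch of the same check, assuming an alternative delimiter is acceptable: writing the pattern as m{...} means / no longer has to be escaped, placing - first in the class keeps it literal, and \$ prevents Perl from interpolating $&. The helper name is_valid_request and the sample strings are hypothetical and not part of the original script. \A and \z are used here instead of ^ and $ so that a trailing newline is not accepted.

```perl
use strict;
use warnings;

# Hypothetical helper (not from the original post): returns 1 when the
# request portion consists only of the allowed characters or %XX escapes.
# m{...} is the delimiter, so "/" needs no backslash inside the class;
# "\$" stops interpolation of the special variable $&; "-" comes first in
# the class so it is taken literally.
sub is_valid_request {
    my ($request) = @_;
    return $request =~ m{\A(?:[-a-z0-9._~!\$&'()*+,;=:/?@]|%[0-9a-f]{2})*\z}i ? 1 : 0;
}

for my $candidate ('/path/to/page?x=1&y=2', '/a%20b', 'te%st', "caf\x{e9}") {
    printf "%-24s %s\n", $candidate, is_valid_request($candidate) ? 'valid' : 'invalid';
}
```

    With these samples, the first two candidates are reported as valid, while te%st and café are rejected.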
