The Archive Base

Editorial Team
Asked: May 15, 2026 (2026-05-15T13:56:43+00:00)

I have a complex regular expression I’ve built with code. I want to normalize

I have a complex regular expression I’ve built with code. I want to normalize it to the simplest (canonical) form that will be an equivalent regular expression but without the extra brackets and so on.

I want it to be normalized so I can understand if it’s correct and find bugs in it.

Here is an example of a regular expression I want to normalize:

^(?:(?:(?:\r\n(?:[ \t]+))*)(<transfer-coding>(?:chunked|(?:(?:[\x21\x23-\x27\x2A\x2B\x2D\x2E0-9A-Z\x5E\x7A\x7C\x7E-\xFE]+)(?:(?:;(?:(?:[\x21\x23-\x27\x2A\x2B\x2D\x2E0-9A-Z\x5E\x7A\x7C\x7E-\xFE]+)=(?:(?:[\x21\x23-\x27\x2A\x2B\x2D\x2E0-9A-Z\x5E\x7A\x7C\x7E-\xFE]+)|(?:"(?:(?:(?:|[^\x00-\x31\x127\"])|(?:\\[\x00-\x127]))*)))))*))))(?:(?:(?:\r\n(?:[ \t]+))*),(?:(?:\r\n(?:[ \t]+))*)(<transfer-coding>(?:chunked|(?:(?:[\x21\x23-\x27\x2A\x2B\x2D\x2E0-9A-Z\x5E\x7A\x7C\x7E-\xFE]+)(?:(?:;(?:(?:[\x21\x23-\x27\x2A\x2B\x2D\x2E0-9A-Z\x5E\x7A\x7C\x7E-\xFE]+)=(?:(?:[\x21\x23-\x27\x2A\x2B\x2D\x2E0-9A-Z\x5E\x7A\x7C\x7E-\xFE]+)|(?:"(?:(?:(?:|[^\x00-\x31\x127\"])|(?:\\[\x00-\x127]))*)))))*))))*))$
1 Answer

  1. Editorial Team
    Added an answer on May 15, 2026 at 1:56 pm (2026-05-15T13:56:44+00:00)

    I’m with the other answers and comments so far. Even if you could define a reduced form, it’s unlikely that the reduced form is going to be any more understandable than this thing, which resembles line noise on a 1200 baud modem.

    If you did want to find a canonical form for regular expressions, I’d start by defining precisely what you mean by “canonical form”. For example, suppose you have the regular expression [ABCDEF-I]. Is the canonical form (1) [ABCDEF-I], (2) [ABCDEFGHI] or (3) [A-I]?

    That is, when canonicalizing, do you want to (1) leave character classes like this one untouched, (2) eliminate all “-” operators, thereby simplifying the expression, or (3) make it as short as possible?
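
    To make choices (2) and (3) concrete, here is a minimal sketch that expands a character class body into the set of characters it denotes and then re-emits it with the longest possible ranges. The names Expand, Compress and Canonical are made up for this illustration, and the sketch assumes a very simple class body: no escapes, no negation, and “-” only ever used between two range endpoints.

    ```csharp
    using System;
    using System.Collections.Generic;
    using System.Linq;
    using System.Text;

    // Expand a simple class body (e.g. "ABCDEF-I") into the characters it denotes.
    // Assumption: no escapes, no negation, '-' only between two range endpoints.
    SortedSet<char> Expand(string body)
    {
        var set = new SortedSet<char>();
        for (int i = 0; i < body.Length; i++)
        {
            if (i + 2 < body.Length && body[i + 1] == '-')
            {
                for (int c = body[i]; c <= body[i + 2]; c++) set.Add((char)c);
                i += 2; // skip the '-' and the range end
            }
            else
            {
                set.Add(body[i]);
            }
        }
        return set;
    }

    // Re-emit the set, writing every run of three or more consecutive characters as a range.
    string Compress(SortedSet<char> set)
    {
        var chars = set.ToList();
        var sb = new StringBuilder("[");
        int i = 0;
        while (i < chars.Count)
        {
            int j = i;
            while (j + 1 < chars.Count && chars[j + 1] == chars[j] + 1) j++;
            sb.Append(chars[i]);
            if (j - i >= 2) sb.Append('-').Append(chars[j]);
            else if (j > i) sb.Append(chars[j]);
            i = j + 1;
        }
        return sb.Append(']').ToString();
    }

    string Canonical(string body) => Compress(Expand(body));

    Console.WriteLine("[" + string.Concat(Expand("ABCDEF-I")) + "]"); // choice (2): [ABCDEFGHI]
    Console.WriteLine(Canonical("ABCDEF-I"));                         // choice (3): [A-I]
    ```

    Going through a sorted set means the result does not depend on the order in which the original class listed its members, which is the property you want from anything called canonical.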

    The simplest way would be to go through every part of the regular expression specification and work out which subexpressions are logically equivalent to another form, and decide which of the two is “more canonical”. Then write a recursive regular expression analyzer that goes through a regular expression and replaces each subexpression with its canonical form. Keep doing that in a loop until you find the “fixed point”, the regular expression that doesn’t change when you put it in canonical form.
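
    The loop itself is simple. Here is a rough sketch of it with two example rewrite rules, both of which I made up for illustration. To keep it short the rules are text substitutions rather than the recursive analyzer described above, so they do not really parse the pattern; they are written conservatively and only fire on shapes where the rewrite is safe.

    ```csharp
    using System;
    using System.Linq;
    using System.Text.RegularExpressions;

    // Each rule maps a pattern string to an equivalent, slightly simpler one.
    // Both rules are hypothetical examples, not a complete rule set.
    var rules = new Func<string, string>[]
    {
        // (?:[...])  ->  [...]   a group around a lone character class is redundant
        p => Regex.Replace(p, @"(?<!\\)\(\?:(\[(?:[^\]\\]|\\.)*\])\)", "$1"),
        // (?:(?:X))  ->  (?:X)   when X contains no unescaped parentheses
        p => Regex.Replace(p, @"(?<!\\)\(\?:(\(\?:(?:[^()\\]|\\.)*\))\)", "$1"),
    };

    // Apply every rule in turn, and repeat until a full pass changes nothing.
    string Canonicalize(string pattern)
    {
        while (true)
        {
            string next = rules.Aggregate(pattern, (p, rule) => rule(p));
            if (next == pattern) return pattern; // fixed point reached
            pattern = next;
        }
    }

    Console.WriteLine(Canonicalize("(?:(?:(?:[a-z])))+")); // [a-z]+
    ```

    Each rule only ever shortens the pattern, so the loop is guaranteed to stop. If you add rules that can lengthen a pattern you need some other argument for termination.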

    That, however, will not necessarily do what you want. If what you want is to reorganize the regular expression to minimize the complexity of grouping or some such thing, then you might first canonicalize it into a form that uses only grouping, concatenation, union and Kleene star operators. Once it is in that form you can translate it into a deterministic finite automaton, and once it is in DFA form you can run a state minimization algorithm on the DFA to form an equivalent DFA with fewer states. Then you can turn the resulting simplified DFA back into a regular expression.
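
    As a sketch of just the middle step, here is a small Moore-style partition refinement that finds which states of a DFA are equivalent and could be merged. It does not build the DFA from a regular expression or turn it back into one. The function name, the transition-table layout and the example automaton are all assumptions of mine, and the DFA is assumed to be complete.

    ```csharp
    using System;
    using System.Collections.Generic;
    using System.Linq;

    // delta[s, a] is the state reached from state s on symbol a.
    // Returns, for each state, the id of its equivalence class.
    int[] EquivalenceClasses(int[,] delta, bool[] accepting)
    {
        int n = accepting.Length, k = delta.GetLength(1);
        // Start by separating accepting states from non-accepting ones.
        int[] cls = accepting.Select(acc => acc ? 1 : 0).ToArray();
        while (true)
        {
            var ids = new Dictionary<string, int>();
            var next = new int[n];
            for (int s = 0; s < n; s++)
            {
                // Signature: the state's own class plus the classes of its successors.
                var targets = new List<int>();
                for (int a = 0; a < k; a++) targets.Add(cls[delta[s, a]]);
                string sig = cls[s] + ":" + string.Join(",", targets);
                if (!ids.TryGetValue(sig, out int id))
                {
                    id = ids.Count;
                    ids.Add(sig, id);
                }
                next[s] = id;
            }
            // Refinement only ever splits classes, so an unchanged count is a fixed point.
            if (ids.Count == cls.Distinct().Count()) return next;
            cls = next;
        }
    }

    // Hypothetical DFA over {a, b} accepting strings that end in 'b', with redundant states.
    var table = new int[,] { { 1, 2 }, { 1, 2 }, { 1, 3 }, { 1, 3 } };
    var isFinal = new[] { false, false, true, true };
    Console.WriteLine(string.Join(" ", EquivalenceClasses(table, isFinal))); // 0 0 1 1
    ```

    States that share a class id can be merged, so the four-state example collapses to two states.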

    Though that would be fascinating, like I said, I don’t think it would actually solve your problem. Your problem, as I understand it, is a practical one. You have this mess, and you want to understand that it is right.

    I would take a completely different tack on that problem. If the problem is that the literal string is hard to read, then don’t write it as a literal string. I’d start “simplifying” your regular expression by making it read like a programming language instead of reading like line noise:

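    // Small helpers that wrap a pattern fragment in regex syntax.
    Func<string, string> group = s => "(?:" + s + ")";   // non-capturing group
    Func<string, string> capture = s => "(" + s + ")";   // capturing group
    Func<string, string> anynumberof = s => s + "*";     // zero or more
    Func<string, string> oneormoreof = s => s + "+";     // one or more
    // Named building blocks for the literal pieces of the pattern.
    var beginning = "^";
    var end = "$";
    var newline = @"\r\n";
    var tab = @"\t";
    var space = " ";
    var semi = ";";
    var comma = ",";
    var equal = "=";
    var chunked = "chunked";
    var transfer = "<transfer-coding>";
    var backslash = @"\\";
    // A backslash followed by any ASCII character
    // (the question's pattern writes this range as [\x00-\x127]).
    var escape = group(backslash + @"[\x00-\x7f]");
    var or = "|";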
    var whitespace = 
        group(
            anynumberof(
                group(
                    newline +  
                    group(
                        oneormoreof(@"[ \t]")))));
    var legalchars = 
        group(
            oneormoreof(@"[\x21\x23-\x27\x2A\x2B\x2D\x2E0-9A-Z\x5E\x7A\x7C\x7E-\xFE]"));
    
    var re = 
        beginning + 
        group(
            whitespace + 
            capture(
                transfer + 
                group(
                    chunked + 
                    or + 
                    group(
                        legalchars + 
                        group(
                            group(
                                semi + 
                                anynumberof(
                                    group(
                                        legalchars + 
                                        equal +
    

    …

    Once it looks like that it’ll be a lot easier to understand and optimize.
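
    One practical payoff is that each named piece can be tested on its own. Here is a small sketch, assuming the same helper definitions as above; matchesWhole is a hypothetical helper I added, and the sample inputs are made up.

    ```csharp
    using System;
    using System.Text.RegularExpressions;

    Func<string, string> group = s => "(?:" + s + ")";
    Func<string, string> anynumberof = s => s + "*";
    Func<string, string> oneormoreof = s => s + "+";
    var newline = @"\r\n";
    var whitespace = group(anynumberof(group(newline + group(oneormoreof(@"[ \t]")))));
    var legalchars = group(oneormoreof(@"[\x21\x23-\x27\x2A\x2B\x2D\x2E0-9A-Z\x5E\x7A\x7C\x7E-\xFE]"));

    // Hypothetical helper: does the fragment match the entire input?
    Func<string, string, bool> matchesWhole =
        (fragment, input) => Regex.IsMatch(input, @"\A" + group(fragment) + @"\z");

    Console.WriteLine(matchesWhole(whitespace, "\r\n \t")); // True
    Console.WriteLine(matchesWhole(whitespace, "\r\n"));    // False: no space or tab after the line break
    Console.WriteLine(matchesWhole(legalchars, "GZIP"));    // True
    Console.WriteLine(matchesWhole(legalchars, "gzip"));    // False: g, i and p are not in the class
    ```

    The last line is the kind of thing that is nearly invisible in the one-line version: the character class contains A-Z and \x7A (z) but no other lowercase letters. Whether that is a bug depends on what the pattern was meant to accept, but at least now you can see it and ask the question.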
