Sign Up

Sign Up to our social questions and Answers Engine to ask questions, answer people’s questions, and connect with other people.

Have an account? Sign In

Have an account? Sign In Now

Sign In

Login to our social questions & Answers Engine to ask questions answer people’s questions & connect with other people.

Sign Up Here

Forgot Password?

Don't have account, Sign Up Here

Forgot Password

Lost your password? Please enter your email address. You will receive a link and will create a new password via email.

Have an account? Sign In Now

You must login to ask a question.

Forgot Password?

Need An Account, Sign Up Here

Please briefly explain why you feel this question should be reported.

Please briefly explain why you feel this answer should be reported.

Please briefly explain why you feel this user should be reported.

Sign InSign Up

The Archive Base

The Archive Base Logo The Archive Base Logo

The Archive Base Navigation

  • SEARCH
  • Home
  • About Us
  • Blog
  • Contact Us
Search
Ask A Question

Mobile menu

Close
Ask a Question
  • Home
  • Add group
  • Groups page
  • Feed
  • User Profile
  • Communities
  • Questions
    • New Questions
    • Trending Questions
    • Must read Questions
    • Hot Questions
  • Polls
  • Tags
  • Badges
  • Buy Points
  • Users
  • Help
  • Buy Theme
  • SEARCH
Home/ Questions/Q 1034715
In Process

The Archive Base Latest Questions

Editorial Team
  • 0
Editorial Team
Asked: May 16, 20262026-05-16T14:26:10+00:00 2026-05-16T14:26:10+00:00

The goal Today’s Code Golf challenge is to create a regex parser in as

  • 0

The goal

Today’s Code Golf challenge is to create a regex parser in as few characters as possible.

The syntax

No, I’m not asking you to match Perl-style regular expressions. There’s already a very reliable interpreter for those, after all! 🙂

Here’s all you need to know about regex syntax for this challenge:

  • A term is defined as a single literal character, or a regular expression within grouping parentheses ().
  • The * (asterisk) character represents a Kleene star operation on the previous TERM. This means zero or more of the previous term, concatenated together.
  • The + (plus) character represents a convenient shortcut: a+ is equivalent to aa*, meaning one or more of the previous term.
  • The ? (question mark) character represents zero or one of the previous term.
  • The | (pipe) character represents an alternation, meaning that the REGULAR EXPRESSIONS on either side can be used in the match.
  • All other characters are assumed to be literal. You may assume that all other characters are within [0-9A-Za-z] (i.e., all English alphanumerics).

Or, put another way: */+/? have highest precedence, then concatenation, then alternation. Since alternation has lower precedence than concatenation, its use within a regex without parentheses causes it to be bound to the full regex on each side. * and + and ?, on the other hand, would just apply to the immediately preceding term.

The challenge

Your challenge is to write a program that will compile or interpret a regular expression (as defined above) and then test a number of strings against it.

I’m leaving input up to you. My recommendation would be that the regex should probably come first, and then any number of strings to be tested against it; but if you want to make it last, that’s fine. If you want to put everything in command-line arguments or into stdin, or the regex in command-line and the strings in stdin, or whatever, that’s fine. Just show a usage example or two.

Output should be true or false, one per line, to reflect whether or not the regex matches.

Notes:

  • I shouldn’t need to say this… but don’t use any regex libraries in your language! You need to compile or interpret the pattern yourself. (Edit: You may use regex if it’s required for splitting or joining strings. You just can’t use it to directly solve the problem, e.g., converting the input regex into a language regex and using that.)
  • The regular expression must COMPLETELY match the input string for this challenge. (Equivalently, if you’re familiar with Perl-like regex, assume that start- and end-of-string anchoring is in place for all matches)
  • For this challenge, all of the special characters ()*+?| are not expected to occur literally. If one comes up in the input, it is safe to assume that no pattern can match the string in question.
  • Input strings to test should be evaluated in a case-sensitive manner.

The examples

For the examples, I’m assuming everything is done in command-line arguments, regex first. (As I said above, input is up to you.) myregex here represents your invocation of the program.

> myregex easy easy Easy hard
true
false
false

> myregex ab*a aa abba abab b
true
true
false
false

> myregex 0*1|10 1 10 0110 00001
true
true
false
true

> myregex 0*(1|1+0) 1 10 0110 00001
true
true
true
true

> myregex a?b+|(a+b|b+a?)+ abb babab aaa aabba a b
true
true
false
true
false
true

NOTE: Sorry, forgot to make community wiki! 🙁

  • 1 1 Answer
  • 0 Views
  • 0 Followers
  • 0
Share
  • Facebook
  • Report

Leave an answer
Cancel reply

You must login to add an answer.

Forgot Password?

Need An Account, Sign Up Here

1 Answer

  • Voted
  • Oldest
  • Recent
  • Random
  1. Editorial Team
    Editorial Team
    2026-05-16T14:26:10+00:00Added an answer on May 16, 2026 at 2:26 pm

    GolfScript – 254 chars

    n%([]:B:$:_"()"@*{:I"()*+|?"[{}/]?[{[[0B$,+:B))\;)]_]+}{B)):ß;:B;qß(:ß;}{8q}{[[0ß0$,)]]+}:8{[[0B-1=:ß)]]+:$q}{ß>$ß<\([0+$,+]\++}:q{[[I$,:ß)]]+}]=~:$}/;{n+[0]:3\{:c;;3:1_:3;{,}{)[$=]_*2/{~\.{c={3|:3}*;}{;.1|1,\:1,<{+0}*;}if}/}/;}/;1$,?)"true""false"if n}%
    

    Somewhat straightforwardly, the first loop converts the regex into an NFA, which the second loop runs.

    Sun Aug 22 00:58:24 EST 2010 (271→266) changed variable names to remove spaces
    Sun Aug 22 01:07:11 EST 2010 (266→265) made [] a variable
    Sun Aug 22 07:05:50 EST 2010 (265→259) made null state transitions inline
    Sun Aug 22 07:19:21 EST 2010 (259→256) final state made implicit
    Mon Feb 7 19:24:19 EST 2011 (256→254) using "()""str"*

    $ echo "ab*a aa abba abab b"|tr " " "\n"|golfscript regex.gs
    true
    true
    false
    false
    
    $ echo "0*1|10 1 10 0110 00001"|tr " " "\n"|golfscript regex.gs
    true
    true
    false
    true
    
    $ echo "0*(1|1+0) 1 10 0110 00001"|tr " " "\n"|golfscript regex.gs
    true
    true
    true
    true
    
    $ echo "a?b+|(a+b|b+a?)+ abb babab aaa aabba a b"|tr " " "\n"|golfscript regex.gs
    true
    true
    false
    true
    false
    true
    
    $ echo "((A|B|C)+(a|(bbbbb)|bb|c)+)+ ABCABCaccabbbbbaACBbbb ABCABCaccabbbbbaACBbbbb"|tr " " "\n"|golfscript regex.gs
    false
    true
    
    • 0
    • Reply
    • Share
      Share
      • Share on Facebook
      • Share on Twitter
      • Share on LinkedIn
      • Share on WhatsApp
      • Report

Sidebar

Related Questions

No related questions found

Explore

  • Home
  • Add group
  • Groups page
  • Communities
  • Questions
    • New Questions
    • Trending Questions
    • Must read Questions
    • Hot Questions
  • Polls
  • Tags
  • Badges
  • Users
  • Help
  • SEARCH

Footer

© 2021 The Archive Base. All Rights Reserved
With Love by The Archive Base

Insert/edit link

Enter the destination URL

Or link to existing content

    No search term specified. Showing recent items. Search or use up and down arrow keys to select an item.