The Archive Base


Editorial Team
Asked: May 10, 2026 at 7:21 pm


Thinking about my other problem, I decided I can’t even create a regular expression that will match Roman numerals (let alone a context-free grammar that will generate them).

The problem is matching only valid Roman numerals. For example, 990 is NOT ‘XM’; it’s ‘CMXC’.

My problem in making the regex for this is that in order to allow or not allow certain characters, I need to look back. Let’s take thousands and hundreds, for example.

I can allow M{0,2}C?M (to allow for 900, 1000, 1900, 2000, 2900 and 3000). However, if the match is on CM, I can’t allow the following characters to be C or D (because I’m already at 900).

How can I express this in a regex?
If it’s simply not expressible in a regex, is it expressible in a context-free grammar?


1 Answer

  1. Added an answer on May 10, 2026 at 7:21 pm

    You can use the following regex for this:

    ^M{0,4}(CM|CD|D?C{0,3})(XC|XL|L?X{0,3})(IX|IV|V?I{0,3})$ 

    Breaking it down, M{0,4} specifies the thousands section and restricts it to between 0 and 4000. It’s relatively simple:

       0: <empty>  matched by M{0}
    1000: M        matched by M{1}
    2000: MM       matched by M{2}
    3000: MMM      matched by M{3}
    4000: MMMM     matched by M{4}

    You could, of course, use something like M* to allow any number (including zero) of thousands, if you want to allow bigger numbers.
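A minimal Python sketch of that unbounded variant, assuming Python's `re` module is the engine in use. The names `UNBOUNDED_ROMAN_RE` and `is_roman_unbounded` are made up for illustration; only the `M*` substitution comes from the paragraph above.

```python
import re

# Same pattern as above, but with M* instead of M{0,4} so that any number
# of leading thousands is accepted. The name is hypothetical.
UNBOUNDED_ROMAN_RE = re.compile(
    r"M*(CM|CD|D?C{0,3})(XC|XL|L?X{0,3})(IX|IV|V?I{0,3})"
)


def is_roman_unbounded(text: str) -> bool:
    """Return True if the whole string matches the unbounded pattern."""
    # fullmatch anchors at both ends, so ^ and $ are not needed here.
    return UNBOUNDED_ROMAN_RE.fullmatch(text) is not None


if __name__ == "__main__":
    print(is_roman_unbounded("MMMMMM"))     # six thousands
    print(is_roman_unbounded("MMMMMCMXC"))  # 5990
    print(is_roman_unbounded("XM"))         # still invalid
```

Only the thousands section changes; the hundreds, tens and units groups keep rejecting things like `XM`.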

    Next is (CM|CD|D?C{0,3}), which is slightly more complex. This handles the hundreds section and covers all the possibilities:

      0: <empty>  matched by D?C{0} (with D not there)
    100: C        matched by D?C{1} (with D not there)
    200: CC       matched by D?C{2} (with D not there)
    300: CCC      matched by D?C{3} (with D not there)
    400: CD       matched by CD
    500: D        matched by D?C{0} (with D there)
    600: DC       matched by D?C{1} (with D there)
    700: DCC      matched by D?C{2} (with D there)
    800: DCCC     matched by D?C{3} (with D there)
    900: CM       matched by CM

    Thirdly, (XC|XL|L?X{0,3}) follows the same rules as the previous section but for the tens place:

     0: <empty>  matched by L?X{0} (with L not there)
    10: X        matched by L?X{1} (with L not there)
    20: XX       matched by L?X{2} (with L not there)
    30: XXX      matched by L?X{3} (with L not there)
    40: XL       matched by XL
    50: L        matched by L?X{0} (with L there)
    60: LX       matched by L?X{1} (with L there)
    70: LXX      matched by L?X{2} (with L there)
    80: LXXX     matched by L?X{3} (with L there)
    90: XC       matched by XC

    And, finally, (IX|IV|V?I{0,3}) is the units section, handling 0 through 9 and also similar to the previous two sections (Roman numerals, despite their seeming weirdness, follow some logical rules once you figure out what they are):

    0: <empty>  matched by V?I{0} (with V not there)
    1: I        matched by V?I{1} (with V not there)
    2: II       matched by V?I{2} (with V not there)
    3: III      matched by V?I{3} (with V not there)
    4: IV       matched by IV
    5: V        matched by V?I{0} (with V there)
    6: VI       matched by V?I{1} (with V there)
    7: VII      matched by V?I{2} (with V there)
    8: VIII     matched by V?I{3} (with V there)
    9: IX       matched by IX
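If you want to convince yourself that the four sections fit together, here is a rough Python sketch that runs the pattern against the question's examples and against every numeral from 1 to 4999. It assumes Python's `re` module; `is_roman` and `to_roman` are hypothetical helper names, and `to_roman` is just the usual greedy conversion, used here only to generate test input.

```python
import re

# The pattern from the answer, unchanged.
ROMAN_RE = re.compile(
    r"^M{0,4}(CM|CD|D?C{0,3})(XC|XL|L?X{0,3})(IX|IV|V?I{0,3})$"
)


def is_roman(text: str) -> bool:
    """Return True if the whole string matches (the empty string does too)."""
    # fullmatch sidesteps Python's rule that $ can also match just before
    # a trailing newline.
    return ROMAN_RE.fullmatch(text) is not None


def to_roman(n: int) -> str:
    """Hypothetical helper: greedy conversion, for cross-checking only."""
    pairs = [
        (1000, "M"), (900, "CM"), (500, "D"), (400, "CD"),
        (100, "C"), (90, "XC"), (50, "L"), (40, "XL"),
        (10, "X"), (9, "IX"), (5, "V"), (4, "IV"), (1, "I"),
    ]
    parts = []
    for value, symbol in pairs:
        count, n = divmod(n, value)
        parts.append(symbol * count)
    return "".join(parts)


if __name__ == "__main__":
    print(is_roman("CMXC"))  # 990 from the question
    print(is_roman("XM"))    # the invalid spelling from the question
    print(all(is_roman(to_roman(n)) for n in range(1, 5000)))
```

The "looking back" the question worries about is handled by alternation: once a section has matched CM, that section is finished, so a following C or D has nowhere to go.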

    Keep in mind that this regex will also match an empty string. If you don’t want this (and your regex engine is modern enough), you can use positive look-ahead:

    ^(?=.)M{0,4}(CM|CD|D?C{0,3})(XC|XL|L?X{0,3})(IX|IV|V?I{0,3})$ 

    This is a "check to match but discard" operation, meaning it looks ahead to check that the first character exists (.) after the start marker (^) but doesn’t absorb that first character. For example, if the string was M, that would match the . but still be available for the next section of the regex, M{0,4}. However, the empty string would not match the look-ahead so would fail.
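A quick way to see the look-ahead at work, sketched in Python on the assumption that your engine is Python's `re` (which supports `(?=...)`). `NONEMPTY_ROMAN_RE` and `is_nonempty_roman` are illustrative names only.

```python
import re

# The look-ahead version from the answer, unchanged.
NONEMPTY_ROMAN_RE = re.compile(
    r"^(?=.)M{0,4}(CM|CD|D?C{0,3})(XC|XL|L?X{0,3})(IX|IV|V?I{0,3})$"
)


def is_nonempty_roman(text: str) -> bool:
    """Return True for a valid numeral, False for '' or anything invalid."""
    return NONEMPTY_ROMAN_RE.fullmatch(text) is not None


if __name__ == "__main__":
    print(is_nonempty_roman(""))      # rejected by the look-ahead
    print(is_nonempty_roman("M"))     # look-ahead peeks at M, M{0,4} consumes it
    print(is_nonempty_roman("CMXC"))  # 990
```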

    Another alternative, if you are not restricted to just a regex, would be to check that the length is not zero beforehand.
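As a sketch of that alternative, again assuming Python: check the length first, then apply the original pattern without any look-ahead. The function name `is_valid_roman` is hypothetical.

```python
import re

# The original pattern, without the look-ahead.
ROMAN_RE = re.compile(
    r"^M{0,4}(CM|CD|D?C{0,3})(XC|XL|L?X{0,3})(IX|IV|V?I{0,3})$"
)


def is_valid_roman(text: str) -> bool:
    """Reject the empty string up front, then let the regex do the rest."""
    if len(text) == 0:
        return False
    return ROMAN_RE.fullmatch(text) is not None


if __name__ == "__main__":
    print(is_valid_roman(""))      # caught by the length check
    print(is_valid_roman("XIV"))   # 14
    print(is_valid_roman("IIII"))  # more than three I's is not allowed
```

This keeps the regex portable to engines that lack look-ahead, at the cost of one extra check in the calling code.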


© 2021 The Archive Base. All Rights Reserved