The Archive Base

Editorial Team
Asked: May 11, 2026

Given this text: /* F004 (0309)00 */ /* field 1 */ /* field 2


Given this text:

    /* F004 (0309)00 */
    /* field 1 */
    /* field 2 */
    /* F004 (0409)00 */
    /* field 1 */
    /* field 2 */

how do I parse it into this array:
[
['F004'],['0309'],['/* field 1 */\n/* field 2 */'],
['F004'],['0409'],['/* field 1 */\n/* field 2 */']
]

I got code working to parse the first two items:

    form = /\/\*\s+(\w+)\s+\((\d{4})\)[0]{2}\s+\*\//m
    text.scan(form)

[
['F004'],['0309'],
['F004'],['0409']
]

And here’s the code where I try to parse all three and fail w/ an invalid regex error:

    form = /\/\*\s+(\w+)\s+\((\d{4})\)[0]{2}\s+\*\//m
    form_and_fields = /#{form}(.[^#{form}]+)/m
    text.scan(form_and_fields)

edit: This is what ended up working for me, thanks to both rampion, & singpolyma:

    form = /
      \/\*\s+(\w+)\s+\((\d+)\)\d+\s+\*\/    #formId & edDate
      (.+?)                                 #fieldText
      (?=\/\*\s+\w+\s+\(\d+\)\d+\s+\*\/|\Z) #stop at beginning of next form
                                            # or the end of the string
    /mx
    text.scan(form)
1 Answer

  1. Added an answer on May 11, 2026 at 9:20 am

    You seem to be misunderstanding how character classes (e.g. [a-f0-9], or [^aeiouy]) work. /[^abcd]/ doesn’t negate the pattern abcd; it says ‘match any single character that’s not 'a' or 'b' or 'c' or 'd'’.
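A quick way to see the difference, as a minimal sketch (the constant name and the sample strings are made up for illustration):

```ruby
# A negated character class excludes each listed character individually;
# it knows nothing about the sequence "abcd" as a whole.
NOT_ABCD_CHAR = /[^abcd]/

p "abcd".match?(NOT_ABCD_CHAR)  # => false (every character is in the set)
p "dcba".match?(NOT_ABCD_CHAR)  # => false (order does not matter)
p "abxd".match?(NOT_ABCD_CHAR)  # => true  ('x' is outside the set)
p "abxd"[NOT_ABCD_CHAR]         # => "x"   (it matches exactly one character)
```

So interpolating a whole regex into [^...] only builds a set out of that regex's individual characters; it never means "anything that isn't this pattern".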

    If you want to match the negation of a pattern, use the /(?!pattern)/ construct. It’s a zero-width match, meaning it doesn’t actually match any characters; it matches a position, similar to how /^/ and /$/ match the start and end of a line (in Ruby, /\A/ and /\z/ anchor the whole string), or how /\b/ matches the boundary of a word. For instance: /(?!xx)/ matches every position where the pattern ‘xx’ doesn’t start.

    In general then, after you use a pattern negation, you need to match some character to move forward in the string.
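To make the "position, not character" idea concrete, here is a small sketch. The helper `lookahead_positions`, the constant names, and the sample strings are hypothetical, chosen only for this illustration:

```ruby
LOOKAHEAD_SAMPLE = "axxb"

# Collect every position where the zero-width pattern (?!xx) succeeds.
def lookahead_positions(text)
  positions = []
  text.scan(/(?!xx)/) { positions << Regexp.last_match.begin(0) }
  positions
end

# Pairing the lookahead with "." consumes one character per successful check,
# so the match advances until "xx" is about to start.
UP_TO_XX = /(?:(?!xx).)+/

p lookahead_positions(LOOKAHEAD_SAMPLE)  # => [0, 2, 3, 4]
p "abxxcd"[UP_TO_XX]                     # => "ab"
```

Position 1 is missing from the first result because that is exactly where "xx" starts; every match is an empty string, which is why the lookahead has to be followed by something that consumes input.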

    So to use your pattern:

        form = /\/\*\s+(\w+)\s+\((\d{4})\)[0]{2}\s+\*\//m
        form_and_fields = /#{form}((?:(?!#{form}).)+)/m
        text.scan(form_and_fields)

    From the inside out (I’ll be using (?#comments))

    • (?!#{form}) negates your original pattern, so it matches any position where your original pattern can’t start.
    • (?:(?!#{form}).)+ means match one character after that, and try again, as many times as possible, but at least once. (?:(?#whatever)) is a non-capturing group, which is good for grouping without adding a capture.

    In irb, this gives:

        irb> text.scan(form_and_fields)
        => [['F004', '0309', '  \n    /* field 1 */  \n    /* field 2 */  \n    ', nil, nil],
            ['F004', '0409', '  \n    /* field 1 */  \n    /* field 2 */  \n', nil, nil]]

    The extra nils come from the capturing groups in form that are used in the negated pattern (?!#{form}) and therefore don’t capture anything on a successful match.
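The same effect shows up in a much smaller pattern. This is only an illustrative sketch with made-up constants, not part of the original code:

```ruby
# A group inside a negative lookahead can never hold text after a successful
# match, because the overall match requires the lookahead's contents to fail.
GROUP_IN_NEGATIVE = /a(?!(x))b/

# A group inside a positive lookahead keeps what it saw, even though the
# lookahead itself consumes nothing.
GROUP_IN_POSITIVE = /a(?=(b))/

p "ab".match(GROUP_IN_NEGATIVE).captures  # => [nil]
p "ab".match(GROUP_IN_POSITIVE).captures  # => ["b"]
```

Every capturing group in the pattern still gets a slot in the result, which is why String#scan returns five-element arrays here even though only three slots are ever filled.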

    This could be cleaned up some:

        form_and_fields = /#{form}\s*(.+?)\s*(?:(?=#{form})|\Z)/m
        text.scan(form_and_fields)

    Now, instead of a zero-width negative lookahead, we use a zero-width positive lookahead (?=#{form}) to match the position of the next occurrence of form. So in this regex, we match everything until the next occurrence of form (without including that next occurrence in our match). This lets us trim out some whitespace around the fields. We also have to check for the case where we hit the end of the string with /\Z/, since that could happen too.

    In irb:

        irb> text.scan(form_and_fields)
        => [['F004', '0309', '/* field 1 */  \n    /* field 2 */', 'F004', '0409'],
            ['F004', '0409', '/* field 1 */  \n    /* field 2 */', nil, nil]]

    Note now that the last two fields are populated the first time – b/c the capturing parens in the zero-width positive lookahead matched something, even though it wasn’t marked as ‘consumed’ during the process – which is why that bit could be rematched for the second time.
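If the two trailing slots are unwanted, one option is to give the lookahead a copy of the header pattern without capturing groups, which is what the asker's final regex does. A minimal sketch follows; `FORM_HEADER`, `NEXT_FORM_HEADER`, and `FORM_TEXT` are hypothetical names, and the sample text is assumed to have one comment per line with no indentation:

```ruby
FORM_TEXT = <<~SAMPLE
  /* F004 (0309)00 */
  /* field 1 */
  /* field 2 */
  /* F004 (0409)00 */
  /* field 1 */
  /* field 2 */
SAMPLE

# Header with capturing groups: form id and four-digit date.
FORM_HEADER = /\/\*\s+(\w+)\s+\((\d{4})\)0{2}\s+\*\//

# Same shape without captures, used only inside the lookahead.
NEXT_FORM_HEADER = /\/\*\s+\w+\s+\(\d{4}\)0{2}\s+\*\//

# Capture the header parts, then lazily take everything up to the next
# header or the end of the string.
FORM_AND_FIELDS = /#{FORM_HEADER}\s*(.+?)\s*(?=#{NEXT_FORM_HEADER}|\Z)/m

p FORM_TEXT.scan(FORM_AND_FIELDS)
# => [["F004", "0309", "/* field 1 */\n/* field 2 */"],
#     ["F004", "0409", "/* field 1 */\n/* field 2 */"]]
```

Duplicating the header pattern is a trade-off: the result has exactly three captures per form, but the two patterns have to be kept in sync by hand.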
