Question

Asked: May 11, 2026 (2026-05-11T19:46:01+00:00)

I’m attempting to write an application to extract properties and code from proprietary IDE design files. The file format looks something like this:

HEADING
{
  SUBHEADING1
  {
    PropName1 = PropVal1;
    PropName2 = PropVal2;
  }

  SUBHEADING2
  {
    { 1 ; PropVal1 ; PropValue2 }
    { 2 ; PropVal1 ; PropValue2 ; OnEvent1=BEGIN
                                             MESSAGE('Hello, World!');
                                             { block comments are between braces }
                                             //inline comments are after double-slashes
                                           END; 
    PropVal3 }
    { 1 ; PropVal1 ; PropVal2; PropVal3 }
  }
}

What I am trying to do is extract the contents under the subheading blocks. In the case of SUBHEADING2, I would also separate each token as delimited by the semicolons. I had reasonably good success with just counting the brackets and keeping track of what subheading I’m currently under. The main issue I encountered involves dealing with the code comments.

This language happens to use {} for block comments, which interferes with the braces in the file format. To make it even more interesting, the parser also needs to handle double-slash inline comments by ignoring everything up to the end of the line.

What is the best approach to tackling this? I looked at some of the compiler libraries discussed in another article (ANTLR, Doxygen, etc.) but they seem like overkill for solving this specific parsing issue.

1 Answer

Editorial Team · Answer 1 · 2026-05-11T19:46:01+00:00

You should be able to put something together in a few hours, using regular expressions in combination with some code that uses the results.

Something like this should work:

- Initialize the process by loading the file into a string.
- Pull each top-level block from the string, using regex capture groups to separately identify the block keyword and its contents.
- If a block is found:
  - Decide how to handle it based on the keyword.
  - Pass the content to this process recursively.
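The steps above could be sketched roughly as follows in Python. This is only an illustration under some assumptions that are not confirmed by the question: keywords are written in upper case, string literals use single quotes and stay on one line, and brace-delimited comments never contain an unmatched brace. The names `BLOCK_START`, `find_block_end` and `parse_blocks` are made up for this example. A regex finds the optional keyword and the opening brace, and a small scanner finds the matching closing brace while skipping `//` comments and strings.

```python
import re

# Assumed header shape: an optional UPPER-CASE keyword, whitespace, then "{".
BLOCK_START = re.compile(r"(?:\b([A-Z][A-Z0-9_]*))?\s*\{")


def find_block_end(text, open_pos):
    """Return the index of the "}" that closes the "{" at open_pos."""
    depth = 0
    i = open_pos
    while i < len(text):
        ch = text[i]
        if text.startswith("//", i):
            # Inline comment: ignore everything up to the end of the line.
            newline = text.find("\n", i)
            i = len(text) if newline == -1 else newline
            continue
        if ch == "'":
            # Skip a single-quoted string if it closes on the same line.
            close = text.find("'", i + 1)
            newline = text.find("\n", i + 1)
            if close != -1 and (newline == -1 or close < newline):
                i = close + 1
                continue
        if ch == "{":
            depth += 1
        elif ch == "}":
            depth -= 1
            if depth == 0:
                return i
        i += 1
    raise ValueError("unbalanced braces")


def parse_blocks(text):
    """Return a list of (keyword, content, children) for each top-level block."""
    blocks = []
    pos = 0
    while True:
        match = BLOCK_START.search(text, pos)
        if match is None:
            break
        open_pos = match.end() - 1
        close_pos = find_block_end(text, open_pos)
        keyword = match.group(1)
        content = text[open_pos + 1:close_pos]
        # Only keyword blocks are recursed into; braces inside a
        # keyword-less block are left alone (they may be comments).
        children = parse_blocks(content) if keyword else []
        blocks.append((keyword, content, children))
        pos = close_pos + 1
    return blocks


sample = "HEADING\n{\n  SUB\n  {\n    A = 1; // stray } here\n  }\n}\n"
for keyword, content, children in parse_blocks(sample):
    print(keyword, [child[0] for child in children])  # HEADING ['SUB']
```

Because the brace matching is done by the scanner rather than by the regex, the pattern itself stays simple. One known gap in this sketch: the header search does not skip comments, so a "{" inside a `//` comment that sits directly in a keyword block would be mistaken for a block start.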

Following this approach, you would process HEADING, then SUBHEADING1, then SUBHEADING2, then each of its sub-blocks. For the sub-block that contains the block comment, the lack of a keyword tells you that any braces nested inside it delimit a comment, so there is no need to process them any further.
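Once a keyword-less record has been isolated, splitting it on semicolons might look like the sketch below. It assumes, without confirmation from the question, that embedded code always opens with `BEGIN` and closes with a matching `END` (a construct that closes with `END` without a `BEGIN` would break the counting), and that strings use single quotes on one line. `TOKEN` and `split_record` are hypothetical names. Semicolons inside code, comments and strings are not treated as separators.

```python
import re

# Alternatives are tried left to right, so comments and strings are
# consumed whole before any ";" inside them can be seen.
TOKEN = re.compile(
    r"""
    (?P<line_comment>//[^\n]*)    |
    (?P<block_comment>\{[^}]*\})  |
    (?P<string>'[^'\n]*')         |
    (?P<begin>\bBEGIN\b)          |
    (?P<end>\bEND\b)              |
    (?P<semi>;)
    """,
    re.VERBOSE,
)


def split_record(record):
    """Split the text between a record's braces on top-level semicolons."""
    fields = []
    depth = 0   # current BEGIN/END nesting level
    start = 0   # where the current field begins
    for match in TOKEN.finditer(record):
        kind = match.lastgroup
        if kind == "begin":
            depth += 1
        elif kind == "end":
            depth = max(depth - 1, 0)
        elif kind == "semi" and depth == 0:
            fields.append(record[start:match.start()].strip())
            start = match.end()
    fields.append(record[start:].strip())
    return fields


print(split_record(" 1 ; PropVal1 ; PropValue2 "))
# ['1', 'PropVal1', 'PropValue2']
```

For the second record in the question, this keeps the whole `OnEvent1=BEGIN ... END` trigger together as one field, followed by `PropVal3`.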
