The Archive Base Latest Questions

Editorial Team
Asked: June 7, 2026

I wrote a C grammar for ParseKit that works perfectly, except for preprocessor statements, which are driving me crazy. What are the correct symbol definitions for preprocessor statements?

Here’s a short example of what I’ve tried:

@reportsCommentTokens = YES;
@commentState = '/';
@singleLineComments = '//';
@multiLineComments = '/*' '*/';
@commentState.fallbackState = delimitState;
@delimitState.fallbackState = symbolState;

@start = Empty | comments | preprocessor;

comments = comment*;
comment = Comment;

@symbols = '#include';

preprocessor = preprocessorIncludes;

preprocessorIncludes = preprocessorIncludeStatement*;
preprocessorIncludeStatement = preprocessorInclude quotedFileName*;

preprocessorInclude = '#include';
quotedFileName = QuotedString;

… but it doesn’t work. Treat it as a simplified grammar that should match only comments and #include statements with quoted file names (not the < > form). I tried the grammar on this simple file:

/*
 * Cryptographic API.
 *
 * RIPEMD-256 - RACE Integrity Primitives Evaluation Message Digest.
 *
 * Based on the reference implementation by Antoon Bosselaers, ESAT-COSIC
 *
 * Copyright (c) 2008 Adrian-Ken Rueegsegger <ken@codelabs.ch>
 *
 * This program is free software; you can redistribute it and/or modify it
 * under the terms of the GNU General Public License as published by the Free
 * Software Foundation; either version 2 of the License, or (at your option)
 * any later version.
 *
 */

// Here's one line comment

/* One line multiline comment */

#include "ripemd.h"

/* 2nd one line multiline comment */

… and parsing stops at /* One line multiline comment */: that comment is reported as a comment token, and then the parser fails silently.

So I tried splitting the ‘#include’ symbol into separate pieces:

@symbolState = '#' '#';
@symbol = '#';
numSymbol = '#';

preprocessorInclude = numSymbol 'include';

… but it still doesn’t help.

Maybe Todd can help, but what’s the correct way to handle ‘symbols’ like ‘#include’?

1 Answer

  • Voted
  • Oldest
  • Recent
  • Random
  1. Editorial Team
    Added an answer on June 7, 2026 at 11:20 pm

    Developer of ParseKit here.

    Robert, your grammar is very close, but I found that your use of nested * (zero-or-more) modifiers was causing the grammar to fail.

    I think the problem is that your @start grammar production already has Empty as a top-level option (|ed with the other two productions), but then the sub-productions for comments and preprocessor both contain productions with the * (zero-or-more) modifier. Those *s should really be + (one-or-more) modifiers because you have already accounted for the zero case with the top-level Empty.

    I’m not entirely sure, but I don’t think this problem is unique to ParseKit. I suspect the grammar itself was problematic and that other grammar toolkits would hit the same issue (I could be wrong).

    With that in mind, a few small tweaks to the grammar fixed it for me. Here’s the edited grammar:

    @reportsCommentTokens = YES;
    @commentState = '/';
    @singleLineComments = '//';
    @multiLineComments = '/*' '*/';
    @commentState.fallbackState = delimitState;
    @delimitState.fallbackState = symbolState;
    
    @start = (comments | preprocessor)*;
    
    comments = comment+;
    comment = Comment;
    
    @symbols = '#include';
    
    preprocessor = preprocessorIncludes;
    
    preprocessorIncludes = preprocessorIncludeStatement+;
    preprocessorIncludeStatement = preprocessorInclude quotedFileName;
    
    preprocessorInclude = '#include';
    quotedFileName = QuotedString;
    

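    Notice that I replaced the Empty in the top-level production with a * and swapped the nested *s for +s. I also dropped the * after quotedFileName, so each #include takes exactly one quoted file name. The top-level * additionally lets comments and #include statements be interleaved, which the sample file needs.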
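As a rough cross-check of which token sequences each grammar shape can accept, the two @start productions can be modelled as regular expressions over one letter per token. This is only a sketch under stated assumptions: the letters C, I and Q and the helper `accepts` are made up for illustration, and the model says nothing about how ParseKit itself matches or reports errors.

```python
import re

# One letter per token (an illustrative encoding, not ParseKit's):
#   C = Comment, I = '#include', Q = QuotedString
# Original shape: @start = Empty | comment* | (preprocessorInclude quotedFileName*)*
ORIGINAL = re.compile(r"(?:C*|(?:IQ*)*)")
# Edited shape:   @start = (comment+ | (preprocessorInclude quotedFileName)+)*
EDITED = re.compile(r"(?:C+|(?:IQ)+)*")


def accepts(pattern, tokens):
    """Return True if the whole token string matches the grammar shape."""
    return pattern.fullmatch(tokens) is not None


# The sample file: three comments, one include with a file name, one comment.
SAMPLE = "CCCIQC"

if __name__ == "__main__":
    print("original accepts sample:", accepts(ORIGINAL, SAMPLE))
    print("edited accepts sample:  ", accepts(EDITED, SAMPLE))
```

In this model the original shape accepts a run of comments or a run of includes but never a mixture, while the edited shape accepts the mixed sequence from the sample file.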

    With this edited grammar, I get the desired output (truncated slightly for clarity):

    [/*
     * Cryptographic API.
    ...
     */, // Here's one line comment, /* One line multiline comment */, #include, "ripemd.h", /* 2nd one line multiline comment */]/*
     * Cryptographic API.
    ...
     *//// Here's one line comment//* One line multiline comment *//#include/"ripemd.h"//* 2nd one line multiline comment */^
    

    Also, to find the issue, I first rewrote the grammar in a simpler form, which made the problem easier to spot, and then applied what I found to your original grammar. Here’s the simplified grammar, in case you’re interested. This is how I think of this particular grammar:

    @reportsCommentTokens = YES;
    @commentState = '/';
    @singleLineComments = '//';
    @multiLineComments = '/*' '*/';
    
    @start = (comment | macro)*;
    
    comment = Comment;
    
    macro = include; // to support other macros, add: ` | define | ifdef` etc.
    
    include = '#' 'include' QuotedString;
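For readers who want to see the shape of the simplified grammar outside ParseKit, here is a minimal hand-written sketch in Python. It is not ParseKit code: the tokenizer, the `MACROS` list and the function names are hypothetical, and it only handles comments and the quoted form of #include.

```python
import re

# Token kinds loosely mirror what the simplified grammar relies on.
TOKEN_RE = re.compile(r"""
    (?P<comment>/\*.*?\*/|//[^\n]*)   # multi-line or single-line comment
  | (?P<quoted>"[^"\n]*")             # quoted string
  | (?P<word>[A-Za-z_]\w*)            # identifier-like word
  | (?P<symbol>\#)                    # the hash symbol on its own
  | (?P<space>\s+)                    # whitespace, skipped
""", re.VERBOSE | re.DOTALL)


def tokenize(text):
    """Split text into (kind, text) pairs, dropping whitespace."""
    tokens, pos = [], 0
    while pos < len(text):
        m = TOKEN_RE.match(text, pos)
        if m is None:
            raise SyntaxError(f"unexpected character {text[pos]!r} at offset {pos}")
        if m.lastgroup != "space":
            tokens.append((m.lastgroup, m.group()))
        pos = m.end()
    return tokens


def parse_include(tokens, i):
    """include = '#' 'include' QuotedString;"""
    head = tokens[i:i + 2]
    if head == [("symbol", "#"), ("word", "include")]:
        if i + 2 < len(tokens) and tokens[i + 2][0] == "quoted":
            return ("include", tokens[i + 2][1]), i + 3
    return None


# macro = include;  further macro rules would be appended here.
MACROS = [parse_include]


def parse(tokens):
    """@start = (comment | macro)*;"""
    items, i = [], 0
    while i < len(tokens):
        kind, text = tokens[i]
        if kind == "comment":
            items.append(("comment", text))
            i += 1
            continue
        for rule in MACROS:
            result = rule(tokens, i)
            if result is not None:
                node, i = result
                items.append(node)
                break
        else:
            raise SyntaxError(f"unexpected token {text!r}")
    return items
```

Calling `parse(tokenize(source))` on text that contains only comments and quoted #include lines returns them in source order; anything else raises `SyntaxError` rather than failing silently.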
    

© 2021 The Archive Base. All Rights Reserved