I need to count the number of classes in a valid C# source file

Asked: May 19, 2026 (2026-05-19T22:06:11+00:00)

I need to count the number of classes in a valid C# source file.
I wrote the following grammar:

grammar CSharpClassGrammar;

options
{
        language=CSharp2;

}

@parser::namespace { CSharpClassGrammar.Generated }
@lexer::namespace  { CSharpClassGrammar.Generated }

@header
{
        using System;
        using System.Collections.Generic;

}

@members
{
        private List<string> _classCollector = new List<string>();
        public List<string> ClassCollector { get { return _classCollector; } }

}

/*------------------------------------------------------------------
 * PARSER RULES
 *------------------------------------------------------------------*/

csfile  : class_declaration* EOF
        ;

class_declaration
        : (ACCESSLEVEL | MODIFIERS)* PARTIAL? 'class' CLASSNAME
          class_body
          ';'?
          { _classCollector.Add($CLASSNAME.text); }
        ;

class_body
        : '{' class_declaration* '}'
        ;

/*------------------------------------------------------------------
 * LEXER RULES
 *------------------------------------------------------------------*/

ACCESSLEVEL
        : 'public' | 'internal' | 'protected' | 'private' | 'protected internal'
        ;

MODIFIERS
        : 'static' | 'sealed' | 'abstract'
        ;

PARTIAL
        : 'partial'
        ;

CLASSNAME
        : ('a'..'z'|'A'..'Z'|'_') ('a'..'z'|'A'..'Z'|'0'..'9'|'_')*
        ;

COMMENT
        : '//' ~('\n'|'\r')* {$channel=HIDDEN;}
        |   '/*' ( options {greedy=false;} : . )* '*/' {$channel=HIDDEN;}
        ;

WHITESPACE
        : ( '\t' | ' ' | '\r' | '\n'| '\u000C' )+ { $channel = HIDDEN; }
        ;

This parser correctly counts classes (including nested classes) as long as every class body is empty:

internal class DeclarationClass1
{
    class DeclarationClass2
    {
        public class DeclarationClass3
        {
            abstract class DeclarationClass4
            {
            }
        }
    }
}

I also need to count classes with a non-empty body, such as:

class TestClass
{
    int a = 42;

    class Nested { }
}

I need to somehow ignore all code that is not a class declaration.
In the example above, that means ignoring

int a = 42;

How can I do this? An example for another language would help too.
Please help!

1 Answer

Editorial Team · Answer 1 · 2026-05-19T22:06:11+00:00

When you’re only interested in certain parts of a source file, you can set filter=true in the options { … } section of the grammar. You then define only the tokens you’re interested in, and the lexer ignores any input that none of your rules match.

Note that this only works in lexer grammars, not in combined or parser grammars.

A little demo:

lexer grammar CSharpClassLexer;

options {
  language=CSharp2;
  filter=true;
}

@namespace { Demo }

Comment
  :  '//' ~('\r' | '\n')*
  |  '/*' .* '*/'
  ;

String
  :  '"' ('\\' . | ~('"' | '\\' | '\r' | '\n'))* '"'
  |  '@' '"' ('"' '"' | ~'"')* '"'
  ;

Class
  :  'class' Space+ Identifier 
     {Console.WriteLine("Found class: " + $Identifier.text);}
  ;

Space
  :  ' ' | '\t' | '\r' | '\n'
  ;

Identifier
  :  ('a'..'z' | 'A'..'Z' | '_') ('a'..'z' | 'A'..'Z' | '_' | '0'..'9')*
  ;
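
As a rough illustration of what filtering does, the same idea can be sketched in plain Python. This is not ANTLR and not how ANTLR implements filter=true. It is a regex stand-in that tries the rules in order at each position and skips one character when none match. The names TOKEN and find_classes are made up for this sketch, and the patterns are hand translations of the lexer rules above.

```python
import re

# Hand-translated stand-ins for the Comment, String, Class and Identifier
# rules. Alternatives are tried in order at each position.
TOKEN = re.compile(r'''
    (?P<comment>//[^\r\n]*|/\*[\s\S]*?\*/)
  | (?P<string>@"(?:""|[^"])*"|"(?:\\.|[^"\\\r\n])*")
  | (?P<cls>class\s+(?P<name>[A-Za-z_][A-Za-z_0-9]*))
  | (?P<identifier>[A-Za-z_][A-Za-z_0-9]*)
''', re.VERBOSE)


def find_classes(source):
    """Return the class names found by the filtering scan, in source order."""
    names = []
    pos = 0
    while pos < len(source):
        match = TOKEN.match(source, pos)
        if match is None:
            # No rule matches here: drop one character and try again.
            pos += 1
            continue
        if match.group("name") is not None:
            names.append(match.group("name"))
        # Comments, strings and identifiers are consumed but not reported.
        pos = match.end()
    return names


sample = 'class A { string s = "class B {}"; /* class C */ class D { } }'
print(find_classes(sample))  # ['A', 'D']
```

Comments and strings are matched only so that they are consumed whole, which stops any "class" text inside them from being examined separately.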

It’s important to keep the Identifier rule in there: without it, the lexer would skip the X in Xclass Foo and then match class Foo as a class, effectively treating the input as ['X', 'class', 'Foo']. With the Identifier rule present, Xclass is matched as a single identifier.
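
The effect of the Identifier rule can be reproduced with a small Python sketch. Again, this is a regex approximation of the filtering behaviour, not ANTLR itself, and WITH_IDENTIFIER, WITHOUT_IDENTIFIER and class_names are names invented for the example.

```python
import re

IDENT = r'[A-Za-z_][A-Za-z_0-9]*'

# Class rule followed by an Identifier rule, as in the grammar above.
WITH_IDENTIFIER = re.compile(rf'class\s+(?P<name>{IDENT})|{IDENT}')
# Class rule only: nothing consumes identifiers that merely contain "class".
WITHOUT_IDENTIFIER = re.compile(rf'class\s+(?P<name>{IDENT})')


def class_names(pattern, source):
    """Collect class names; finditer skips text that no alternative matches."""
    return [m.group("name") for m in pattern.finditer(source) if m.group("name")]


source = "Xclass Foo; class Bar {}"
print(class_names(WITHOUT_IDENTIFIER, source))  # ['Foo', 'Bar']
print(class_names(WITH_IDENTIFIER, source))     # ['Bar']
```

Without the identifier alternative, the scan skips the X and reports Foo as a class. With it, Xclass is consumed as one identifier and only Bar is reported.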

The grammar can be tested with the following class:

using System;
using Antlr.Runtime;

namespace Demo
{
    class MainClass
    {
        public static void Main (string[] args)
        {
            string source = 
@"class TestClass
{
    int a = 42;

    string _class = ""inside a string literal: class FooBar {}..."";

    class Nested { 
        /* class NotAClass {} */

        // class X { }

        class DoubleNested {
            string str = @""
                multi line string 
                class Bar {}
            "";
        }
    }
}";
            Console.WriteLine("source=\n" + source + "\n-------------------------");
            ANTLRStringStream Input = new ANTLRStringStream(source);
            CSharpClassLexer Lexer = new CSharpClassLexer(Input);
            CommonTokenStream Tokens = new CommonTokenStream(Lexer);
            // Fetching the tokens drives the lexer over the whole input, which runs the Class rule's action.
            Tokens.GetTokens();
        }
    }
}

which produces the following output:

source=
class TestClass
{
    int a = 42;

    string _class = "inside a string literal: class FooBar {}...";

    class Nested { 
        /* class NotAClass {} */

        // class X { }

        class DoubleNested {
            string str = @"
                multi line string 
                class Bar {}
            ";
        }
    }
}
-------------------------
Found class: TestClass
Found class: Nested
Found class: DoubleNested

Note that this is just a quick demo. I am not sure the grammar handles every form of C# string literal correctly (I am unfamiliar with C#), but it should give you a starting point.
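
One way to gain some confidence in the two String alternatives is to check them against a few literals in isolation. The Python sketch below uses regex translations of the regular and verbatim alternatives. REGULAR, VERBATIM and is_whole_literal are hypothetical names, and the cases are a small sample, not an exhaustive check.

```python
import re

# Regular literal: backslash escapes allowed, no raw line breaks.
REGULAR = re.compile(r'"(?:\\.|[^"\\\r\n])*"')
# Verbatim literal: "" stands for a quote, backslashes are ordinary characters.
VERBATIM = re.compile(r'@"(?:""|[^"])*"')


def is_whole_literal(pattern, text):
    """True if the pattern consumes the entire text as one literal."""
    return pattern.fullmatch(text) is not None


print(is_whole_literal(REGULAR, '"a \\" b"'))                    # True
print(is_whole_literal(REGULAR, '"a\nb"'))                       # False
print(is_whole_literal(VERBATIM, '@"say ""hi"""'))               # True
print(is_whole_literal(VERBATIM, '@"line1\nclass Bar {}\n"'))    # True
```

These patterns cover only the two forms in the demo. A character literal containing a double quote, such as '"', is not handled and would be read as the start of a string.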

Good luck!
