I have written simple lexical analyzer. And I understand the need to provide each

Question


Asked: May 19, 2026 (2026-05-19T05:08:09+00:00)


I have written a simple lexical analyzer, and I understand the need to provide each recognized token with an attribute. Here is what I have so far:

public sealed class Token
{ 
    public enum TokenClass
    { 
        Identifier,
        StringLiteral,
        NumberLiteral,
        Operator,
        PunctuationSeparator,
        Bracket,
        Parenthesis
    }        
    public TokenClass Class { get; internal set; }
    public String     Value { get; internal set; }
}

In the lexer I enqueue tokens, setting their value and class. But what about attributes? How should I design this feature around my existing token class?

The first thought that came to my mind was:

1. Declare private abstract classes for the “ambiguous entities” inside the token class (by which I mean that a Number could be an Integer, a Real, and so on).
2. Then declare derived classes, e.g. public class Comma : PunctuationSeparator {}
3. Add a property: Object Attribute { get; private set; }
4. Create a method such as private void ApplyAttribute().
5. Call ApplyAttribute() once the token is instantiated and its properties are set.

Use something like this inside ApplyAttribute():

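switch (this.Class)
{
    case TokenClass.NumberLiteral:
        // Int32.TryParse requires an out parameter; the parsed value is discarded here.
        // The cast gives both branches of the conditional a common type.
        this.Attribute = Int32.TryParse(this.Value, out _) ? (Object)new Integer() : new Real();
        break;
}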

In the parser it would then be easy to write something like if (CurToken.Attribute is Integer).
The one thing that stops me from doing it this way is the number of classes I would have to create. Is this solution acceptable?


1 Answer

Editorial Team · Answer 1 · 2026-05-19T05:08:09+00:00

The attributes I’d use for a token? Probably something along the lines of

public class Token
{
  public TokenType Type { get ; private set ; }
  public string    Text { get ; private set ; }
  public int       LineNumber { get ; private set ; }
  public int       Column     { get ; private set ; }
}

public enum TokenType
{
  Keyword = 1 ,
  Integer ,
  String  ,
  Whitespace ,
  Comment ,
  ... 
}
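Note that with private setters the properties can only be assigned from inside Token, so the class needs some way to be built. A minimal sketch, assuming a four-argument constructor and a ToString override for diagnostics (both are illustrative additions, not part of the snippet above), and listing only the token kinds named so far:

```csharp
using System;

// Only the kinds named in the answer; the rest of the enum is left elided there.
public enum TokenType
{
    Keyword = 1,
    Integer,
    String,
    Whitespace,
    Comment
}

public sealed class Token
{
    public TokenType Type       { get; private set; }
    public string    Text       { get; private set; }
    public int       LineNumber { get; private set; }
    public int       Column     { get; private set; }

    // Assumed constructor: the lexer supplies everything once, at creation time,
    // so a token never changes after it has been enqueued.
    public Token(TokenType type, string text, int lineNumber, int column)
    {
        if (text == null) throw new ArgumentNullException("text");
        Type = type;
        Text = text;
        LineNumber = lineNumber;
        Column = column;
    }

    // Formats the token for error messages, e.g. "Integer '42' at 3:7".
    public override string ToString()
    {
        return string.Format("{0} '{1}' at {2}:{3}", Type, Text, LineNumber, Column);
    }
}
```

The lexer would then write something like new Token(TokenType.Integer, "42", line, column), and the line and column travel with the token so that later stages can report errors at the right place.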

I disagree, though, with the previous poster regarding conversion of the token’s text into a ‘value’. IMHO, that is the domain of the parser and the nodes of the parse tree. Until the parser has placed the tokens in context with respect to the grammar, a token is just a piece of text with a label attached to it. The lexical analyzer doesn’t know (and shouldn’t care) what’s happening downstream: for all it knows, the tool is pretty-printing the source text, in which case you want to leave the individual tokens alone.

You might want to take a look at Terence Parr’s book(s):
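To make that split concrete, here is a rough sketch of the conversion living on the parser side. The names IntegerNode and LiteralParser.ParseInteger are hypothetical, and the Token constructor is an assumption; the only point is that the token keeps its original text while the parse-tree node carries the converted value.

```csharp
using System;
using System.Globalization;

// Only the kinds named in the answer; the rest of the enum is left elided there.
public enum TokenType { Keyword = 1, Integer, String, Whitespace, Comment }

public sealed class Token
{
    public TokenType Type       { get; private set; }
    public string    Text       { get; private set; }
    public int       LineNumber { get; private set; }
    public int       Column     { get; private set; }

    // Assumed constructor (not shown in the answer).
    public Token(TokenType type, string text, int lineNumber, int column)
    {
        Type = type; Text = text; LineNumber = lineNumber; Column = column;
    }
}

// Hypothetical parse-tree node: it owns the converted value and keeps a
// reference to the untouched source token.
public sealed class IntegerNode
{
    public Token Source { get; private set; }
    public long  Value  { get; private set; }

    public IntegerNode(Token source, long value)
    {
        Source = source;
        Value = value;
    }
}

public static class LiteralParser
{
    // Converts an Integer token into a node. The token itself is not modified.
    public static IntegerNode ParseInteger(Token token)
    {
        string where = token.LineNumber + ":" + token.Column;

        if (token.Type != TokenType.Integer)
            throw new FormatException("Expected an integer at " + where);

        long value;
        if (!long.TryParse(token.Text, NumberStyles.Integer,
                           CultureInfo.InvariantCulture, out value))
            throw new FormatException("Invalid integer '" + token.Text + "' at " + where);

        return new IntegerNode(token, value);
    }
}
```

A pretty-printer can walk the same token stream and emit Text verbatim, while a compiler front end calls ParseInteger only once the grammar says a number is expected at that point.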
