I am working on splitting a source file into tokens, particularly scanning for identifiers.

Question

Asked: May 29, 2026 (2026-05-29T08:23:58+00:00)

I am working on splitting a source file into tokens, particularly scanning for identifiers. However, there is a requirement that identifiers be AT MOST 30 characters long. When an identifier reaches this length, I raise an exception with the message: 'Identifiers can only be 30 characters long, truncating..'.

This is how it should be, but when I raise this exception I jump out of my method that scans for identifiers before I am able to store the identifier. I need to somehow raise the exception AND keep the identifier that I have collected so far. Any ideas as to how this could be done?

# classify each character, and call appropriate scan methods
def tokenize()
  @infile.each_char do |c|
    begin
      case c
      when /[a-zA-Z\$]/
        scan_identifier(c)
      when /\s/ 
        #ignore spaces
      else
        #do nothing
      end
    rescue TokenizerError => te
      puts "#{te.class}: #{te.message}"
    end
  end
end

# Reads an identifier from the source program
def scan_identifier(id)
  this_id = id #initialize this identifier with the character read above

  @infile.each_char do |c|
    if c =~ /[a-zA-Z0-9_]/
      this_id += c 
      # raising this exception leaves this function before collecting the 
      # truncated identifier
      raise TokenizerError, 'Identifiers can only be 30 characters long, truncating..' if this_id.length == 30
    else 
      puts "#{this_id}"
      break # not part of the identifier, or an error
    end
  end
end

1 Answer

Editorial Team · Answer 1 · 2026-05-29T08:23:59+00:00

This is an abuse of exceptions, IMO, because this is not an exceptional case. Instead, consider simply logging something:

    if c =~ /[a-zA-Z0-9_]/
      if this_id.length < 30
        this_id += c
      else
        warn "Identifier was too long and was truncated"
      end
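
For illustration, here is a minimal, self-contained sketch of the logging approach. It rests on a few assumptions that are not in the question: the input is a StringIO passed in as a parameter (rather than @infile), the limit lives in a hypothetical MAX_ID_LENGTH constant, and characters past the limit are consumed and dropped so they are not rescanned as a new identifier.

```ruby
require 'stringio'

MAX_ID_LENGTH = 30 # hypothetical constant; the question hard-codes 30

# Reads the rest of an identifier from io, starting with first_char.
# Characters past MAX_ID_LENGTH are consumed but dropped, with a single warning.
def scan_identifier(io, first_char)
  this_id = first_char.dup
  warned = false

  io.each_char do |c|
    break unless c =~ /[a-zA-Z0-9_]/ # not part of the identifier

    if this_id.length < MAX_ID_LENGTH
      this_id << c
    elsif !warned
      warn "Identifier was too long and was truncated to #{MAX_ID_LENGTH} characters"
      warned = true
    end
  end

  this_id # returned normally, so the caller always gets the identifier
end

io = StringIO.new('a' * 40 + ' rest')
first = io.getc
puts scan_identifier(io, first).length # prints 30
```

Because nothing is raised, the method can simply return the (possibly truncated) identifier, and the warning goes to standard error.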

If you must use the exception for some reason, then the most straightforward way is just to put this_id in an instance variable instead:

    @this_identifier = id
    # ...

Then, in the rescue, just make @this_identifier the last expression so that its value is returned (yuck).
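
A rough sketch of the instance-variable approach follows. The Tokenizer class, its tokens list, and the StringIO input are assumptions made so the example runs on its own; the question only shows the two methods. The sketch also deviates from the question in one respect: it keeps consuming the over-long tail and raises after the loop, so the leftover characters are not scanned as a second identifier.

```ruby
require 'stringio'

class TokenizerError < StandardError; end

# Hypothetical wrapper class around the two methods from the question.
class Tokenizer
  def initialize(infile)
    @infile = infile
    @tokens = []
  end

  # Classifies each character and collects identifiers into @tokens.
  def tokenize
    @infile.each_char do |c|
      next unless c =~ /[a-zA-Z\$]/

      begin
        scan_identifier(c)
      rescue TokenizerError => te
        warn "#{te.class}: #{te.message}"
      end
      # @this_identifier is set whether or not the scan raised.
      @tokens << @this_identifier
    end
    @tokens
  end

  # Reads an identifier, keeping it in @this_identifier so it survives a raise.
  def scan_identifier(id)
    @this_identifier = id.dup
    too_long = false

    @infile.each_char do |c|
      break unless c =~ /[a-zA-Z0-9_]/

      if @this_identifier.length < 30
        @this_identifier << c
      else
        too_long = true # keep consuming so the tail is not rescanned
      end
    end

    raise TokenizerError, 'Identifiers can only be 30 characters long, truncating..' if too_long
  end
end

tokenizer = Tokenizer.new(StringIO.new("short #{'x' * 40} tail"))
p tokenizer.tokenize.map(&:length) # => [5, 30, 4]
```

The exception still reports the problem, but the truncated identifier lives in the object's state rather than in a return value, which is why the rescue can still store it.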

Bonus comment: this is a truly wretched way to parse source files. You should be using something like RubyParser if you’re parsing Ruby, or Treetop if you’re parsing something else.
