I need to create a lexer/parser which deals with input data of variable length

Editorial Team

Asked: May 28, 2026 (2026-05-28T03:37:55+00:00)

I need to create a lexer/parser which deals with input data of variable length and structure.

Say I have a list of reserved keywords:

keyWordList = ['command1', 'command2', 'command3']

and a user input string:

userInput = 'The quick brown command1 fox jumped over command2 the lazy dog command3'
userInputList = userInput.split()

How would I go about writing this function:

INPUT:

tokenize(userInputList, keyWordList)

OUTPUT:
[['The', 'quick', 'brown'], 'command1', ['fox', 'jumped', 'over'], 'command2', ['the', 'lazy', 'dog'], 'command3']

I’ve written a tokenizer that can identify keywords, but I have been unable to figure out an efficient way to embed groups of non-keywords into lists that are one level deeper.

RE solutions are welcome, but I would really like to see the underlying algorithm as I am probably going to extend the application to lists of other objects and not just strings.


1 Answer

Editorial Team · Answer 1 · 2026-05-28T03:37:56+00:00

Try this:

keyWordList = ['command1', 'command2', 'command3']
userInput = 'The quick brown command1 fox jumped over command2 the lazy dog command3'
inputList = userInput.split()

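def tokenize(userInputList, keyWordList):
    keywords = set(keyWordList)  # set gives O(1) membership tests
    tokens, acc = [], []         # acc collects the current run of non-keywords
    for e in userInputList:
        if e in keywords:
            if acc:              # flush the pending run, skipping empty ones
                tokens.append(acc)
            tokens.append(e)
            acc = []
        else:
            acc.append(e)
    if acc:                      # flush any run left after the last keyword
        tokens.append(acc)
    return tokens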

tokenize(inputList, keyWordList)
> [['The', 'quick', 'brown'], 'command1', ['fox', 'jumped', 'over'], 'command2', ['the', 'lazy', 'dog'], 'command3']
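
Since you mentioned extending this to lists of other objects, here is a rough sketch of the same idea built on itertools.groupby from the standard library. The names tokenize_by and is_keyword are made up for illustration; is_keyword is assumed to be any callable that returns a truthy value for items that should stay at the top level. Treat it as one possible variant, not the only way to do it.

```python
from itertools import groupby

def tokenize_by(items, is_keyword):
    """Emit keyword items bare and wrap each run of other items in a list.

    tokenize_by and is_keyword are hypothetical names for this sketch.
    """
    tokens = []
    # groupby splits the input into consecutive runs that share the same key;
    # bool() normalizes the predicate result so truthy values compare equal.
    for matched, run in groupby(items, key=lambda item: bool(is_keyword(item))):
        if matched:
            tokens.extend(run)        # each keyword stays a top-level token
        else:
            tokens.append(list(run))  # a run of non-keywords becomes one nested list
    return tokens

keywords = {'command1', 'command2', 'command3'}
words = 'The quick brown command1 fox jumped over command2 the lazy dog command3'.split()
print(tokenize_by(words, lambda w: w in keywords))

# The same function on non-string items: negative numbers act as "keywords".
print(tokenize_by([1, 2, -1, 3, -2, -3, 4], lambda n: n < 0))
```

Because the grouping is driven by a predicate rather than a keyword set, the items do not need to be hashable, and adjacent keywords never produce an empty list between them.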
