The Archive Base

Editorial Team
Asked: June 1, 2026 (2026-06-01T08:51:29+00:00)

I am working on a simple SQL select like query parser and I need


I am working on a simple parser for SQL SELECT-like queries, and I need to capture subqueries, which can occur at certain places, literally. I found that lexer states are the best solution and was able to do a POC using curly braces to mark the start and end. However, the subqueries will be delimited by parentheses, not curly braces, and parentheses can occur in other places as well, so I can't begin the state on every open paren. This information is readily available in the parser, so I was hoping to call begin and end at the appropriate locations in the parser rules. This didn't work, however, because the lexer seems to tokenize the stream all at once, so the tokens get generated in the INITIAL state. Is there a workaround for this problem? Here is an outline of what I tried to do:

def p_value_subquery(p):
    """
     value : start_sub end_sub
    """
    p[0] = "( " + p[2] + " )"    # p[2] is the subquery text returned by end_sub

def p_start_sub(p):
    """
    start_sub : OPAR
    """
    start_subquery(p.lexer)
    p[0] = p[1]

def p_end_sub(p):
    """
    end_sub : CPAR
    """
    subquery = end_subquery(p.lexer)
    p[0] = subquery

The start_subquery() and end_subquery() are defined like this:

def start_subquery(lexer):
    lexer.code_start = lexer.lexpos        # Record the starting position
    lexer.level = 1
    lexer.begin('subquery') 

def end_subquery(lexer):
    value = lexer.lexdata[lexer.code_start:lexer.lexpos-1]
    lexer.lineno += value.count('\n')
    lexer.begin('INITIAL')
    return value

The lexer tokens are simply there to detect the close-paren:

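@lex.TOKEN(r"\(")
def t_subquery_SUBQST(t):
    t.lexer.level += 1    # nested open paren; use the lexer attached to the token

@lex.TOKEN(r"\)")
def t_subquery_SUBQEN(t):
    t.lexer.level -= 1    # matching close paren

@lex.TOKEN(r".")
def t_subquery_anychar(t):
    pass                  # discard everything else inside the subquery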

I would appreciate any help.
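
A likely explanation for the behavior described above is lookahead rather than up-front tokenization: an LR parser usually fetches the next token before it runs the action for the current one, so a state change made inside that action comes one token too late. The sketch below illustrates only the timing. `TinyLexer` and `run` are hypothetical stand-ins written for this example and are not part of PLY.

```python
import re

class TinyLexer:
    """Hypothetical stand-in for a stateful lexer; not PLY's actual class."""
    _pattern = re.compile(r"\w+|.")

    def __init__(self, text):
        self.text, self.pos, self.state = text, 0, "INITIAL"

    def token(self):
        # Skip whitespace, then return (state at lexing time, lexeme).
        while self.pos < len(self.text) and self.text[self.pos].isspace():
            self.pos += 1
        if self.pos >= len(self.text):
            return None
        m = self._pattern.match(self.text, self.pos)
        self.pos = m.end()
        return (self.state, m.group())

def run(text, switch_on):
    """Consume tokens with one token of lookahead, as an LR parser would.

    switch_on="current":   switch state while acting on "(" as the current
                           token; the lookahead has already been lexed.
    switch_on="lookahead": switch state as soon as "(" appears as lookahead.
    """
    lexer = TinyLexer(text)
    seen = []
    lookahead = lexer.token()
    while lookahead is not None:
        if switch_on == "lookahead" and lookahead[1] == "(":
            lexer.state = "subquery"
        current, lookahead = lookahead, lexer.token()
        seen.append(current)
        if switch_on == "current" and current[1] == "(":
            lexer.state = "subquery"
    return seen

print(run("a ( b c", "current"))
print(run("a ( b c", "lookahead"))
```

With `"current"`, the token `b` is still produced in the INITIAL state; with `"lookahead"`, it is already produced in the subquery state.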

1 Answer

  1. Editorial Team
    Added an answer on June 1, 2026 at 8:51 am (2026-06-01T08:51:30+00:00)

    Based on the PLY author's response, I came up with this better solution. I have yet to figure out how to return the subquery as a token, but the rest looks much cleaner and no longer needs to be considered a hack.

    import sys

    import ply.lex as lex
    import ply.yacc as yacc

    def start_subquery(lexer):
        lexer.code_start = lexer.lexpos        # Record the starting position
        lexer.level = 1
        lexer.begin("subquery")
    
    def end_subquery(lexer):
        lexer.begin("INITIAL")
    
    def get_subquery(lexer):
        value = lexer.lexdata[lexer.code_start:lexer.code_end-1]
        lexer.lineno += value.count('\n')
        return value
    
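    @lex.TOKEN(r"\(")
    def t_subquery_OPAR(t):
        t.lexer.level += 1    # nested open paren inside the subquery
    
    @lex.TOKEN(r"\)")
    def t_subquery_CPAR(t):
        t.lexer.level -= 1
        if t.lexer.level == 0:
            t.lexer.code_end = t.lexer.lexpos    # Record the ending position
            return t    # only the matching close paren reaches the parser
    
    @lex.TOKEN(r".")
    def t_subquery_anychar(t):
        pass    # discard everything else inside the subquery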
    
    def p_value_subquery(p):
        """
        value : check_subquery_start OPAR check_subquery_end CPAR
        """
        p[0] = "( " + get_subquery(p.lexer) + " )"
    
    def p_check_subquery_start(p):
        """
        check_subquery_start : 
        """
        # Here last_token would be yacc's lookahead.
        if last_token.type == "OPAR":
            start_subquery(p.lexer)
    
    def p_check_subquery_end(p):
        """
        check_subquery_end : 
        """
        # Here last_token would be yacc's lookahead.
        if last_token.type == "CPAR":
            end_subquery(p.lexer)
    
    last_token = None
    
    def p_error(p):
        if p is None:
            print("ERROR: unexpected end of query", file=sys.stderr)
        else:
            # find_column() is a helper defined elsewhere (not shown here).
            print("ERROR: Skipping unrecognized token", p.type,
                  "(" + str(p.value) + ") at line:", p.lineno,
                  "and column:", find_column(p.lexer.lexdata, p),
                  file=sys.stderr)
            # Just discard the token and tell the parser it's okay.
            # Newer PLY releases expose errok() on the parser object.
            parser.errok()
    
    def get_token():
        global last_token
        last_token = lexer.token()
        return last_token
    
    def parse_query(input, debug=0):
        lexer.input(input)
        # Pass the caller's debug flag through instead of hard-coding 0.
        return parser.parse(input, tokenfunc=get_token, debug=debug)
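
On the open point of returning the subquery as a token: one possible direction is to have the rule that sees the matching close paren emit a single token whose value is the captured text, and to switch back to the initial state in that same rule. The sketch below shows the idea without PLY, so that it runs on its own. `MiniLexer`, `tokenize`, the `SUBQUERY` token type, and the `is_subquery_start` predicate are all assumptions made for illustration; the predicate stands in for the grammar context that the parser has.

```python
from collections import namedtuple

Token = namedtuple("Token", "type value")

class MiniLexer:
    """Hypothetical two-state lexer mimicking begin()/token(); not PLY itself."""

    def __init__(self, data):
        self.lexdata, self.lexpos, self.state = data, 0, "INITIAL"

    def begin(self, state):
        self.state = state
        if state == "subquery":
            # Record where the subquery text starts (just after the "(").
            self.code_start, self.level = self.lexpos, 1

    def token(self):
        if self.state == "subquery":
            return self._subquery_token()
        data = self.lexdata
        while self.lexpos < len(data) and data[self.lexpos].isspace():
            self.lexpos += 1
        if self.lexpos >= len(data):
            return None
        ch = data[self.lexpos]
        if ch in "()":
            self.lexpos += 1
            return Token("OPAR" if ch == "(" else "CPAR", ch)
        start = self.lexpos
        while self.lexpos < len(data):
            ch = data[self.lexpos]
            if ch.isspace() or ch in "()":
                break
            self.lexpos += 1
        return Token("WORD", data[start:self.lexpos])

    def _subquery_token(self):
        # Track nesting and emit one SUBQUERY token at the matching ")".
        data = self.lexdata
        while self.lexpos < len(data):
            ch = data[self.lexpos]
            self.lexpos += 1
            if ch == "(":
                self.level += 1
            elif ch == ")":
                self.level -= 1
                if self.level == 0:
                    self.begin("INITIAL")
                    text = data[self.code_start:self.lexpos - 1]
                    return Token("SUBQUERY", text)
        raise ValueError("unbalanced parentheses in subquery")

def tokenize(data, is_subquery_start):
    """Stand-in for the parser: enter 'subquery' when OPAR is the lookahead."""
    lexer, out, prev = MiniLexer(data), [], None
    while True:
        tok = lexer.token()
        if tok is None:
            return out
        if tok.type == "OPAR" and is_subquery_start(prev):
            lexer.begin("subquery")
        out.append(tok)
        prev = tok

tokens = tokenize("a in (select x from (t) where y) and (b)",
                  lambda prev: prev is not None and prev.value == "in")
for tok in tokens:
    print(tok.type, repr(tok.value))
```

In PLY terms, this would roughly correspond to setting `t.type` and `t.value` in `t_subquery_CPAR` before returning `t`, with the new token type added to `tokens` and the grammar rule adjusted to expect it. That mapping is untested here and should be treated as a starting point.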
    