The Archive Base

Editorial Team
Asked: May 28, 2026 (2026-05-28T02:03:00+00:00)


The book ‘Modern Compiler Design’ is a nice book about compilers. One thing in its source code that bothers me is the AST, or Abstract Syntax Tree. Suppose we want to write a parser for parenthesized expressions that handles something like ((2+3)*4) * 2. The book says that we get an AST like:

        ((2+3)*4) * 2
          /   |     \
       (2+3)  *4    * 2
        /     | \
     (2+3)    *  4
     / | \
    2  + 3

So should I store the tree in memory, or just use recursive calls? Note: if I don’t store it in memory, how can I convert it to machine code?

Parser code:

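/* C code (`class` is a valid field name in C, not in C++).
   Assumes these are declared elsewhere: a global `token` with fields
   `class` and `val`, get_next_token(), parse_operator(), Error(), and
   an Expression struct whose `left` and `right` members are Expression
   pointers. Needs <stdlib.h> for malloc and free. */
int parse(Expression **expr_p)
{
  Expression *expr = *expr_p = malloc(sizeof *expr);  /* allocate this node */
  if(token.class=='D')
  {
    expr->type='D';
    expr->value=token.val-'0';
    get_next_token();
    return 1;
  }
  if(token.class=='(')
  {
    expr->type='P';
    get_next_token();
    if(!parse(&expr->left))
      Error("missing expression");
    parse_operator(&expr->op);
    if(!parse(&expr->right))
      Error("missing expression");
    if(token.class!=')')
      Error("missing )");
    get_next_token();
    return 1;
  }
  free(expr);       /* not an expression: release the unused node */
  *expr_p = NULL;
  return 0;
}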

Grammar is:

expr  -> digit | (expr op expr)
digit -> 0|1|2....|9
op    -> +|*
1 Answer

  1. Editorial Team
    Added an answer on May 28, 2026 at 2:03 am

    You can either store the tree in memory or produce the required output code directly. The intermediate form is normally stored so that the code can be processed at a higher level before output is generated.
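To make the "store the tree in memory" option concrete, here is a minimal sketch for the grammar in the question (a digit, or a parenthesized `expr op expr`). The names `Node`, `parseExpr`, `emit` and `compileViaTree` are made up for this illustration and are not from the book. The parser only builds the tree; a separate post-order walk then turns it into stack-style x86 code.

```cpp
#include <cassert>
#include <memory>
#include <stdexcept>
#include <string>

// Hypothetical node type: a digit leaf ('D') or a parenthesized
// binary operation ('P'), loosely mirroring the question's Expression.
struct Node {
    char type = 'D';
    int value = 0;                        // used when type == 'D'
    char op = 0;                          // '+' or '*', used when type == 'P'
    std::unique_ptr<Node> left, right;    // used when type == 'P'
};

// expr -> digit | '(' expr op expr ')'
// Builds the tree and advances s past the parsed text.
std::unique_ptr<Node> parseExpr(const char *&s) {
    auto n = std::make_unique<Node>();
    if (*s >= '0' && *s <= '9') {
        n->type = 'D';
        n->value = *s++ - '0';
        return n;
    }
    if (*s != '(') throw std::runtime_error("digit or '(' expected");
    s++;
    n->type = 'P';
    n->left = parseExpr(s);
    if (*s != '+' && *s != '*') throw std::runtime_error("operator expected");
    n->op = *s++;
    n->right = parseExpr(s);
    if (*s != ')') throw std::runtime_error("')' expected");
    s++;
    return n;
}

// Post-order walk over the stored tree: left operand, push, right
// operand, then combine. The result is always left in eax.
void emit(const Node &n, std::string &out) {
    if (n.type == 'D') {
        out += "    mov  eax, " + std::to_string(n.value) + "\n";
        return;
    }
    assert(n.left && n.right);
    emit(*n.left, out);
    out += "    push eax\n";
    emit(*n.right, out);
    out += "    mov  ebx, eax\n";
    out += "    pop  eax\n";
    out += (n.op == '+') ? "    add  eax, ebx\n" : "    imul ebx\n";
}

// Parse the whole string into a tree, then generate code from the tree.
std::string compileViaTree(const char *src) {
    const char *s = src;
    auto tree = parseExpr(s);
    if (*s != '\0') throw std::runtime_error("unexpected trailing input");
    std::string out;
    emit(*tree, out);
    return out;
}
```

Calling `compileViaTree("(((2+3)*4)*2)")` returns the assembly text as a string. Note that the question's example needs the outer parentheses (and no spaces) to fit the grammar. Because the whole tree exists before `emit` runs, any analysis pass can be slotted in between parsing and code generation.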

    In your case, for example, it would be simple to discover that the expression contains no variables and that the result is therefore a fixed number. This is not possible, however, when looking at only one node at a time. To be more explicit: if you generate machine code for doubling something right after seeing “2*”, that code is partly wasted when the other operand turns out to be, say, “3”, because your program will compute “3” and then double it every time it runs, while just loading “6” would be equivalent but shorter and faster.
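The "2*3 becomes 6" idea is usually called constant folding, and it is easy once the tree is in memory. The following is only a sketch under assumptions: the `Expr` type and the helpers `constant`, `variable`, `binary` and `fold` are hypothetical names invented here, and integer overflow during folding is ignored.

```cpp
#include <cassert>
#include <memory>
#include <string>
#include <utility>

// Hypothetical tree node: a constant, a variable, or a binary operation.
struct Expr {
    enum Kind { Const, Var, BinOp } kind = Const;
    int value = 0;                        // Const
    std::string name;                     // Var
    char op = 0;                          // BinOp: '+', '-' or '*'
    std::unique_ptr<Expr> left, right;    // BinOp
};
using ExprPtr = std::unique_ptr<Expr>;

ExprPtr constant(int v) {
    auto e = std::make_unique<Expr>();
    e->kind = Expr::Const;
    e->value = v;
    return e;
}

ExprPtr variable(std::string n) {
    auto e = std::make_unique<Expr>();
    e->kind = Expr::Var;
    e->name = std::move(n);
    return e;
}

ExprPtr binary(char op, ExprPtr l, ExprPtr r) {
    auto e = std::make_unique<Expr>();
    e->kind = Expr::BinOp;
    e->op = op;
    e->left = std::move(l);
    e->right = std::move(r);
    return e;
}

// Bottom-up folding: simplify both children first, then replace an
// operation on two constants by the constant result.
// Overflow is not checked in this sketch.
ExprPtr fold(ExprPtr e) {
    assert(e);
    if (e->kind != Expr::BinOp) return e;
    e->left = fold(std::move(e->left));
    e->right = fold(std::move(e->right));
    if (e->left->kind == Expr::Const && e->right->kind == Expr::Const) {
        int a = e->left->value, b = e->right->value;
        switch (e->op) {
            case '+': return constant(a + b);
            case '-': return constant(a - b);
            case '*': return constant(a * b);
        }
    }
    return e;   // something is not constant: keep the operation
}
```

For instance, `fold(binary('*', constant(2), constant(3)))` yields a single constant node holding 6, while `fold(binary('+', variable("x"), binary('*', constant(2), constant(3))))` keeps the addition but replaces its right operand with the constant 6.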

    If you want to generate machine code, you first need to know what kind of machine the code is for. The simplest model uses a stack-based approach. In this case you need no register allocation logic, and it’s easy to compile directly to machine code without an intermediate representation. Consider this small example that handles just integers, the four arithmetic operations, unary negation and variables. Note that no data structure is used at all: source code characters are read and machine instructions are written to the output.

    #include <stdio.h>
    #include <stdlib.h>
    
    void error(const char *what) {
        fprintf(stderr, "ERROR: %s\n", what);
        exit(1);
    }
    
    void compileLiteral(const char *& s) {
        int v = 0;
        while (*s >= '0' && *s <= '9') {
            v = v*10 + *s++ - '0';
        }
        printf("    mov  eax, %i\n", v);
    }
    
    void compileSymbol(const char *& s) {
        printf("    mov  eax, dword ptr ");
        while ((*s >= 'a' && *s <= 'z') ||
               (*s >= 'A' && *s <= 'Z') ||
               (*s >= '0' && *s <= '9') ||
               (*s == '_')) {
            putchar(*s++);
        }
        printf("\n");
    }
    
    void compileExpression(const char *&);
    
    void compileTerm(const char *& s) {
        if (*s >= '0' && *s <= '9') {
            // Number
            compileLiteral(s);
        } else if ((*s >= 'a' && *s <= 'z') ||
                   (*s >= 'A' && *s <= 'Z') ||
                   (*s == '_')) {
            // Variable
            compileSymbol(s);
        } else if (*s == '-') {
            // Unary negation
            s++;
            compileTerm(s);
            printf("    neg  eax\n");
        } else if (*s == '(') {
            // Parenthesized sub-expression
            s++;
            compileExpression(s);
            if (*s != ')')
                error("')' expected");
            s++;
        } else {
            error("Syntax error");
        }
    }
    
    void compileMulDiv(const char *& s) {
        compileTerm(s);
        for (;;) {
            if (*s == '*') {
                s++;
                printf("    push eax\n");
                compileTerm(s);
                printf("    mov  ebx, eax\n");
                printf("    pop  eax\n");
                printf("    imul ebx\n");
            } else if (*s == '/') {
                s++;
                printf("    push eax\n");
                compileTerm(s);
                printf("    mov  ebx, eax\n");
                printf("    pop  eax\n");
                printf("    cdq\n");  // sign-extend eax into edx:eax, which idiv divides
                printf("    idiv ebx\n");
            } else break;
        }
    }
    
    void compileAddSub(const char *& s) {
        compileMulDiv(s);
        for (;;) {
            if (*s == '+') {
                s++;
                printf("    push eax\n");
                compileMulDiv(s);
                printf("    mov  ebx, eax\n");
                printf("    pop  eax\n");
                printf("    add  eax, ebx\n");
            } else if (*s == '-') {
                s++;
                printf("    push eax\n");
                compileMulDiv(s);
                printf("    mov  ebx, eax\n");
                printf("    pop  eax\n");
                printf("    sub  eax, ebx\n");
            } else break;
        }
    }
    
    void compileExpression(const char *& s) {
        compileAddSub(s);
    }
    
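    int main(int argc, const char *argv[]) {
        if (argc != 2) error("Syntax: simple-compiler <expr>");
        const char *s = argv[1];
        compileExpression(s);
        // Reject leftover input such as an unmatched ')'
        if (*s != '\0') error("Unexpected character after expression");
        return 0;
    }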
    

    For example, running the compiler with 1+y*(-3+x) as input produces this output:

    mov  eax, 1
    push eax
    mov  eax, dword ptr y
    push eax
    mov  eax, 3
    neg  eax
    push eax
    mov  eax, dword ptr x
    mov  ebx, eax
    pop  eax
    add  eax, ebx
    mov  ebx, eax
    pop  eax
    imul ebx
    mov  ebx, eax
    pop  eax
    add  eax, ebx
    

    However, this approach to writing compilers doesn’t scale well to an optimizing compiler.

    While it’s possible to get some optimization by adding a “peephole” optimizer in the output stage, many useful optimizations are possible only when looking at the code from a higher-level point of view.
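As a rough illustration of what a peephole pass in the output stage might look like, the sketch below scans the emitted instructions through a four-line window and rewrites one pattern: `push eax` / `mov  eax, X` / `mov  ebx, eax` / `pop  eax` becomes `mov  ebx, X`. The function names are hypothetical, the instructions are assumed to be stored without leading indentation and with exactly the spacing the compiler above prints, and X is assumed to be an immediate or a `dword ptr` memory operand (so it does not depend on eax).

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

// True for operands this sketch treats as safe to load straight into
// ebx: a decimal immediate or a "dword ptr <name>" memory operand.
bool isSimpleOperand(const std::string &op) {
    if (op.empty()) return false;
    if (op[0] >= '0' && op[0] <= '9') return true;
    return op.compare(0, 10, "dword ptr ") == 0;
}

// Rewrites
//     push eax / mov  eax, X / mov  ebx, eax / pop  eax
// into
//     mov  ebx, X
// Both sequences leave eax unchanged, ebx == X and the stack balanced.
std::vector<std::string> peephole(const std::vector<std::string> &in) {
    const std::string load = "mov  eax, ";
    std::vector<std::string> out;
    for (std::size_t i = 0; i < in.size(); ) {
        if (i + 3 < in.size() &&
            in[i] == "push eax" &&
            in[i + 1].compare(0, load.size(), load) == 0 &&
            isSimpleOperand(in[i + 1].substr(load.size())) &&
            in[i + 2] == "mov  ebx, eax" &&
            in[i + 3] == "pop  eax") {
            out.push_back("mov  ebx, " + in[i + 1].substr(load.size()));
            i += 4;                       // skip the whole matched window
        } else {
            out.push_back(in[i]);
            i++;
        }
    }
    assert(out.size() <= in.size());      // the pass never adds instructions
    return out;
}
```

Applied to the six instructions generated for `1+2`, this leaves `mov  eax, 1`, `mov  ebx, 2`, `add  eax, ebx`. The pass only ever sees a few adjacent lines, which is exactly why it cannot notice that the whole expression is the constant 3.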

    Even bare machine code generation can benefit from seeing more code, for example to decide which register to assign to which value, or which of the possible assembler implementations suits a specific code pattern.

    For example, an optimizing compiler could compile the same expression to

    mov  eax, dword ptr x
    sub  eax, 3
    imul dword ptr y
    inc  eax
    
© 2021 The Archive Base. All Rights Reserved