The Archive Base

Editorial Team
Asked: May 23, 2026 (2026-05-23T04:06:38+00:00)

I am writing a compiler for a small imperative language. The target language is Java bytecode, and the compiler is implemented in Haskell.

I’ve written a frontend for the language, i.e., I have a lexer, parser and typechecker. I’m having trouble figuring out how to do code generation.

I keep a data structure representing the stack of local variables. I can query this structure with the name of a local variable and get its position in the stack. This data structure is passed around as I walk the syntax tree, and variables are popped and pushed as I enter and exit new scopes.
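
As an illustration only (the question does not show its actual data structure), a scope stack of this kind might be sketched as below. The names `Locals`, `enterScope`, `exitScope`, `declare` and `resolve` are hypothetical, and the sketch assumes every variable occupies exactly one slot.

```haskell
-- Hypothetical representation: a stack of scopes, innermost first,
-- plus the next free local-variable slot.
data Locals = Locals
    { scopes   :: [[(String, Int)]]
    , nextSlot :: Int
    } deriving (Show, Eq)

emptyLocals :: Locals
emptyLocals = Locals [[]] 0

-- Entering a block pushes an empty scope.
enterScope :: Locals -> Locals
enterScope ls = ls { scopes = [] : scopes ls }

-- Leaving a block pops the innermost scope and frees its slots for reuse.
exitScope :: Locals -> Locals
exitScope ls = case scopes ls of
    (top : rest) -> ls { scopes = rest, nextSlot = nextSlot ls - length top }
    []           -> ls

-- Declare a variable in the innermost scope; returns its slot and the new stack.
declare :: String -> Locals -> (Int, Locals)
declare name ls = (n, ls { scopes = bind (scopes ls), nextSlot = n + 1 })
  where
    n = nextSlot ls
    bind (top : rest) = ((name, n) : top) : rest
    bind []           = [[(name, n)]]

-- Search from the innermost scope outwards, so inner declarations shadow outer ones.
resolve :: String -> Locals -> Maybe Int
resolve name = lookup name . concat . scopes

-- Declare x, enter a block, declare y, then leave the block again.
demo :: (Int, Int, Maybe Int, Maybe Int)
demo = (x, y, resolve "x" inner, resolve "y" outer)
  where
    (x, l1)    = declare "x" emptyLocals
    (y, inner) = declare "y" (enterScope l1)
    outer      = exitScope inner
```

Here `demo` evaluates to `(0, 1, Just 0, Nothing)`: `x` is visible inside the block, while `y` is gone once the block has been exited.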

What I am having trouble figuring out is how to emit the bytecode. Emitting strings at terminals and concatenating them at higher levels seems like a poor solution, both clarity- and performance-wise.

tl;dr: How do I emit bytecode while walking the syntax tree?

1 Answer

  1. Editorial Team
    Added an answer on May 23, 2026 at 4:06 am (2026-05-23T04:06:39+00:00)

    My first project in Haskell, a few months back, was a C compiler, and the result was a fairly naive approach to code generation, which I’ll walk through here. Please don’t take this as an example of good design for a code generator; view it as a quick-and-dirty way to get something working fairly quickly with decent performance.

    I began by defining an intermediate representation LIR (Lower Intermediate Representation) which closely corresponded to my instruction set (x86_64 in my case):

    data LIRInst = LIRRegAssignInst LIRReg LIRExpr
                 | LIRRegOffAssignInst LIRReg LIRReg LIRSize LIROperand
                 | LIRStoreInst LIRMemAddr LIROperand
                 | LIRLoadInst LIRReg LIRMemAddr
                 | LIREnterInst LIRInt
                 | LIRJumpLabelInst LIRLabel
                 | LIRIfInst LIRRelExpr LIRLabel LIRLabel -- false, then true
                 | LIRCallInst LIRLabel LIRLabel -- method label, return label
                 | LIRCalloutInst String
                 | LIRRetInst [LIRLabel] String -- list of successors, and the name of the method returning from
                 | LIRLabelInst LIRLabel
                 deriving (Show, Eq, Typeable)
    

    Next up came a monad that would handle threading state through the translation (I was blissfully unaware of our friend, the State monad, at the time):

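    newtype LIRTranslator a = LIRTranslator
        { runLIR :: Namespace -> (a, Namespace) }
    
    -- GHC 7.10+ requires Functor and Applicative instances alongside Monad.
    instance Functor LIRTranslator where
        fmap f m = LIRTranslator (\s -> let (a, s') = runLIR m s in (f a, s'))
    
    instance Applicative LIRTranslator where
        pure a = LIRTranslator (\s -> (a, s))
        mf <*> ma = mf >>= \f -> fmap f ma
    
    instance Monad LIRTranslator where
        m >>= f = LIRTranslator (\s ->
            let (a, s') = runLIR m s
            in runLIR (f a) s')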
    

    along with the state that would be ‘threaded’ through the various translation phases:

    data Namespace = Namespace
        { temp         :: Int                       -- id's for new temporaries
        , labels       :: Int                       -- id's for new labels
        , scope        :: [(LIRLabel, LIRLabel)]    -- current program scope
        , encMethod    :: String                    -- current enclosing method
        , blockindex   :: [Int]                     -- index into the SymbolTree
        , successorMap :: Map.Map String [LIRLabel]
        , ivarStack    :: [(LIRReg, [CFGInst])]     -- stack of ivars (see motioned code)
        }
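
    For comparison, here is a minimal sketch of the same state-threading pattern written with the standard State monad instead of a hand-written one. It assumes the `transformers` package (`Control.Monad.Trans.State.Strict`) and a cut-down, two-field `Namespace`; `Translator`, `freshLabel`, `freshTemp` and `example` are hypothetical names, not part of the original compiler.

```haskell
import Control.Monad.Trans.State.Strict (State, get, put, runState)

-- Hypothetical, cut-down stand-in for the Namespace record above.
data Namespace = Namespace
    { temp   :: Int  -- next id for a temporary
    , labels :: Int  -- next id for a label
    } deriving (Show, Eq)

-- State Namespace a is equivalent to Namespace -> (a, Namespace),
-- which is the shape of the hand-written translator.
type Translator = State Namespace

-- Return the current label id and bump the counter.
freshLabel :: Translator Int
freshLabel = do
    ns <- get
    put ns { labels = labels ns + 1 }
    return (labels ns)

-- Return the current temporary id and bump the counter.
freshTemp :: Translator Int
freshTemp = do
    ns <- get
    put ns { temp = temp ns + 1 }
    return (temp ns)

-- Draw two labels and one temporary; the state is threaded implicitly.
example :: ((Int, Int, Int), Namespace)
example = runState go (Namespace 0 0)
  where
    go = do
        l1 <- freshLabel
        l2 <- freshLabel
        t  <- freshTemp
        return (l1, l2, t)
```

    Here `example` evaluates to `((0, 1, 0), Namespace {temp = 1, labels = 2})`. The Functor, Applicative and Monad instances come from the library, so only the state record and the helpers need to be written.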
    

    For convenience, I also specified a series of translator monadic functions, for example:

    -- |Increment our translator's label counter
    incLabel :: LIRTranslator Int
    incLabel = LIRTranslator (\ns@(Namespace{ labels = l }) -> (l, ns{ labels = (l+1) }))
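
    To show how such a counter is typically consumed, the sketch below translates a `while` loop into a flat instruction list using two fresh labels. Everything in it is a simplified, hypothetical stand-in: `Translator` carries only an `Int`, and `Inst`, `translateWhile` and `twoLoops` are invented for illustration. It is not the original compiler's code.

```haskell
-- Hypothetical, simplified translator whose only state is a label counter.
newtype Translator a = Translator { runT :: Int -> (a, Int) }

instance Functor Translator where
    fmap f m = Translator (\s -> let (a, s') = runT m s in (f a, s'))

instance Applicative Translator where
    pure a = Translator (\s -> (a, s))
    mf <*> ma = Translator (\s ->
        let (f, s')  = runT mf s
            (a, s'') = runT ma s'
        in (f a, s''))

instance Monad Translator where
    m >>= f = Translator (\s -> let (a, s') = runT m s in runT (f a) s')

-- Same idea as incLabel above: return the current counter and bump it.
incLabel :: Translator Int
incLabel = Translator (\l -> (l, l + 1))

data Inst
    = Label Int
    | Jump Int
    | JumpIfFalse String Int  -- condition (kept as text here), target label
    | Other String            -- placeholder for a non-control instruction
    deriving (Show, Eq)

-- Translate `while (cond) body` into a flat list with two fresh labels.
translateWhile :: String -> [Inst] -> Translator [Inst]
translateWhile cond body = do
    top  <- incLabel
    exit <- incLabel
    return ([Label top, JumpIfFalse cond exit] ++ body ++ [Jump top, Label exit])

-- Two loops in sequence get distinct labels because the counter is threaded.
twoLoops :: [Inst]
twoLoops = fst (runT go 0)
  where
    go = do
        a <- translateWhile "i < n" [Other "i = i + 1"]
        b <- translateWhile "j < m" [Other "j = j + 1"]
        return (a ++ b)
```

    The first loop receives labels 0 and 1 and the second receives 2 and 3, without either call having to know about the other.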
    

    I then proceeded to recursively pattern-match my AST, fragment-by-fragment, resulting in many functions of the form:

    translateBlock :: SymbolTree -> ASTBlock -> LIRTranslator [LIRInst]
    translateBlock st (DecafBlock _ [] _) = withBlock (return [])
    translateBlock st block =
        withBlock (do b <- getBlock
                      let st' = select b st
                      declarations <- mapM (translateVarDeclaration st') (blockVars block)
                      statements <- mapM (translateStm st') (blockStms block)
                      return (concat declarations ++ concat statements))
    

    (for translating a block of the source language’s code) or

    -- | Given a SymbolTree, Translate a single DecafMethodStm into [LIRInst]
    translateStm st (DecafMethodStm mc _) =
        do (instructions, _operand) <- translateMethodCall st mc  -- the result operand is unused for a statement
           motionCode instructions
    

    (for translating a method call) or

    translateMethodPrologue :: SymbolTree -> DecafMethod -> LIRTranslator [LIRInst]
    translateMethodPrologue st (DecafMethod _ ident args _ _) =
        do let numRegVars = min (length args) 6
               regvars = map genRegVar (zip [LRDI, LRSI, LRDX, LRCX, LR8, LR9] args)
           stackvars <- mapM genStackVar (zip [1..] (drop numRegVars args))
           return (regvars ++ stackvars)
      where
        genRegVar (reg, arg) =
            LIRRegAssignInst (symVar arg st) (LIROperExpr $ LIRRegOperand reg)
        genStackVar (index, arg) =
            do let mem = LIRMemAddr LRBP Nothing ((index + 1) * 8) qword -- [rbp] = old rbp; [rbp + 8] = ret address; [rbp + 16] = first stack param
               return $ LIRLoadInst (symVar arg st) mem
    

    for an example of actually generating some LIR code. Hopefully these three examples will give you a good starting point; ultimately, you’ll want to go slowly, focusing on one fragment (or intermediate type) within your AST at a time.
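
    To connect this back to the question's worry about concatenating output at every level, one option (not what the compiler above does) is to have each translation function return a difference list, that is, a function that prepends its instructions to whatever follows, and to build the real list only once at the top. The sketch below is a minimal illustration under stated assumptions: the mini-language, the `Inst` constructors (loosely modelled on the JVM's integer stack instructions such as `iload`, `istore`, `iadd` and `imul`), and the names `Code`, `emit`, `slot`, `genExpr`, `genStmt`, `compile` and `sample` are all hypothetical.

```haskell
-- Hypothetical mini-language.
data Expr
    = Lit Int
    | Var String
    | Add Expr Expr
    | Mul Expr Expr

data Stmt = Assign String Expr

-- Hypothetical JVM-style stack instructions.
data Inst
    = IConst Int   -- push a constant
    | ILoad Int    -- push the value in local slot n
    | IStore Int   -- pop into local slot n
    | IAdd
    | IMul
    deriving (Show, Eq)

-- Maps each local variable name to its slot.
type Env = [(String, Int)]

-- A difference list: prepends its instructions to the rest of the program.
type Code = [Inst] -> [Inst]

emit :: Inst -> Code
emit = (:)

-- Assumes the typechecker has already rejected unbound variables.
slot :: Env -> String -> Int
slot env name = maybe (error ("unbound variable: " ++ name)) id (lookup name env)

genExpr :: Env -> Expr -> Code
genExpr _   (Lit n)   = emit (IConst n)
genExpr env (Var x)   = emit (ILoad (slot env x))
genExpr env (Add a b) = genExpr env a . genExpr env b . emit IAdd
genExpr env (Mul a b) = genExpr env a . genExpr env b . emit IMul

genStmt :: Env -> Stmt -> Code
genStmt env (Assign x e) = genExpr env e . emit (IStore (slot env x))

-- Materialise the instruction list once, at the top level.
compile :: Env -> [Stmt] -> [Inst]
compile env stmts = foldr (\s rest -> genStmt env s . rest) id stmts []

-- y = x + 2 * x, with x in slot 0 and y in slot 1.
sample :: [Inst]
sample = compile [("x", 0), ("y", 1)]
    [Assign "y" (Add (Var "x") (Mul (Lit 2) (Var "x")))]
```

    Here `sample` evaluates to `[ILoad 0, IConst 2, ILoad 0, IMul, IAdd, IStore 1]`. Composing functions with `.` avoids the repeated left-nested `++` that makes naive concatenation slow, and the instructions stay as a data type until a final pass serialises them.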

© 2021 The Archive Base. All Rights Reserved