Sign Up

Sign Up to our social questions and Answers Engine to ask questions, answer people’s questions, and connect with other people.

Have an account? Sign In

Have an account? Sign In Now

Sign In

Login to our social questions & Answers Engine to ask questions answer people’s questions & connect with other people.

Sign Up Here

Forgot Password?

Don't have account, Sign Up Here

Forgot Password

Lost your password? Please enter your email address. You will receive a link and will create a new password via email.

Have an account? Sign In Now

You must login to ask a question.

Forgot Password?

Need An Account, Sign Up Here

Please briefly explain why you feel this question should be reported.

Please briefly explain why you feel this answer should be reported.

Please briefly explain why you feel this user should be reported.

Sign InSign Up

The Archive Base

The Archive Base Logo The Archive Base Logo

The Archive Base Navigation

  • SEARCH
  • Home
  • About Us
  • Blog
  • Contact Us
Search
Ask A Question

Mobile menu

Close
Ask a Question
  • Home
  • Add group
  • Groups page
  • Feed
  • User Profile
  • Communities
  • Questions
    • New Questions
    • Trending Questions
    • Must read Questions
    • Hot Questions
  • Polls
  • Tags
  • Badges
  • Buy Points
  • Users
  • Help
  • Buy Theme
  • SEARCH
Home/ Questions/Q 7557461
In Process

The Archive Base Latest Questions

Editorial Team
  • 0
Editorial Team
Asked: May 30, 20262026-05-30T12:08:17+00:00 2026-05-30T12:08:17+00:00

I’ve been doing research on compilers. The lexer seems to be very straight forward:

  • 0

I’ve been doing research on compilers. The lexer seems to be very straight forward: Take a “sentence” and break it up into words (or tokens). To ensure correct grammar a parser is needed. The parser generally takes the tokens and builds a tree that results in a root node (words into sentences, paragraphs, pages, etc…).

From this question it would seem a parser would build an AST. The AST only contains what is necessary to execute the code, so things like parentheses would be unnecessary since operator precedence is built into an AST. An AST is probably all a compiler needs.

But what about converting code from one language to another? Taking a made-up language (grammar) or an existing grammar and converting it into another where operator precedence rules may or may not be different? Is operator precedence “built in” to CST as well?

As an example lets say I made up a language and wanted to translate it into PHP code. The ternary operator on most languages has right-to-left associativity. PHP incorrectly uses left-to-right associativity (see more about this here). I want “my language” to use right-to-left but the resulting PHP code has to apply parenthesis to get the correct result in PHP (with the link to Wikipedia, result needs to be “train” instead of “horse”).

So for language translation would a CST be better? Is operator precedence usually built into a CST? Is there anything in-between? Are there any examples comparing both trees with a simple algebra equation? Any examples illustrating a ternary operator?

(Is “transcode” the correct term for “programming language translation”? A Google search brings up converting media.)

What I’m trying to figure out is: When is it more appropriate to use one over the other?

  • 1 1 Answer
  • 0 Views
  • 0 Followers
  • 0
Share
  • Facebook
  • Report

Leave an answer
Cancel reply

You must login to add an answer.

Forgot Password?

Need An Account, Sign Up Here

1 Answer

  • Voted
  • Oldest
  • Recent
  • Random
  1. Editorial Team
    Editorial Team
    2026-05-30T12:08:18+00:00Added an answer on May 30, 2026 at 12:08 pm

    An AST that models all the semantic details of the source language is all you need. By definition, if it does model the semantics correctly, and your langauge includes a ternary operator, then it will model the specific order in which the operators are applied (e.g, the results of the predecence modulo overrides such as parentheses) correctly too.

    So your problem isn’t in the AST. It is generating to another language using similar (ternary) operators whose precedence is different.

    This is an age-old problem in code generation: the operators of the target don’t quite match the operators of the source, and so the output can’t be one-to-one. In your case, you should be able to solve the problem by generating PHP ternary operators with parentheses around them to control the order to achieve what the original semantics are, so this isn’t a big problem.

    In general, generating sequences of code that achieve a desired result can be pretty complicated, and there’s lots of ways to do it. That’s why the compiler books are thick rather than thin. You seem to have implicitly settled on “get AST, walk AST, spit code”; this is almost an on-the-fly code generator. And this works adequately if you don’t care if the generated code is particularly good, and the target language is pretty close to the source language.

    If the code generation problem is more complex, what normally happens is the AST is used to generate what amounts to a data flow model of the computation, consisting of operators that produce results, and consume results from previous operators, grounding out in “operators” that fetch variable values and constants. Then the data flow representation is traversed to generate the code; this has the advantage that you can choose an operator in the data flow representation, find a matching code sequence in the target language, generate that, and then worry about how the operands are collected. Better schemes match data flow subgraphs (representing equivalent compound target language constructs) to the produced data flow graph; this can produce significantly better code. Often, one can apply target-language specific optimizations after raw code generation to produce even better code. In both cases, you have to worry about managing the operator results; can they be fed directly to the next target language operator, or must they go into some kind of temporary storage (for machine code, this can be another register or a memory location). Doing all this isn’t easy; again, that’s why the compiler books aren’t thin.

    A variation on this idea is source-to-source program transformations. This maps constructs in source code “directly” to constructs in target code, although this is usually done behind the scenes by operating on ASTs because unparsed programming language text is hard to match. Our DMS Software Reengineering Toolkit is an example of this kind of system. With such a tool, you write patterns in the source language (which implicitly match against a parse tree), and corresponding patterns in the target langauge (implicitly producing target language ASTs). You can write complex source or target constructs giving much of the effect of the data flow graph matching above. Post-generation optimization consists of more rewrite rules that transform the target code to target code.

    Bottom line: Having an AST isn’t really enough unless your translation is truly trivial.
    You can read more about what you do need at this SO answer: https://stackoverflow.com/a/3460977/120163

    WARNING: Loud opinion follows.

    Re “Transcoder”: I much prefer the term “compilation”, “translation”, or “source-to-source” compiler. I’ve been building program analysis and manipulation tools for almost 40 years. I’d never heard the term “transcoder” until I encountered this SO question: Experience migrating legacy Cobol/PL1 to Java and a response describing IMHO a truly awful code translation scheme called NACA. I’ve heard since that this term is gaining traction; I don’t see why we had to invent another term when we have perfectly adequate ones. Usually this is a sign of somebody inventing a high priesthood; “let’s invent a shiny new term so people don’t really understand what we’re doing”. I’m happy to leave that term to such truly awful translations.

    • 0
    • Reply
    • Share
      Share
      • Share on Facebook
      • Share on Twitter
      • Share on LinkedIn
      • Share on WhatsApp
      • Report

Sidebar

Related Questions

link Im having trouble converting the html entites into html characters, (&# 8217;) i
I have a jquery bug and I've been looking for hours now, I can't
I have a string like this: La Torre Eiffel paragonata all’Everest What PHP function
this is what i have right now Drawing an RSS feed into the php,
I am doing a simple coin flipping experiment for class that involves flipping a
I would like to run a str_replace or preg_replace which looks for certain words
I have a French site that I want to parse, but am running into
I am currently running into a problem where an element is coming back from
I'm parsing an RSS feed that has an ’ in it. SimpleXML turns this
I ran into a problem. Wrote the following code snippet: teksti = teksti.Trim() teksti

Explore

  • Home
  • Add group
  • Groups page
  • Communities
  • Questions
    • New Questions
    • Trending Questions
    • Must read Questions
    • Hot Questions
  • Polls
  • Tags
  • Badges
  • Users
  • Help
  • SEARCH

Footer

© 2021 The Archive Base. All Rights Reserved
With Love by The Archive Base

Insert/edit link

Enter the destination URL

Or link to existing content

    No search term specified. Showing recent items. Search or use up and down arrow keys to select an item.