Sign Up

Sign Up to our social questions and Answers Engine to ask questions, answer people’s questions, and connect with other people.

Have an account? Sign In

Have an account? Sign In Now

Sign In

Login to our social questions & Answers Engine to ask questions answer people’s questions & connect with other people.

Sign Up Here

Forgot Password?

Don't have account, Sign Up Here

Forgot Password

Lost your password? Please enter your email address. You will receive a link and will create a new password via email.

Have an account? Sign In Now

You must login to ask a question.

Forgot Password?

Need An Account, Sign Up Here

Please briefly explain why you feel this question should be reported.

Please briefly explain why you feel this answer should be reported.

Please briefly explain why you feel this user should be reported.

Sign InSign Up

The Archive Base

The Archive Base Logo The Archive Base Logo

The Archive Base Navigation

  • SEARCH
  • Home
  • About Us
  • Blog
  • Contact Us
Search
Ask A Question

Mobile menu

Close
Ask a Question
  • Home
  • Add group
  • Groups page
  • Feed
  • User Profile
  • Communities
  • Questions
    • New Questions
    • Trending Questions
    • Must read Questions
    • Hot Questions
  • Polls
  • Tags
  • Badges
  • Buy Points
  • Users
  • Help
  • Buy Theme
  • SEARCH
Home/ Questions/Q 1007593
In Process

The Archive Base Latest Questions

Editorial Team
  • 0
Editorial Team
Asked: May 16, 20262026-05-16T08:38:28+00:00 2026-05-16T08:38:28+00:00

I was wondering how stackoverflow parses all sorts of different code and identifies keywords,

  • 0

I was wondering how stackoverflow parses all sorts of different code and identifies keywords, special characters, whitespace formatting, etc. It does this for most code I believe, and I’ve noticed it’s even sophisticated enough to understand the relationships between everything it parses, like so:

String mystring1 = "inquotes"; //incomment
String mystring2 = "inquotes//incomment";
String mystring3 = //incomment"inquotes";

Many IDEs do this also. How is this done?

Edit: Further explaination – I am not asking about the parsing of the text, my question is, once I am past that part.. is there something like a universal XML Scheme, or cross-code format hierarchy that describes which strings are keywords, which characters denote comments, text strings, logic operators, etc. Or must I become a syntax guru for any language I wish to parse accurately?

  • 1 1 Answer
  • 0 Views
  • 0 Followers
  • 0
Share
  • Facebook
  • Report

Leave an answer
Cancel reply

You must login to add an answer.

Forgot Password?

Need An Account, Sign Up Here

1 Answer

  • Voted
  • Oldest
  • Recent
  • Random
  1. Editorial Team
    Editorial Team
    2026-05-16T08:38:29+00:00Added an answer on May 16, 2026 at 8:38 am

    To really have your IDE/compiler/interpreter “understand” and colorize code you’ll need to parse it and pull out the different syntactical parts. The classic reference for this is the Dragon Book, “Compilers: Principles, Techniques, and Tools.” You can see some of the difficulty in constructs like this

    i+++++i; 
    

    or

    list<list<hash<list<int>,hash<int,<list>>>>>;
    //or just matching parens 
    

    Properly doing this is a hard problem. Some languages, like java, make this easier than others, such as C and C++ (which both have standards) or ruby (which doesn’t even have a spec and relies on the implementation as a spec). However, if you only want to do a few bits of highlighting you can skip large parts of the grammar and get an 80% solution more easily. I suspect that the SO engine knows about strings and a few different types of comments and this does well enough for their purpose.

    The difficulty between 80% and 100% is one reason that most IDEs have syntax highlighting for C++ but Visual C++ still doesn’t have C++ refactoring support. For highlighting a few mistakes are probably OK. When you’re refactoring you need to really understand variable scope in different namespaces and all sorts of pointer stuff too.

    • 0
    • Reply
    • Share
      Share
      • Share on Facebook
      • Share on Twitter
      • Share on LinkedIn
      • Share on WhatsApp
      • Report

Sidebar

Related Questions

Hey Everyone, I was wondering what methods are standard within the industry to do
I know stackoverflow uses OpenID authentication. I want to try and use this as
For great help from stackoverflow, the development for the Mac version of my program
I'm wondering how to make the effect of the tan rectangle as you scroll
Hello guys I was wondering why some websites which uses php (or any other
I am wondering if anyone has any thoughts on the best way to perform
Creating a voting system just like stackoverflow. Questions like this have been asking several
I was wondering if there is any good choice of rails plugin for facebook
I'm new to MVC2 and was wondering if it was possible to pass an
I'm trying to get the base url (for example: https://stackoverflow.com/ ) without using javascript.

Explore

  • Home
  • Add group
  • Groups page
  • Communities
  • Questions
    • New Questions
    • Trending Questions
    • Must read Questions
    • Hot Questions
  • Polls
  • Tags
  • Badges
  • Users
  • Help
  • SEARCH

Footer

© 2021 The Archive Base. All Rights Reserved
With Love by The Archive Base

Insert/edit link

Enter the destination URL

Or link to existing content

    No search term specified. Showing recent items. Search or use up and down arrow keys to select an item.