I have had problems in Regexes to divide a code up into functional components. They can break or it can take a long time for them to finish. The experience raises a question:
‘When should I use a parser?’
Sign Up to our social questions and Answers Engine to ask questions, answer people’s questions, and connect with other people.
Login to our social questions & Answers Engine to ask questions answer people’s questions & connect with other people.
Lost your password? Please enter your email address. You will receive a link and will create a new password via email.
Please briefly explain why you feel this question should be reported.
Please briefly explain why you feel this answer should be reported.
Please briefly explain why you feel this user should be reported.
You should use a parser when you are interested in the lexical or semantic meaning of text, when patterns can vary. Parsers are generally overkill when you are simply looking to match or replace patterns of characters, regardless of their functional meaning.
In your case, you seem to be interested in the meaning behind the text (‘functional components’ of code), so a parser would be the better choice. Parsers can, however, internally make use of regex, so they should not be regarded as mutually exclusive.
A ‘parser’ does not automatically mean it has to be complicated, however. For example, if you are interested in C code blocks, you could simply parse nested groups of { and }. This parser would only be interested in two tokens (‘{‘ and ‘}’) and the blocks of text between them.
However, a simple regex comparison is not sufficient here because of the nested semantics. Take the following code:
A parser will understand the overall scope of Foo, as well as each inner scope contained within Foo (the if and else blocks). As it encounters each ‘{‘ token, it ‘understands’ their meaning. A simple search, however does not understand the meaning behind the text and may interpret the following to be a block, which we of course know is not correct: