I’m parsing a source code file, and I want to remove all line comments

Asked: May 11, 2026

I’m parsing a source code file, and I want to remove all line comments (i.e. those starting with “//”) and multi-line comments (i.e. /* … */). However, if a multi-line comment contains at least one line break (\n), I want the output to have exactly one line break in its place.

For example, the code:

qwe /* 123
456 
789 */ asd

should turn exactly into:

qwe
asd

and not “qweasd” or:

qwe

asd

What would be the best way to do so?
Thanks

EDIT:
Example code for testing:

comments_test = "hello // comment\n"+\
                "line 2 /* a comment */\n"+\
                "line 3 /* a comment*/ /*comment*/\n"+\
                "line 4 /* a comment\n"+\
                "continuation of a comment*/ line 5\n"+\
                "/* comment */line 6\n"+\
                "line 7 /*********\n"+\
                "********************\n"+\
                "**************/\n"+\
                "line ?? /*********\n"+\
                "********************\n"+\
                "********************\n"+\
                "********************\n"+\
                "********************\n"+\
                "**************/\n"+\
                "line ??"

Expected results:

hello 
line 2 
line 3  
line 4
line 5
line 6
line 7
line ??
line ??

1 Answer

Editorial Team · Answer 1 · 2026-05-11T16:54:22+00:00

import re

comment_re = re.compile(
    r'(^)?[^\S\n]*/(?:\*(.*?)\*/[^\S\n]*|/[^\n]*)($)?',
    re.DOTALL | re.MULTILINE
)

def comment_replacer(match):
    start,mid,end = match.group(1,2,3)
    if mid is None:
        # single line comment
        return ''
    elif start is not None or end is not None:
        # multi line comment at start or end of a line
        return ''
    elif '\n' in mid:
        # multi line comment with line break
        return '\n'
    else:
        # multi line comment without line break
        return ' '

def remove_comments(text):
    return comment_re.sub(comment_replacer, text)

(^)? captures an empty string if the comment starts at the beginning of a line (this requires the MULTILINE flag); otherwise the group is None.
[^\S\n]* matches any run of whitespace except newlines. We don’t want to consume line breaks if the comment starts on its own line.
/\*(.*?)\*/ matches a multi-line comment and captures its content. The match is lazy, so two separate comments are never merged into one. The DOTALL flag makes . match newlines.
//[^\n]* matches a single-line comment. We can’t use . here because of the DOTALL flag.
($)? captures an empty string if the comment ends at the end of a line (again requiring the MULTILINE flag).
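
To make the role of the optional anchor groups concrete, here is a small sketch that lists the three captured groups for each comment match. It repeats the pattern above so it runs on its own; describe is a hypothetical helper name introduced only for this illustration.

```python
import re

# Same pattern as in the answer, repeated so this snippet is self-contained.
comment_re = re.compile(
    r'(^)?[^\S\n]*/(?:\*(.*?)\*/[^\S\n]*|/[^\n]*)($)?',
    re.DOTALL | re.MULTILINE
)

def describe(text):
    """Hypothetical helper: return (start, mid, end) for every comment match."""
    return [m.group(1, 2, 3) for m in comment_re.finditer(text)]

# Block comment at the start of a line: the (^) group is set to ''.
print(describe("/* a */x"))
# Block comment in the middle of a line: neither anchor group is set.
print(describe("x /* a */ y"))
# Line comment: mid is None, and ($) is set because it runs to the end of the line.
print(describe("x // a\ny"))
```

A group that did not take part in the match comes back as None, which is what comment_replacer relies on to tell the four cases apart.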

Examples:

>>> s = ("qwe /* 123\n"
...      "456\n"
...      "789 */ asd /* 123 */ zxc\n"
...      "rty // fgh\n")
>>> print('"' + '"\n"'.join(
...     remove_comments(s).splitlines()
... ) + '"')
"qwe"
"asd zxc"
"rty"
>>> comments_test = ("hello // comment\n"
...                  "line 2 /* a comment */\n"
...                  "line 3 /* a comment*/ /*comment*/\n"
...                  "line 4 /* a comment\n"
...                  "continuation of a comment*/ line 5\n"
...                  "/* comment */line 6\n"
...                  "line 7 /*********\n"
...                  "********************\n"
...                  "**************/\n"
...                  "line ?? /*********\n"
...                  "********************\n"
...                  "********************\n"
...                  "********************\n"
...                  "********************\n"
...                  "**************/\n"
...                  "line ??")
>>> print('"' + '"\n"'.join(
...     remove_comments(comments_test).splitlines()
... ) + '"')
"hello"
"line 2"
"line 3 "
"line 4"
"line 5"
"line 6"
"line 7"
"line ??"
"line ??"

Edits:

Updated to new specification.
Added another example.
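
For anyone who prefers a plain script to an interpreter session, the following is a minimal sketch that repeats the definitions from this answer and checks them against the first example in the question. The two extra inputs at the end are made up for illustration and are not from the question.

```python
import re

# Pattern and replacer repeated from the answer so the script is self-contained.
comment_re = re.compile(
    r'(^)?[^\S\n]*/(?:\*(.*?)\*/[^\S\n]*|/[^\n]*)($)?',
    re.DOTALL | re.MULTILINE
)

def comment_replacer(match):
    start, mid, end = match.group(1, 2, 3)
    if mid is None:
        return ''      # single-line comment
    if start is not None or end is not None:
        return ''      # block comment touching the start or end of a line
    if '\n' in mid:
        return '\n'    # block comment spanning lines, in the middle of code
    return ' '         # inline block comment between two pieces of code

def remove_comments(text):
    return comment_re.sub(comment_replacer, text)

# The question's first example: exactly one line break replaces the comment.
assert remove_comments("qwe /* 123\n456 \n789 */ asd") == "qwe\nasd"
# Illustrative inputs (assumptions, not from the question).
assert remove_comments("a /* x */ b") == "a b"
assert remove_comments("a // x\nb") == "a\nb"
```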
