The Archive Base

Editorial Team
Asked: May 30, 2026 (2026-05-30T19:14:55+00:00)

I was trying to remove all comments and empty lines in a file with

I was trying to remove all comments and empty lines in a file with the help of a macro. I came up with this solution, which deletes the comments (with a bug described below) but is not able to delete the blank lines in between:

Sub CleanCode()
    Dim regexComment As String = "(REM [\d\D]*?[\r\n])|(?<SL>\'[\d\D]*?[\r\n])"
    Dim regexBlank As String = "^[\s|\t]*$\n"
    Dim replace As String = ""

    Dim selection As EnvDTE.TextSelection = DTE.ActiveDocument.Selection
    Dim editPoint As EnvDTE.EditPoint

    selection.StartOfDocument()
    selection.EndOfDocument(True)

    DTE.UndoContext.Open("Custom regex replace")
    Try
        Dim content As String = selection.Text
        Dim resultComment As String = System.Text.RegularExpressions.Regex.Replace(content, regexComment, replace)
        Dim resultBlank As String = System.Text.RegularExpressions.Regex.Replace(resultComment, regexBlank, replace)
        selection.Delete()
        selection.Collapse()
        Dim ed As EditPoint = selection.TopPoint.CreateEditPoint()
        ed.Insert(resultBlank)
    Catch ex As Exception
        DTE.StatusBar.Text = "Regex Find/Replace could not complete"
    Finally
        DTE.UndoContext.Close()
        DTE.StatusBar.Text = "Regex Find/Replace complete"
    End Try
End Sub

So, here is what it should look like before and after running the macro.

BEFORE

Public Class Class1
    Public Sub New()
        ''asdasdas
        Dim a As String = "" ''asdasd
        ''' asd ad asd
    End Sub


    Public Sub New(ByVal strg As String)

        Dim a As String = ""

    End Sub


End Class

AFTER

Public Class Class1
    Public Sub New()
        Dim a As String = ""
    End Sub
    Public Sub New(ByVal strg As String)
        Dim a As String = ""
    End Sub
End Class

There are two main problems with the macro:

  • It cannot delete the blank lines in between.
  • If there is a piece of code like this:

Dim a as String = "Name='Soham'"

then after running the macro it becomes:

Dim a as String = "Name='"
1 Answer

  1. Editorial Team
    Added an answer on May 30, 2026 at 7:14 pm (2026-05-30T19:14:57+00:00)

    To get rid of a line that contains whitespace or nothing, you can use this regex:

    (?m)^[ \t]*[\r\n]+
    

    Your regex, ^[\s|\t]*$\n, would work if you specified Multiline mode ((?m)), but it’s still incorrect. For one thing, the | matches a literal |; there’s no need to specify “or” in a character class. For another, \s matches any whitespace character, including TAB (\t), carriage-return (\r), and linefeed (\n), so the \t is redundant and the class can run across line separators, which makes the regex inefficient. For example, at the first blank line (after the end of the first Sub), the ^[\s|\t]* will initially try to match everything before the word Public, then it will back off to the end of the previous line, where the $\n can match.
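
The points above can be checked with a small sketch. It is written in Python rather than VB or C# so that it runs standalone; that choice, the sample text, and the names BLANK_LINES and remove_blank_lines are assumptions of this example and not part of the original answer. For these particular patterns, Python's re module handles (?m), ^, and character classes the same way.

```python
import re

# Port of the answer's first pattern; (?m) makes ^ match at every line start.
BLANK_LINES = re.compile(r"(?m)^[ \t]*[\r\n]+")


def remove_blank_lines(text: str) -> str:
    """Delete lines that are empty or contain only spaces and tabs."""
    return BLANK_LINES.sub("", text)


# Hypothetical sample with Windows line endings and two blank lines.
sample = "End Sub\r\n\r\n   \t\r\nPublic Sub New()\r\n"
cleaned = remove_blank_lines(sample)

# The question's pattern without Multiline mode: ^ only matches at the very
# start of the input, so nothing is removed.
unchanged = re.sub(r"^[\s|\t]*$\n", "", sample)

# Inside a character class, | is a literal pipe, not alternation.
pipe_is_literal = re.fullmatch(r"[\s|\t]*", " | ") is not None
```

Here cleaned keeps only the two code lines, while unchanged is identical to the input, which matches the first problem described in the question.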

    But a blank line, in addition to being empty or containing only horizontal whitespace (spaces or TABs), may also contain a comment. I choose to treat these “comment-only” lines as blank lines because it’s relatively easy to do, and it simplifies the task of matching comments in non-blank lines, which is much harder. Here’s my regex, which also needs Multiline mode so that ^ matches at the start of every line:

    ^[ \t]*(?:(?:REM|')[^\r\n]*)?[\r\n]+
    

    After consuming any leading horizontal whitespace, if I see a REM or ' signifying a comment, I consume that and everything after it until the next line separator. Notice that the only thing that’s required to be present is the line separator itself. Also notice the absence of the end anchor, $. It’s never necessary to use that when you’re explicitly matching the line separators, and in this case it would break the regex. In Multiline mode, $ matches only before a linefeed (\n) or at the very end of the input, not before a carriage-return (\r). (This behavior of the .NET flavor is incorrect and rather surprising, given Microsoft’s longstanding preference for \r\n as a line separator.)
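
As a rough illustration of the paragraph above, here is a Python port of the "blank or comment-only line" pattern, with (?m) added explicitly. The sample text and the names BLANK_OR_COMMENT_ONLY and drop_blank_and_comment_lines are hypothetical. The last check shows the $ behavior on \r\n input; Python's $ behaves like the .NET one described here, in that it matches before \n but not before \r.

```python
import re

# Port of the answer's pattern; (?m) is added so ^ matches at each line start.
BLANK_OR_COMMENT_ONLY = re.compile(r"(?m)^[ \t]*(?:(?:REM|')[^\r\n]*)?[\r\n]+")


def drop_blank_and_comment_lines(text: str) -> str:
    """Remove lines that are blank or contain nothing but a comment."""
    return BLANK_OR_COMMENT_ONLY.sub("", text)


# Hypothetical sample with Windows line endings.
sample = (
    "Public Sub New()\r\n"
    "    ''asdasdas\r\n"
    "\r\n"
    "    REM old note\r\n"
    "    Dim a As String = \"\"\r\n"
    "End Sub\r\n"
)
result = drop_blank_and_comment_lines(sample)

# With \r\n line endings, $ cannot match right after "End Sub" because the
# next character is \r, not \n.
dollar_fails_before_cr = re.search(r"(?m)^End Sub$", sample) is None
dollar_ok_with_cr = re.search(r"(?m)^End Sub\r?$", sample) is not None
```

Lines that contain code followed by a comment are deliberately left alone by this pass.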

    Matching the remaining comments is a fundamentally different task. As you’ve discovered, simply searching for REM or ' is no good because you might find it in a string literal, where it does not signify the start of a comment. What you have to do is start from the beginning of the line, consuming and capturing anything that’s not the beginning of a comment or a string literal. If you find a double-quote, go ahead and consume the string literal. If you find a REM or ', stop capturing and go ahead and consume the rest of the line. Then you replace the whole line with just the captured portion, i.e., everything before the comment. Here’s the regex:

    (?mn)^(?<line>[^\r\n"R']*(("[^"]*"|(?!REM)R)[^\r\n"R']*)*)(REM|')[^\r\n]*
    

    Or, more readably:

    (?mnx)            # Multiline, ExplicitCapture, and IgnorePatternWhitespace modes
    ^                 # beginning of line
    (?<line>          # capture in group "line"
      [^\r\n"R']*     # any number of "safe" characters
      (
        (
          "[^"]*"     # a string literal
          |
          (?!REM)R    # 'R' if it's not the beginning of 'REM'
        )
        [^\r\n"R']*   # more "safe" characters
      )*
    )                 # stop capturing
    (?:REM|')         # a comment sigil
    [^\r\n]*          # consume the rest of the line
    

    The replacement string would be "${line}". Some other notes:

    • Notice that this regex does not end with [\r\n]+ to consume the line separator, like the “blank lines” regex does.
    • It doesn’t end with $ either, for the same reason as before. The [^\r\n]* will greedily consume everything before the line separator, so the anchor isn’t needed.
    • The only thing that’s required to be present is the REM or '; we don’t bother matching any line that doesn’t contain a comment.
    • ExplicitCapture mode means I can use (...) instead of (?:...) for all the groups I don’t want to capture, but the named group, (?<line>...), still works.
    • Gnarly as it is, this regex would be a lot worse if VB supported multiline comments, or if its string literals supported backslash escapes.
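
The comment-stripping regex and its replacement can be tried out with the following sketch. It is a Python port, not the original .NET pattern: Python has no ExplicitCapture mode, so the unnamed groups are written as (?:...), the named group becomes (?P<line>...), and the replacement "${line}" becomes \g<line>. The names STRIP_COMMENT and strip_trailing_comments and the sample lines are hypothetical. The pattern is ported as-is, so it is case-sensitive and treats any uppercase REM outside a string literal as a comment sigil, even inside a longer identifier.

```python
import re

# Port of the answer's pattern: capture everything before the comment in
# "line", skipping over string literals, then consume the comment itself.
STRIP_COMMENT = re.compile(
    r"""(?m)^(?P<line>[^\r\n"R']*(?:(?:"[^"]*"|(?!REM)R)[^\r\n"R']*)*)(?:REM|')[^\r\n]*"""
)


def strip_trailing_comments(text: str) -> str:
    """Replace each commented line with the part that precedes the comment."""
    return STRIP_COMMENT.sub(r"\g<line>", text)


# The apostrophes live inside a string literal, so nothing should change.
with_string = "Dim a as String = \"Name='Soham'\""
# A real comment after an empty string literal.
with_comment = "Dim a As String = \"\" ''asdasd"

kept = strip_trailing_comments(with_string)
stripped = strip_trailing_comments(with_comment)

# For contrast, a naive (hypothetical) pattern that ignores string literals.
naive = re.sub(r"'[^\r\n]*", "", with_string)
```

Note that stripped still ends with the space that preceded the comment; the pattern removes the comment only, not the whitespace in front of it.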

    I don’t do VB, but here’s a demo in C#.
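
For readers who want to run the whole approach end to end, here is a sketch in Python (again a port, not the C# demo mentioned above). It applies the two passes from this answer to the BEFORE sample from the question. The third pass, which trims the whitespace left in front of a removed comment, is an addition of this sketch and not part of the answer; the names clean_code, TRAILING_COMMENT, and TRAILING_SPACE are hypothetical.

```python
import re

# Pass 1: blank and comment-only lines (answer's pattern, with (?m) added).
BLANK_OR_COMMENT_ONLY = re.compile(r"(?m)^[ \t]*(?:(?:REM|')[^\r\n]*)?[\r\n]+")
# Pass 2: comments that follow code (answer's pattern, ported to Python syntax).
TRAILING_COMMENT = re.compile(
    r"""(?m)^(?P<line>[^\r\n"R']*(?:(?:"[^"]*"|(?!REM)R)[^\r\n"R']*)*)(?:REM|')[^\r\n]*"""
)
# Pass 3 (assumption of this sketch): trim whitespace left at the end of a line.
TRAILING_SPACE = re.compile(r"(?m)[ \t]+(?=\r?$)")


def clean_code(source: str) -> str:
    """Remove blank lines, comment-only lines, and trailing comments."""
    without_blank = BLANK_OR_COMMENT_ONLY.sub("", source)
    without_comments = TRAILING_COMMENT.sub(r"\g<line>", without_blank)
    return TRAILING_SPACE.sub("", without_comments)


# The BEFORE sample from the question, joined with \n for brevity.
BEFORE = "\n".join([
    "Public Class Class1",
    "    Public Sub New()",
    "        ''asdasdas",
    "        Dim a As String = \"\" ''asdasd",
    "        ''' asd ad asd",
    "    End Sub",
    "",
    "",
    "    Public Sub New(ByVal strg As String)",
    "",
    "        Dim a As String = \"\"",
    "",
    "    End Sub",
    "",
    "",
    "End Class",
]) + "\n"

AFTER = clean_code(BEFORE)
print(AFTER)
```

Running the blank-line pass first means the comment pass only ever sees lines that contain code, which is the simplification this answer relies on.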
