The Archive Base

Editorial Team
Asked: June 16, 2026

From the python documentation on regex, regarding the '\' character:

The solution is to use Python’s raw string notation for regular
expression patterns; backslashes are not handled in any special way in
a string literal prefixed with 'r'. So r"\n" is a two-character string
containing '\' and 'n', while "\n" is a one-character string
containing a newline. Usually patterns will be expressed in Python
code using this raw string notation.

What is this raw string notation? If you use a raw string format, does that mean "*" is taken as a literal character rather than a zero-or-more indicator? That obviously can’t be right, or else regex would completely lose its power. But then if it’s a raw string, how does it recognize newline characters if "\n" is literally a backslash and an "n"?

I don’t follow.

Edit for bounty:

I’m trying to understand how a raw string regex matches newlines, tabs, and character sets, e.g. \w for words or \d for digits or all whatnot, if raw string patterns don’t recognize backslashes as anything more than ordinary characters. I could really use some good examples.

1 Answer

  1. Editorial Team
    Added an answer on June 16, 2026 at 3:40 am

    Zarkonnen’s response does answer your question, but not directly. Let me try to be more direct, and see if I can grab the bounty from Zarkonnen.

    You will perhaps find this easier to understand if you stop using the terms "raw string regex" and "raw string patterns". These terms conflate two separate concepts: the representations of a particular string in Python source code, and what regular expression that string represents.

    In fact, it’s helpful to think of these as two different programming languages, each with their own syntax. The Python language has source code that, among other things, builds strings with certain contents, and calls the regular expression system. The regular expression system has source code that resides in string objects, and matches strings. Both languages use backslash as an escape character.
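
The question asks whether a raw string turns "*" into a literal character. A minimal sketch of the two-language view, assuming only the standard re module (the variable names are illustrative, and the code is written so that it also runs under Python 3): the r prefix affects only how Python reads backslashes, so a pattern without backslashes is the same string either way, and the regular expression system still treats * as a repetition operator.

```python
import re

# The r prefix changes only how Python reads backslashes in the literal.
# With no backslash present, raw and conventional forms give the same string.
raw_pattern = r"ab*c"
plain_pattern = "ab*c"
same_string = raw_pattern == plain_pattern

# The regex language still reads * as "zero or more of the previous item".
matches_repeats = re.match(raw_pattern, "abbbc") is not None
matches_literal_star = re.match(raw_pattern, "ab*c") is not None

print(same_string)           # True
print(matches_repeats)       # True
print(matches_literal_star)  # False
```

The last result is False because the pattern never looks for a literal asterisk in the text.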

    First, understand that a string is a sequence of characters (i.e. bytes or Unicode code points; the distinction doesn’t much matter here). There are many ways to represent a string in Python source code. A raw string is simply one of these representations. If two representations result in the same sequence of characters, they produce equivalent behaviour.

    Imagine a 2-character string, consisting of the backslash character followed by the n character. If you know that the character value for backslash is 92, and for n is 110, then this expression generates our string:

    s = chr(92)+chr(110)
    print len(s), s
    
    2 \n
    

    The conventional Python string notation "\n" does not generate this string. Instead it generates a one-character string with a newline character. The Python docs 2.4.1. String literals say, "The backslash (\) character is used to escape characters that otherwise have a special meaning, such as newline, backslash itself, or the quote character."

    s = "\n"
    print len(s), s
    
    1 
     
    

    (Note that the newline isn’t visible in this example, but if you look carefully, you’ll see a blank line after the "1".)

    To get our two-character string, we have to use another backslash character to escape the special meaning of the original backslash character:

    s = "\\n"
    print len(s), s
    
    2 \n
    

    What if you want to represent strings that have many backslash characters in them? Python docs 2.4.1. String literals continue, "String literals may optionally be prefixed with a letter ‘r’ or ‘R’; such strings are called raw strings and use different rules for interpreting backslash escape sequences." Here is our two-character string, using raw string representation:

    s = r"\n"
    print len(s), s
    
    2 \n
    

    So we have three different string representations, all giving the same string, or sequence of characters:

    print chr(92)+chr(110) == "\\n" == r"\n"
    True
    

    Now, let’s turn to regular expressions. The Python docs, 7.2. re — Regular expression operations says, "Regular expressions use the backslash character (‘\’) to indicate special forms or to allow special characters to be used without invoking their special meaning. This collides with Python’s usage of the same character for the same purpose in string literals…"

    If you want a Python regular expression object which matches a newline character, the conventional choice is a 2-character string, consisting of the backslash character followed by the n character. The following lines of code all set prog to a regular expression object which recognises a newline character:

    import re
    prog = re.compile(chr(92)+chr(110))
    prog = re.compile("\\n")
    prog = re.compile(r"\n")
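
One way to check the claim about these three spellings, as a small sketch (the list names are illustrative, and the code is written so that it also runs under Python 3): each pattern string has length 2, and each compiled object matches a one-character newline string.

```python
import re

# Three spellings of the same two-character string: backslash, then n.
patterns = [chr(92) + chr(110), "\\n", r"\n"]

# Each one compiles to a regex that matches a single newline character.
results = [re.compile(p).match("\n") is not None for p in patterns]

# The compiled object keeps the pattern string it was built from.
lengths = [len(re.compile(p).pattern) for p in patterns]

print(results)  # [True, True, True]
print(lengths)  # [2, 2, 2]
```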
    

    So why is it that "Usually patterns will be expressed in Python code using this raw string notation."? Because regular expressions are frequently static strings, which are conveniently represented as string literals. And from the different string literal notations available, raw strings are a convenient choice, when the regular expression includes a backslash character.
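
The convenience shows most clearly when the regular expression has to match a backslash itself. A rough sketch, assuming a made-up Windows-style path as sample data (written so that it also runs under Python 3): the regular expression source needs two backslashes, which takes four in a conventional literal but only two in a raw one.

```python
import re

# A Windows-style path containing single backslashes (hypothetical sample data).
path = "C:" + chr(92) + "temp" + chr(92) + "notes.txt"

# To match one literal backslash, the regex source must be two backslashes.
conventional = "\\\\"   # four backslashes in the literal, two in the string
raw = r"\\"             # two backslashes in the literal, two in the string

print(conventional == raw)         # True
print(len(raw))                    # 2
print(len(re.findall(raw, path)))  # 2
```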

    Questions

    Q: what about the expression re.compile(r"\s\tWord")? A: It’s easier to understand by separating the string from the regular expression compilation, and understanding them separately.

    s = r"\s\tWord"
    prog = re.compile(s)
    

    The string s contains eight characters: a backslash, an s, a backslash, a t, and then four characters Word.

    Q: What happens to the tab and space characters? A: At the Python language level, string s doesn’t contain any tab or space character. It starts with four characters: backslash, s, backslash, t. The regular expression system, meanwhile, treats that string as source code in the regular expression language, where it means "match a string consisting of a whitespace character, a tab character, and the four characters Word".

    Q: How do you match those if that’s being treated as backslash-s and backslash-t? A: Maybe the question is clearer if the words ‘you’ and ‘that’ are made more specific: how does the regular expression system match the expressions backslash-s and backslash-t? As ‘any whitespace character’ and as ‘tab character’.
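
The two levels can be inspected separately. The following sketch reuses s and prog from above; the test strings are made up for illustration, and the code is written so that it also runs under Python 3.

```python
import re

s = r"\s\tWord"
prog = re.compile(s)

# Python level: eight ordinary characters, none of them a tab or a space.
print(len(s))      # 8
print("\t" in s)   # False

# Regex level: \s means any whitespace character, \t means a tab.
print(prog.match(" \tWord") is not None)    # True: space, tab, Word
print(prog.match("\n\tWord") is not None)   # True: newline counts as whitespace
print(prog.match("  Word") is not None)     # False: second character is not a tab
```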

    Q: Or what if you have the 3-character string backslash-n-newline? A: In the Python language, the 3-character string backslash-n-newline can be represented as conventional string "\\n\n", or raw plus conventional string r"\n" "\n", or in other ways. Given that 3-character string as a pattern, the regular expression system matches any two consecutive newline characters: the escape backslash-n matches one newline, and the literal newline character matches the next.
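
A short sketch of that last case, with illustrative variable names and sample text (written so that it also runs under Python 3). It assumes the pattern is compiled without the re.VERBOSE flag, since in verbose mode whitespace in the pattern is ignored.

```python
import re

# Two spellings of the 3-character string backslash, n, newline.
conventional = "\\n\n"
mixed = r"\n" "\n"   # adjacent literals are concatenated by Python

print(conventional == mixed)   # True
print(len(mixed))              # 3

prog = re.compile(mixed)
# Regex level: the escape \n and the literal newline each match one newline.
print(prog.search("first\n\nsecond") is not None)   # True: two newlines in a row
print(prog.search("first\nsecond") is not None)     # False: only one newline
```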

    N.B. All examples and document references are to Python 2.7.

    Update: Incorporated clarifications from answers of @Vladislav Zorov and @m.buettner, and from follow-up question of @Aerovistae.
