
The Archive Base


Editorial Team
Asked: May 14, 2026 (2026-05-14T16:33:05+00:00)

One particular quirk of the (otherwise quite powerful) re module in Python is that


One particular quirk of the (otherwise quite powerful) re module in Python is that, in versions before 3.7, re.split() never splits a string on a zero-length match. For example, if I want to split a string along word boundaries, I get:

>>> import re
>>> re.split(r"\s+|\b", "Split along words, preserve punctuation!")
['Split', 'along', 'words,', 'preserve', 'punctuation!']

instead of

['', 'Split', 'along', 'words', ',', 'preserve', 'punctuation', '!']

Why does it have this limitation? Is it by design? Do other regex flavors behave like this?

1 Answer

  1. Editorial Team
    Added an answer on May 14, 2026 at 4:33 pm

    It’s a deliberate design decision, and one that could have gone either way. Tim Peters explained it in this post:

    For example, if you split "abc" by the pattern x*, what do you
    expect? The pattern matches (with length 0) at 4 places,
    but I bet most people would be surprised to get

    ['', 'a', 'b', 'c', '']

    back instead of (as they do get)

    ['abc']
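
The four zero-length matches in this example can be listed with re.finditer(). The helper `split_on_all_matches` below is a hypothetical name, not part of the re module. It is only a minimal sketch of what splitting at every match, empty ones included, would produce.

```python
import re

def split_on_all_matches(pattern, text):
    """Split text at every match of pattern, including zero-length ones.

    Hypothetical helper for illustration; not a function of the re module.
    """
    pieces = []
    last = 0  # end of the previous match
    for m in re.finditer(pattern, text):
        pieces.append(text[last:m.start()])
        last = m.end()
    pieces.append(text[last:])  # whatever follows the final match
    return pieces

# x* cannot consume anything in "abc", so it matches the empty string
# before each character and once more at the end of the string.
spans = [m.span() for m in re.finditer(r"x*", "abc")]
print(spans)   # [(0, 0), (1, 1), (2, 2), (3, 3)]

pieces = split_on_all_matches(r"x*", "abc")
print(pieces)  # ['', 'a', 'b', 'c', '']
```

Every character ends up in its own piece, with an empty string on each side, which is the result Tim Peters expected to surprise people.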

    Others disagree with him, though. Guido van Rossum doesn’t want the behavior changed because of backward-compatibility concerns, but he did say:

    I’m okay with adding a flag to enable this behavior though.

    Edit:

    There is a workaround posted by Jan Burgy:

    >>> import re
    >>> s = "Split along words, preserve punctuation!"
    >>> re.sub(r"\s+|\b", '\f', s).split('\f')
    ['', 'Split', 'along', 'words', ',', 'preserve', 'punctuation', '!']
    

    Here '\f' (form feed) can be replaced by any character that does not occur in the string.
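
Since Python 3.7, re.split() does split on empty matches, so the output of both re.split() and the re.sub() trick depends on the interpreter version. A version-independent alternative is to match the tokens instead of splitting between them. This is a different technique from the workaround above, sketched on the assumption that a token is either a run of word characters or a run of non-space punctuation. The name `tokenize` and the pattern are illustrative and do not come from the original post. Unlike the workaround, it produces no leading empty string.

```python
import re

def tokenize(text):
    """Return words and punctuation runs, dropping whitespace.

    Illustrative helper; the token definition is an assumption.
    """
    # \w+       matches a run of word characters
    # [^\w\s]+  matches a run of characters that are neither
    #           word characters nor whitespace (i.e. punctuation)
    return re.findall(r"\w+|[^\w\s]+", text)

s = "Split along words, preserve punctuation!"
tokens = tokenize(s)
print(tokens)
# ['Split', 'along', 'words', ',', 'preserve', 'punctuation', '!']
```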



© 2021 The Archive Base. All Rights Reserved