The Archive Base

Editorial Team
Asked: June 12, 2026 (2026-06-12T09:22:52+00:00)


I was designing a regex to split all the actual words from a given text:

Input Example:

"John's mom went there, but he wasn't there. So she said: 'Where are you'"

Expected Output:

["John's", "mom", "went", "there", "but", "he", "wasn't", "there", "So", "she", "said", "Where", "are", "you"]

I thought of a regex like this:

"(([^a-zA-Z]+')|('[^a-zA-Z]+))|([^a-zA-Z']+)"

After splitting in Python, the result contains None items and strings made up only of spaces.

How can I get rid of the None items? And why didn't the spaces get consumed by the split?


Edit:

Splitting on spaces will give items like: ["there."]

Splitting on non-letters will give items like: ["John","s"]

Splitting on non-letters except ' will give items like: ["'Where","you'"]

1 Answer

    Editorial Team
    Added an answer on June 12, 2026 at 9:22 am

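    Instead of a regex, you can use string functions:

    to_be_removed = ".,:!"  # all characters to be removed
    s = "John's mom went there, but he wasn't there. So she said: 'Where are you!!'"

    # Delete every unwanted character, then split on whitespace
    for c in to_be_removed:
        s = s.replace(c, '')
    words = s.split()
    print(words)  # 'Where and you' still carry their quote marks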
    

    BUT, in your example you do not want to remove the apostrophe in John's, while you do want to remove it in you!!'. Character-by-character replacement cannot tell these two cases apart, so it fails at that point and you need a more finely tuned regex.
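If you would rather stay with string methods, one possible workaround (not part of the original answer) is to strip punctuation only from the two ends of each token. The sketch below assumes that `string.punctuation` covers every character you want to drop; note that it would also strip a trailing possessive apostrophe such as the one in Moss'.

```python
import string

s = "John's mom went there, but he wasn't there. So she said: 'Where are you!!'"

# Strip punctuation from both ends of each whitespace-separated token.
# Apostrophes inside a word (John's, wasn't) are left untouched.
words = [token.strip(string.punctuation) for token in s.split()]
words = [w for w in words if w]  # drop tokens that were pure punctuation
print(words)
```

This keeps internal apostrophes because `str.strip` only removes characters from the edges of a string.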

    EDIT: a simple regex can probably solve your problem:

    (\w[\w']*)
    

    It matches a run that starts with a word character (\w: a letter, digit, or underscore) and keeps going while the next character is a word character or an apostrophe.

    (\w[\w']*\w)
    

    This second regex is for a very specific situation. The first regex can capture words like you'. This one avoids that and only captures an apostrophe if it is inside the word (not at the beginning or at the end). The trade-off is that the second regex cannot capture the trailing apostrophe in Moss' mom. You must decide whether you want to capture the trailing apostrophe in names that end with s and denote possession.
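The difference between the two patterns is easiest to see side by side. A small sketch, using a made-up sentence that contains both a possessive ending in s and a quoted word:

```python
import re

text = "Moss' mom said 'you'"

# First pattern: a trailing apostrophe is kept (Moss', but also you')
first = re.findall(r"\w[\w']*", text)
print(first)
# ["Moss'", 'mom', 'said', "you'"]

# Second pattern: the match must end on a word character,
# so trailing apostrophes are dropped in both cases
second = re.findall(r"\w[\w']*\w", text)
print(second)
# ['Moss', 'mom', 'said', 'you']
```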

    Example:

    import re

    # Raw string, so \w is not treated as a string escape sequence
    rgx = re.compile(r"(\w[\w']*\w)")
    s = "John's mom went there, but he wasn't there. So she said: 'Where are you!!'"
    print(rgx.findall(s))
    
    ["John's", 'mom', 'went', 'there', 'but', 'he', "wasn't", 'there', 'So', 'she', 'said', 'Where', 'are', 'you']
    

    UPDATE 2: I found a bug in my regex! Because it requires at least two word characters, it cannot capture single-character words, for example a single letter followed by an apostrophe like A'. The fixed regex is here:

    (\w[\w']*\w|\w)
    
    import re

    # The second alternative (\w) picks up single-character words
    rgx = re.compile(r"(\w[\w']*\w|\w)")
    s = "John's mom went there, but he wasn't there. So she said: 'Where are you!!' 'A a'"
    print(rgx.findall(s))
    
    ["John's", 'mom', 'went', 'there', 'but', 'he', "wasn't", 'there', 'So', 'she', 'said', 'Where', 'are', 'you', 'A', 'a']
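As for the None items in the question: `re.split` returns the text of every capturing group along with the pieces, and a group that did not take part in a given match is reported as None. That is also why the spaces appear in the output; they did match, and the capturing group handed them back. A sketch of the effect on a toy input, followed by the question's pattern rewritten with non-capturing groups (the rewrite is an illustration, not the original poster's code):

```python
import re

# Toy example: each capturing group adds one slot per split point,
# and a group that did not take part in the match shows up as None.
with_groups = re.split(r"(,)|(;)", "a,b;c")
print(with_groups)
# ['a', ',', None, 'b', None, ';', 'c']

# Non-capturing groups (?:...) keep the grouping but return no extra items.
without_groups = re.split(r"(?:,)|(?:;)", "a,b;c")
print(without_groups)
# ['a', 'b', 'c']

# The pattern from the question, rewritten with non-capturing groups.
splitter = re.compile(r"(?:[^a-zA-Z]+'|'[^a-zA-Z]+)|[^a-zA-Z']+")
s = "John's mom went there, but he wasn't there. So she said: 'Where are you'"
words = [w for w in splitter.split(s) if w]  # drop any empty strings
print(words)
```

Even without the None items, this split pattern leaves the closing apostrophe on the last word, since nothing follows it for the pattern to match. The `findall` approach above avoids that edge case.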
    
© 2021 The Archive Base. All Rights Reserved