The Archive Base

Editorial Team
Asked: June 7, 2026

I have a text file containing entries like this: @markwarner VIRGINIA – Mark Warner


I have a text file containing entries like this:

@markwarner VIRGINIA - Mark Warner 
@senatorleahy VERMONT - Patrick Leahy NO 
@senatorsanders VERMONT - Bernie Sanders 
@orrinhatch UTAH - Orrin Hatch NO 
@jimdemint SOUTH CAROLINA - Jim DeMint NO 
@senmikelee UTAH -- Mike Lee 
@kaybaileyhutch TEXAS - Kay Hutchison 
@johncornyn TEXAS - John Cornyn 
@senalexander TENNESSEE - Lamar Alexander

I have written the following to remove the ‘NO’ and the dashes using regular expressions:

import re

politicians = open('testfile.txt')
text = politicians.read()

# Grab the 'no' votes
# Should be 11 entries
regex = re.compile(r'(no\s@[\w+\d+\.]*\s\w+\s?\w+?\s?\W+\s\w+\s?\w+)', re.I)
no = regex.findall(text)

## Make the list a string
newlist = ' '.join(no)

## Replace the dashes in the string with a space
deldash = re.compile('\s-*\s')
a = deldash.sub(' ', newlist)

# Delete 'NO' in the string
delno = re.compile('NO\s')
b = delno.sub('', a)

# make the string into a list
# problem with @jimdemint SOUTH CAROLINA Jim DeMint
regex2 = re.compile(r'(@[\w\d\.]*\s[\w\d\.]*\s?[\w\d\.]\s?[\w\d\.]*?\s+?\w+)', re.I)
lst1 = regex2.findall(b)

for i in lst1:
    print i

When I run the code, it captures the Twitter handle, state, and full name for every entry except Jim DeMint, whose surname is missing. I have set the regex to ignore case.

Any ideas? Why is the expression not capturing this surname?

1 Answer

  1. Editorial Team
    Added an answer on June 7, 2026 at 7:47 am

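    The regex misses it because his state name contains two words: SOUTH CAROLINA. The pattern has room for only three words after the handle, so the two-word state uses up the slot meant for the surname.

    Change your second regex to this: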

     (@[\w\d\.]*\s[\w\d\.]*\s?[\w\d\.]\s?[\w\d\.]*?\s+?\w+(?:\s\w+)?)
    

    I added

    (?:\s\w+)?
    

    This is an optional, non-capturing group that matches a whitespace character followed by one or more word characters (letters, digits, or underscores).
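
To see the effect of the optional group, here is a minimal sketch that runs the question's second regex and the patched version side by side. The `cleaned` string is an assumed two-entry sample of what the text looks like once the NOs and dashes are gone; it is not the asker's full file.

```python
import re

# Assumed sample: two entries after the 'NO' markers and dashes were stripped.
cleaned = "@orrinhatch UTAH Orrin Hatch @jimdemint SOUTH CAROLINA Jim DeMint"

# Second regex from the question.
original = re.compile(
    r'(@[\w\d\.]*\s[\w\d\.]*\s?[\w\d\.]\s?[\w\d\.]*?\s+?\w+)', re.I)

# Same regex with the optional trailing word (?:\s\w+)? appended.
patched = re.compile(
    r'(@[\w\d\.]*\s[\w\d\.]*\s?[\w\d\.]\s?[\w\d\.]*?\s+?\w+(?:\s\w+)?)', re.I)

original_matches = original.findall(cleaned)
patched_matches = patched.findall(cleaned)

print(original_matches)  # the DeMint entry stops at 'Jim'
print(patched_matches)   # the DeMint entry now includes the surname
```

The optional group does not swallow the next entry after a one-word state, because the following token starts with `@`, which `\w` does not match.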

    http://regexr.com?31fv5 shows that it properly matches the input with the NOs and dashes stripped

    EDIT:
    If you want one master regex to capture and split everything properly after you remove the NOs and dashes, use

    ((@[\w]+?\s)((?:(?:[\w]+?)\s){1,2})((?:[\w]+?\s){2}))
    

    Which you can play with here: http://regexr.com?31fvk

    The full match is available in $1, the Twitter handle in $2, the state in $3, and the name in $4. In Python these are m.group(1) through m.group(4) on the match object.
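
As a rough sketch of how those groups could be read in Python, the snippet below walks over a cleaned string with `finditer`. The sample text and the names `master`, `cleaned`, and `records` are illustrative assumptions; each entry here ends with a newline because the pattern expects whitespace after every word.

```python
import re

# Master regex from the answer, unchanged.
master = re.compile(r'((@[\w]+?\s)((?:(?:[\w]+?)\s){1,2})((?:[\w]+?\s){2}))')

# Assumed sample input: NOs and dashes already removed, one entry per line.
cleaned = "@orrinhatch UTAH Orrin Hatch\n@jimdemint SOUTH CAROLINA Jim DeMint\n"

records = []
for m in master.finditer(cleaned):
    handle = m.group(2).strip()  # $2: Twitter handle
    state = m.group(3).strip()   # $3: one- or two-word state
    name = m.group(4).strip()    # $4: first name and surname
    records.append((handle, state, name))

for record in records:
    print(record)
```

The `strip()` calls are there because every group includes the whitespace that follows its last word.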

    Each capturing group works as follows:

    (@[\w]+?\s)
    

    This matches an @ sign followed by at least one word character, taking as few as possible, up to and including a whitespace character.

    ((?:(?:[\w]+?)\s){1,2})
    

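    This matches and captures one or two words, which should be the state. It only works because of the next piece, which MUST contain exactly two words: if the state group takes too many words, the name group cannot match, and the engine backtracks.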

    ((?:[\w]+?\s){2})
    

    This matches and captures exactly two words, where a word is defined as as few word characters as possible followed by a whitespace character.
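
One consequence of that definition is worth checking: because every word must be followed by whitespace, a final entry with no trailing whitespace can be split in the wrong place. The sketch below illustrates this with a hypothetical helper, `split_entry`, which is not part of the original answer.

```python
import re

# Master regex from the answer, unchanged.
master = re.compile(r'((@[\w]+?\s)((?:(?:[\w]+?)\s){1,2})((?:[\w]+?\s){2}))')


def split_entry(entry):
    """Hypothetical helper: return (handle, state, name) or None if no match."""
    m = master.search(entry)
    if m is None:
        return None
    return tuple(part.strip() for part in m.group(2, 3, 4))


entry = "@jimdemint SOUTH CAROLINA Jim DeMint"

# No whitespace after 'DeMint': the name group cannot end there, so the
# engine backtracks and gives the state group only one word.
without_trailing_space = split_entry(entry)

# With a trailing space, the state keeps both words.
with_trailing_space = split_entry(entry + " ")

print(without_trailing_space)
print(with_trailing_space)
```

If the input might lack a final newline or space, appending one before matching is one simple way to avoid the mis-split.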

© 2021 The Archive Base. All Rights Reserved