The Archive Base

Editorial Team
Asked: June 1, 2026

This simple problem is killing me. I posted something earlier about trying to clean up a database of addresses, and somebody suggested GeoPy to check the validity of the addresses. It is a great tool that I did not know, but before using it I need to clean up the database a little, since geopy will not deal with messy formatting.
The solution is to use regular expressions, which I think I have more or less worked out for most of the types of addresses I have seen in the database.
Nevertheless, I am having problems with the last regex I defined (called r4 in the code), because it is returning part of the first parenthesis, which I don't need, and I don't know why I get extra whitespace when it returns the last group (City: London, Country: England).
Can anybody help?

import re

r1 = '\s*ForeignZip.*--\s*([\d\.]+)'
r2 = '(\w+)\W*,\W*(\w*)'
r3 = '(?<=\().*?(?=\))'
r4 = '(\w+\W\()'

Location = ['   ForeignZip (xxx) -- 734.450','Washington, DC.','London (England)']

for item in Location:
    print item
    match1 = re.search(r1,item)
    match2 = re.search(r2,item)
    match3 = re.search(r3,item)
    match4 = re.search(r4,item)

    if match1:
        print 'pattern 1 found:', match1.group(1)

    elif match2:
        print 'pattern 2 found: City :' + match2.group(1) + ", State :" + match2.group(2)

    elif match3:
        print 'pattern 3 found: City: ', match4.group() + ", Country :" + match3.group(0)

    else:
        print 'no match'

This returns

   ForeignZip (xxx) -- 734.450
pattern 1 found: 734.450
Washington, DC.
pattern 2 found: City :Washington, State :DC
London (England)
pattern 3 found: City:    London (, Country :England
1 Answer

  1. Editorial Team
    Added an answer on June 1, 2026 at 4:08 am

    Only a small change to your later regexes is necessary. There are probably a million ways to do this, but here is one:

    r3 = r'(\w+)\s+\((\w+)\)'   #Match a word (group1), whitespace followed by a '(' then another word (group2) and finally a closing ')'
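
To check this pattern against the sample from the question, here is a minimal sketch. It uses Python 3 syntax (print as a function) rather than the Python 2 print statement in the question, and the names sample, city and country are only for illustration:

```python
import re

# Same pattern as above: a word, whitespace, then a second word in parentheses.
r3 = r'(\w+)\s+\((\w+)\)'

sample = 'London (England)'
match3 = re.search(r3, sample)

# group(1) is the word before the parenthesis, group(2) the word inside it.
city, country = match3.group(1), match3.group(2)
print('City: ' + city + ', Country: ' + country)
```

Because the parentheses are escaped and sit outside the capturing groups, neither group contains the '(' or the space before it.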
    

    Or to make whitespace completely insignificant:

    r3 = r'(\s*(?:\w+\s*)*)\s*\(\s*((?:\w+\s*)+)\s*\)'
    

    This is basically the previous regex with \w+ replaced by a repeated (?:\w+\s*) group, which lets each part match several words. The (?:...) groups are non-capturing, so they never save what they matched, and the numbered groups stay the same: group 1 is still the city and group 2 the country.
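
A quick way to see what the whitespace-tolerant pattern captures is the sketch below (Python 3 syntax). The input string is made up for illustration and is not from the question's data. Note that the captured groups can still carry leading or trailing whitespace, so this sketch calls strip() on them:

```python
import re

# Whitespace-tolerant variant of r3 from the answer.
r3 = r'(\s*(?:\w+\s*)*)\s*\(\s*((?:\w+\s*)+)\s*\)'

# Hypothetical input: multi-word names, uneven spacing.
sample = '  New York(United Kingdom )'
match3 = re.search(r3, sample)

# The raw groups keep surrounding whitespace, so trim them before use.
city = match3.group(1).strip()
country = match3.group(2).strip()
print('City: ' + city + ', Country: ' + country)
```

Patterns with nested quantifiers such as (?:\w+\s*)* can backtrack heavily on long inputs that contain no parenthesis, so it is probably worth trying this on a sample of the real data before relying on it.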

    Then change the third test to:

    elif match3:
        print 'pattern 3 found: City : '+ match3.group(1) + ", Country :" + match3.group(2)
    

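    I also removed r4 since it isn't necessary anymore. It returned 'London (' because match4.group() gives back the whole match, and the pattern itself includes the character after the word and the '('. I also changed the ',' to a '+' for consistency and added a space after 'City :'. The ',' in a print statement inserts a space between items, which is where the extra whitespace came from.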
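
Putting the pieces together, the whole loop might look like the sketch below. It is written for Python 3, and the describe helper and the results list are hypothetical names added here so the output can be checked; the patterns and sample data are the ones from the question and the answer above:

```python
import re

r1 = r'\s*ForeignZip.*--\s*([\d\.]+)'
r2 = r'(\w+)\W*,\W*(\w*)'
r3 = r'(\w+)\s+\((\w+)\)'   # replaces the old r3 and r4

locations = ['   ForeignZip (xxx) -- 734.450', 'Washington, DC.', 'London (England)']

def describe(item):
    """Return one result line for an address string (hypothetical helper)."""
    match1 = re.search(r1, item)
    match2 = re.search(r2, item)
    match3 = re.search(r3, item)

    if match1:
        return 'pattern 1 found: ' + match1.group(1)
    elif match2:
        return 'pattern 2 found: City :' + match2.group(1) + ', State :' + match2.group(2)
    elif match3:
        return 'pattern 3 found: City : ' + match3.group(1) + ', Country :' + match3.group(2)
    return 'no match'

results = [describe(item) for item in locations]
for line in results:
    print(line)
```

Building each line with '+' and returning it keeps the spacing fully under your control, instead of depending on how print joins its arguments.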

    Also note that when dealing with regex, it is often nice to use "raw" strings. This stops Python from interpreting backslash escapes in your string before the regex engine sees them. To see the difference, try:

    print ("\n")  #prints newline
    print (r"\n") #prints "\n"
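
The difference matters most for escapes that mean one thing to Python and another to the regex engine. A small sketch using \b, which is a backspace character in a normal string but a word boundary in a regex (the variable names are only illustrative):

```python
import re

# In a normal string "\b" becomes a single backspace character.
plain = "\bcat\b"
# In a raw string the backslash is kept, so the regex engine sees a word boundary.
raw = r"\bcat\b"

text = "a cat sat"
plain_match = re.search(plain, text)   # looks for backspace + "cat" + backspace
raw_match = re.search(raw, text)       # looks for the whole word "cat"

print(len(plain), len(raw))
print(plain_match)
print(raw_match.group(0))
```

The non-raw pattern finds nothing in ordinary text, while the raw one matches the word as intended.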
    