I am new to python and was wondering if there was a better solution


Asked: June 16, 2026


I am new to Python and was wondering whether there is a better way to match all forms of URLs that might be found in a given string. From googling, there seem to be a lot of solutions that extract domains, replace URLs with links, etc., but none that remove them from a string. I have included some examples below for reference. Thanks!

thestring = 'this is some text that will have one form or the other url embeded, most will have valid URLs while there are cases where they can be bad. for eg, http://www.google.com and http://www.google.co.uk and www.domain.co.uk and etc.'

URLless_string = re.sub(r'(?i)\b((?:https?://|www\d{0,3}[.]|[a-z0-9.\-]+[.][a-z]{2,4}/)(?:[^\s()<>]+|\(([^\s()<>]+|(\([^\s()<>]+\)))*\))+(?:\(([^\s()<>]+|(\([^\s()<>]+\)))*\)|[^\s`!()\[\]{};:'".,<>?«»“”‘’]))', '', thestring)

print '==' + URLless_string + '=='

Error Log:

C:\Python27>python test.py
  File "test.py", line 7
SyntaxError: Non-ASCII character '\xab' in file test.py on line 7, but no encoding declared; see http://www.python.org/peps/pep-0263.html for details


1 Answer

Editorial Team · Answer 1 · 2026-06-16T15:49:00+00:00

Add an encoding declaration at the top of your source file (the regex string contains non-ASCII characters such as »), e.g.:

# -*- coding: utf-8 -*-
import re
...

Also wrap the regex string in triple quotes (''' or """) instead of single quotes, because the pattern itself contains both quote characters (' and "):

r'''(?i)\b((?:https?://|www\d{0,3}[.]|[a-z0-9.\-]+[.][a-z]{2,4}/)(?:[^\s()<>]+|\(([^\s()<>]+|(\([^\s()<>]+\)))*\))+(?:\(([^\s()<>]+|(\([^\s()<>]+\)))*\)|[^\s`!()\[\]{};:'".,<>?«»“”‘’]))'''
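For reference, here is a minimal sketch of how the two fixes might fit together in one script. It is written for Python 3 (hence `print(...)` as a function), where source files are read as UTF-8 by default; the coding line is harmless there and still required on Python 2. The names `URL_PATTERN`, `remove_urls` and `sample` are only illustrative and do not come from the question.

```python
# -*- coding: utf-8 -*-
import re

# Pattern copied from the answer above. The raw triple-quoted literal lets the
# pattern contain both ' and " without ending the string early.
URL_PATTERN = re.compile(
    r'''(?i)\b((?:https?://|www\d{0,3}[.]|[a-z0-9.\-]+[.][a-z]{2,4}/)(?:[^\s()<>]+|\(([^\s()<>]+|(\([^\s()<>]+\)))*\))+(?:\(([^\s()<>]+|(\([^\s()<>]+\)))*\)|[^\s`!()\[\]{};:'".,<>?«»“”‘’]))'''
)


def remove_urls(text):
    """Return text with every match of URL_PATTERN deleted (hypothetical helper)."""
    return URL_PATTERN.sub('', text)


# Shortened version of the sample string from the question.
sample = 'for eg, http://www.google.com and http://www.google.co.uk and www.domain.co.uk and etc.'
print('==' + remove_urls(sample) + '==')
```

Note that deleting a URL leaves the spaces on both sides of it, so the result contains double spaces; collapse them separately if that matters.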
