I need help on some regex problem with chinese characters in python. 拉柏多公园 is

Asked: May 24, 2026 (2026-05-24T07:17:49+00:00)

I need help with a regex problem involving Chinese characters in Python.

“拉柏多公园” is the correct form of the word, but in a text I found “拉柏 多公 园”, with stray spaces between the characters. What regex should I use to fix it?

import re

name = "拉柏多公园"
line = "whatever whatever it is then there comes a 拉柏 多公 园 sort of thing"
line2 = "whatever whatever it is then there comes another拉柏 多公 园 sort of thing"
line3 = "whatever whatever it is then there comes yet another 拉柏 多公 园sort of thing"
line4 = "whatever whatever it is then there comes a拉柏 多公 园sort of thing"

firstchar = "拉"
lastchar = "园"

I need to replace the strings in the lines so that the output looks like this:

line = "whatever whatever it is then there comes a 拉柏多公园 sort of thing"
line2 = "whatever whatever it is then there comes another 拉柏多公园 sort of thing"
line3 = "whatever whatever it is then there comes yet another 拉柏多公园 sort of thing"
line4 = "whatever whatever it is then there comes a 拉柏多公园 sort of thing"

I tried this, but the regex is badly structured:

reline = line.replace (r"firstchar*lastchar", name) #
reline2 = reline.replace ("  ", " ")
print reline2

Can someone help me correct my regex?

Thanks

1 Answer

Editorial Team · Answer 1 · 2026-05-24T07:17:50+00:00

(I assume you’re using Python 3, since you’re using Unicode characters in regular strings. For Python 2, add u before each string literal.)

Python 3

import re

name = "拉柏多公园"
# the string of Chinese characters, with any number of spaces interspersed.
# The regex will match any surrounding spaces.
regex = r"\s*拉\s*柏\s*多\s*公\s*园\s*"

So you can replace each string with

reline = re.sub(regex, ' ' + name + ' ', line)
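To check the substitution against all four sample lines from the question, here is a minimal sketch. The helper name spaced_pattern is hypothetical (it is not part of the original answer); it just builds the same pattern from name instead of typing it out by hand.

```python
import re

name = "拉柏多公园"

def spaced_pattern(word):
    # Hypothetical helper: allow optional whitespace before, between,
    # and after the characters of `word`.
    return r"\s*" + r"\s*".join(re.escape(ch) for ch in word) + r"\s*"

regex = spaced_pattern(name)

lines = [
    "whatever whatever it is then there comes a 拉柏 多公 园 sort of thing",
    "whatever whatever it is then there comes another拉柏 多公 园 sort of thing",
    "whatever whatever it is then there comes yet another 拉柏 多公 园sort of thing",
    "whatever whatever it is then there comes a拉柏 多公 园sort of thing",
]

# Replace every spaced-out occurrence with the correct name, padded by one space.
fixed = [re.sub(regex, " " + name + " ", line) for line in lines]

for line in fixed:
    print(line)
```

Each printed line should contain 拉柏多公园 with exactly one space on either side, matching the desired output in the question.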

Python 2

# -*- coding: utf-8 -*-

import re

name = u"拉柏多公园"
# the string of Chinese characters, with any number of spaces interspersed.
# The regex will match any surrounding spaces.
regex = ur"\s*拉\s*柏\s*多\s*公\s*园\s*"

So you can replace each string with

reline = re.sub(regex, u' ' + name + u' ', line)

Discussion

The result is always surrounded by spaces. If the name can also appear at the start or end of a line, or directly before a comma or period, you’ll have to replace ' ' + name + ' ' with logic that adds a space only where one is needed.
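One possible way to handle those edge cases is to pass a function to re.sub instead of a fixed string. This is only a sketch under assumptions that are not in the original answer: the name normalize and the punctuation set are illustrative, and it assumes you want a space wherever the name touches other text, including other Chinese characters.

```python
import re

name = "拉柏多公园"
inner = r"\s*".join(re.escape(ch) for ch in name)
pattern = re.compile(r"\s*" + inner + r"\s*")

# Assumed set of punctuation that should directly follow the name (no space).
PUNCTUATION = ",.;:!?，。；：！？"

def normalize(line):
    def repl(match):
        before = line[:match.start()]
        after = line[match.end():]
        # No leading space at the start of the line.
        left = " " if before else ""
        # No trailing space at the end of the line or before punctuation.
        right = " " if after and after[0] not in PUNCTUATION else ""
        return left + name + right
    return pattern.sub(repl, line)

print(normalize("拉柏 多公 园 is nice"))
print(normalize("we went to 拉柏 多公 园."))
print(normalize("see a拉柏 多公 园 , then leave"))
```

Because the pattern still consumes the surrounding whitespace, the replacement function decides on its own whether to put a single space back on each side.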

Edit: fixed. You have to use re.sub from the re module; str.replace only matches literal text and does not understand regular expressions.
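To illustrate why the attempt in the question leaves the line unchanged, here is a small comparison, using one sample line from the question:

```python
import re

name = "拉柏多公园"
line = "whatever whatever it is then there comes a 拉柏 多公 园 sort of thing"

# str.replace searches for the literal text "firstchar*lastchar";
# it neither reads the variables nor treats "*" as a regex operator.
unchanged = line.replace(r"firstchar*lastchar", name)

# re.sub interprets its first argument as a regular expression.
fixed = re.sub(r"\s*拉\s*柏\s*多\s*公\s*园\s*", " " + name + " ", line)

print(unchanged == line)
print(fixed)
```

The first call finds nothing to replace, so it returns the line as it was; only the re.sub call collapses the spaced-out characters.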
