I have a working pattern for English, but for my native language it is not working

Question

Asked: June 4, 2026 (2026-06-04T09:57:25+00:00)

I have a working pattern for English, but it does not work for my native language and it is giving me headaches. I have already asked several questions about encoding, and I know I underestimated it; it turned out to be a big problem. I spent some time reading about it, but the problem is still there. Now I am facing a regular expression UTF problem. The pattern is:

exactMatch = re.compile(r"([^\.]*\bтурција\b[^\.]*)\.", re.UNICODE)
print exactMatch.pattern
result= exactMatch.findall("турција е на врвот од индустријата. турција е на врвот од индустријата.")

It works for English. Its purpose is to return every sentence in a paragraph that contains the given word. Any suggestions?

I have also tried encode and decode, but nothing happens except an encoding error.


1 Answer

Editorial Team · Answer 1 · 2026-06-04T09:57:26+00:00

This will work:

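# -*- coding: utf-8 -*-
import re
# Python 2: the pattern (ur"...") and the input (u"...") are both unicode strings.
exactMatch = re.compile(ur"([^\.]*\bтурција\b[^\.]*)\.", re.UNICODE)
print exactMatch.pattern
result = exactMatch.findall(u"турција е на врвот од индустријата. турција е на врвот од индустријата.")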

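If you use Unicode, then use unicode strings everywhere: both the pattern and the text you search must be unicode objects. In Python 2 a plain "..." literal is a byte string, so the Cyrillic letters reach the regex engine as raw UTF-8 bytes, and \b cannot treat them as word characters even with re.UNICODE.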
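For anyone on Python 3, here is a rough sketch of the same idea. It assumes Python 3, where every str is already Unicode, so no u prefix or re.UNICODE flag is needed. The helper name sentences_containing is made up for this example and is not part of the original code. The last lines show what happens when the text is matched as bytes instead, which is roughly the situation the question ran into.

```python
import re


def sentences_containing(word, text):
    """Return every '.'-terminated sentence in text that contains word as a whole word.

    Hypothetical helper for illustration; the regex is the one from the answer,
    with the word escaped so it is always treated literally.
    """
    pattern = re.compile(r"([^.]*\b" + re.escape(word) + r"\b[^.]*)\.")
    return pattern.findall(text)


text = "турција е на врвот од индустријата. турција е на врвот од индустријата."

# str pattern + str input: \b is Unicode-aware, so both sentences are found.
result = sentences_containing("турција", text)
print(len(result))

# bytes pattern + bytes input: \w only matches ASCII word characters, so none
# of the UTF-8 encoded Cyrillic letters count as part of a word.
byte_matches = re.findall(rb"\w+", text.encode("utf-8"))
print(byte_matches)
```

Note that the second captured sentence keeps the space that follows the previous full stop, so you may want to call strip() on each result.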
