I am attempting to write a line of code that will take a line


Asked: May 30, 2026


I am attempting to write a line of code that will take a line of Japanese text and delete a certain set of characters. However, I am having trouble using Unicode characters inside the regular expression.

I am currently using text.gsub(/《.*?》/u, '') but I get the error:

'gsub': invalid byte sequence in Windows-31J (ArgumentError)

Can anyone tell me what I am doing incorrectly?

Example text : その仕草《しぐさ》があまりに無造作《むぞうさ》だったので

Expected result: その仕草があまりに無造作だったので

Thanks

edit: # encoding: utf-8 is present at the top of the script.


1 Answer

Editorial Team · Answer 1 · 2026-05-30T18:37:21+00:00


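The error message says the string is tagged as Windows-31J, which is typically Ruby's default external encoding on Japanese Windows, even though its bytes are UTF-8. The # encoding: utf-8 comment only sets the encoding of the script's own literals, not of text read at run time. Try declaring the source encoding explicitly: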

text.encode('utf-8', 'utf-8').gsub(/《.*?》/u, '')
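For comparison, here is a minimal sketch of the same idea using force_encoding, which re-tags the bytes as UTF-8 without transcoding them. It assumes the input bytes really are UTF-8 and were only labelled Windows-31J when they were read. The helper name strip_ruby_annotations and the variable mislabelled are hypothetical, introduced only for illustration.

```ruby
# Hypothetical helper: removes annotations written as 《...》.
def strip_ruby_annotations(text)
  # Re-tag the bytes as UTF-8 without changing them. Assumption: the bytes
  # are already UTF-8 and only the encoding label is wrong.
  utf8 = text.dup.force_encoding(Encoding::UTF_8)

  # Fail early if the assumption does not hold, instead of inside gsub.
  raise ArgumentError, 'input is not valid UTF-8' unless utf8.valid_encoding?

  # Non-greedy match, so each 《...》 pair is removed separately.
  utf8.gsub(/《.*?》/u, '')
end

# Simulate a string whose UTF-8 bytes carry the wrong encoding label.
mislabelled = 'その仕草《しぐさ》があまりに無造作《むぞうさ》だったので'
              .dup.force_encoding('Windows-31J')

puts strip_ruby_annotations(mislabelled)
# prints: その仕草があまりに無造作だったので
```

If the text comes from a file, passing the encoding when reading it, for example File.read(path, encoding: 'UTF-8') with path as a placeholder, may avoid the wrong label in the first place.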
