I have an encoding question and would like to ask for help. I notice

Asked: May 23, 2026

I have an encoding question and would like to ask for help. I notice if I choose “UTF-8” as encoding, there are (at least) two double quotes " and “. But when I choose “ISO-8859-1” as the encoding, I see the latter double quote becomes ¡°, or sometimes for example â€œ.

Could anyone please explain why this is the case? How can I match “ and replace it with " using a regexp in Perl?

Thanks a lot.

1 Answer

Editorial Team · Answer 1 · 2026-05-23T03:40:33+00:00

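ISO-8859-1 is a one-byte-per-character encoding, and the fancy Unicode double quotes are not in its character set. In UTF-8, “ (U+201C) is stored as the three bytes E2 80 9C. What you are seeing is that multi-byte sequence displayed one byte at a time as single-byte characters: â€œ is those three bytes as a Windows-1252 viewer shows them (much software that says "ISO-8859-1" actually uses Windows-1252 for bytes 80 to 9F). The ¡° variant is most likely the same quote saved in a two-byte encoding such as GBK (bytes A1 B0) and then read as ISO-8859-1.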
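To make the byte-level picture concrete, here is a minimal sketch using the core Encode module. It encodes the left curly quote as UTF-8 and then deliberately misreads the bytes as Windows-1252 (`cp1252`), which is an assumption about what the viewer is really doing when it shows â€œ. The variable names are just for illustration.

```perl
use strict;
use warnings;
use Encode qw(encode decode);

# U+201C LEFT DOUBLE QUOTATION MARK as a decoded character string.
my $quote = "\x{201C}";

# Encode to UTF-8: the single character becomes three bytes.
my $utf8_bytes = encode('UTF-8', $quote);
my $hex = join ' ', map { sprintf '%02X', ord } split //, $utf8_bytes;

# Misread those bytes as Windows-1252: each byte becomes its own character.
my $mojibake = decode('cp1252', $utf8_bytes);
my @code_points = map { sprintf 'U+%04X', ord } split //, $mojibake;

print "UTF-8 bytes: $hex\n";                # UTF-8 bytes: E2 80 9C
print "Read as cp1252: @code_points\n";     # Read as cp1252: U+00E2 U+20AC U+0153
```

U+00E2, U+20AC and U+0153 are â, € and œ, which is exactly the â€œ sequence from the question.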

To match these weird things, see the perlunicode man page, especially the \x{…} and \N{…} escape sequences.

To answer your question, try \x{201C} to match the Unicode LEFT DOUBLE QUOTATION MARK and \x{201D} to match the RIGHT DOUBLE QUOTATION MARK. You missed the latter in your question :-).
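A rough sketch of the substitution follows. It assumes the input arrives as raw UTF-8 bytes and decodes them first, because `\x{201C}` matches a character and will not match the undecoded three-byte sequence. The sub name `straighten_quotes` and the sample string are made up for the example.

```perl
use strict;
use warnings;
use Encode qw(decode);

# Replace both curly double quotes with the plain ASCII QUOTATION MARK.
# Expects a decoded character string, not raw bytes.
sub straighten_quotes {
    my ($text) = @_;
    $text =~ s/[\x{201C}\x{201D}]/"/g;
    return $text;
}

# Raw UTF-8 bytes, as read from a file opened without an encoding layer.
my $bytes = "She said \xE2\x80\x9Chello\xE2\x80\x9D.";

# Decode bytes into characters before applying the regexp.
my $text = decode('UTF-8', $bytes);

print straighten_quotes($text), "\n";   # prints: She said "hello".
```

If the file is opened with an encoding layer instead, for example `open my $fh, '<:encoding(UTF-8)', $path`, the lines read from it are already decoded and the explicit `decode` call is not needed.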

[update]

I should have provided my reference… Some nice gentleman in the UK has a page on ASCII and Unicode quotation marks. The plain vanilla ASCII/ISO-8859-1 double-quote is just called QUOTATION MARK.
