I have a website that tells the output is UTF-8, but I never make

Question

Asked: May 11, 2026

I have a website that declares its output as UTF-8, but I never verify that it actually is. Should I use a regular expression or the Iconv library to convert UTF-8 to UTF-8 (leaving invalid sequences)? Is it a security issue if I do not do this?

1 Answer

Editorial Team · Answer 1 · 2026-05-11T23:42:43+00:00

First of all, I would never blindly encode the content as UTF-8 a second time, because content that is already UTF-8 would come out double-encoded, which leads to the invalid characters you mention. I would first try to detect whether the content's charset is something other than UTF-8 before attempting a conversion.
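To illustrate the point about checking before converting, here is a minimal Python sketch. The helper name `is_valid_utf8` is hypothetical and not from the question; it only tests whether a byte string decodes as strict UTF-8, which is a validity check rather than true charset detection. The last lines show what a blind second encoding does to text that was already UTF-8, assuming the bytes are mistaken for Latin-1.

```python
def is_valid_utf8(data: bytes) -> bool:
    """Return True if the bytes decode cleanly as strict UTF-8."""
    try:
        data.decode("utf-8", errors="strict")
    except UnicodeDecodeError:
        return False
    return True


# Already-valid UTF-8 for the character "é" (two bytes: C3 A9).
already_utf8 = "é".encode("utf-8")

# Blindly treating those bytes as Latin-1 and encoding again
# turns one character into two ("Ã©" when displayed).
double_encoded = already_utf8.decode("latin-1").encode("utf-8")

print(is_valid_utf8(already_utf8))   # valid UTF-8
print(is_valid_utf8(b"caf\xe9"))     # Latin-1 bytes, not valid UTF-8
print(double_encoded)                # four bytes instead of two
```

Note that the double-encoded result is itself valid UTF-8, so a validity check cannot undo the damage afterwards; it has to run before any conversion.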

Secondly, if the content comes from a source you control and whose charset you control, such as a UTF-8 file or a database that uses UTF-8 in its tables and on the connection, I would trust that source unless something hints that I can't and that something funky is going on. If the content comes from more or less random places outside your control, that is all the more reason to inspect it and, where you can detect the original charset, re-encode or transform it to UTF-8. So the bottom line is: it depends.
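For content from outside your control, one possible approach is sketched below in Python. The function name `to_utf8` and the fallback list are assumptions made for illustration; guessing `cp1252` is a heuristic, not reliable detection, and the right fallbacks depend on where your input comes from. Valid UTF-8 passes through untouched, so nothing is ever encoded twice.

```python
def to_utf8(data: bytes, fallbacks=("cp1252",)) -> bytes:
    """Return data as UTF-8 bytes without double-encoding valid input."""
    # 1. Input that is already valid UTF-8 is returned unchanged.
    try:
        data.decode("utf-8")
        return data
    except UnicodeDecodeError:
        pass

    # 2. Try each assumed legacy charset in order (a guess, not detection).
    for encoding in fallbacks:
        try:
            return data.decode(encoding).encode("utf-8")
        except (UnicodeDecodeError, LookupError):
            continue

    # 3. Last resort: replace undecodable bytes with U+FFFD so the
    #    output is at least well-formed UTF-8.
    return data.decode("utf-8", errors="replace").encode("utf-8")


print(to_utf8("café".encode("utf-8")))  # unchanged
print(to_utf8(b"caf\xe9"))              # transcoded from cp1252
```

For a trusted source such as your own UTF-8 database connection, this step can reasonably be skipped, in line with the "it depends" conclusion above.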

As to whether this is a security issue, I wouldn't think so (at least I can't think of any scenario where it could be exploited), but I'll leave it to others to give a definitive answer on that.
