I’ve got lots of text that I need to output, which includes all sorts of characters from many languages. Sometimes I need to output the text in character encodings other than Unicode (e.g., Shift-JIS or ISO-8859-2) in order to match the page it’s going to.
If the text has characters that the encoding can’t handle (e.g., Japanese characters in ISO-8859-2 encoded output), I end up with odd characters in the output. I can escape them, but I’d rather do that only if it’s really necessary.
So, my question is this: Is there a way I can tell ahead of time if an encoding can handle all the characters in my string?
EDIT: I think the EncoderFallback is probably the right answer to the question I asked. Unfortunately it doesn’t seem to work in my particular situation. My thought was to convert the characters to their HTML entity equivalents (e.g., &#12514; instead of モ). However, the encoder only converts the first such character it finds, and if I set the Response.ContentEncoding it never calls my EncoderFallback at all.
You can write your own EncoderFallback class and assign it to the encoder before encoding.
With this approach you don’t need to do anything in advance (which would most likely mean scanning the output string for problem characters).
Instead your Fallback class need only handle replacements where the encoding does not have a value for a character.
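A minimal sketch of such a fallback, in case it helps. The class names `HtmlEntityFallback` and `HtmlEntityFallbackBuffer` are made up for illustration, and the sketch assumes numeric character references (`&#NNNN;`) are an acceptable replacement. The buffer has to report `Remaining` correctly and start from a fresh replacement string on every `Fallback` call; getting either wrong is one plausible cause of only the first character being converted, though I can't confirm that is what happened in your case.

```csharp
using System;
using System.Globalization;
using System.Text;

// Hypothetical fallback: replaces each unencodable character with "&#NNNN;".
public sealed class HtmlEntityFallback : EncoderFallback
{
    // Longest possible replacement is "&#1114111;" (10 characters).
    public override int MaxCharCount
    {
        get { return 10; }
    }

    public override EncoderFallbackBuffer CreateFallbackBuffer()
    {
        return new HtmlEntityFallbackBuffer();
    }
}

public sealed class HtmlEntityFallbackBuffer : EncoderFallbackBuffer
{
    private string replacement = string.Empty;
    private int position;

    // Number of replacement characters the encoder has not yet consumed.
    public override int Remaining
    {
        get { return replacement.Length - position; }
    }

    // Called for a single character the encoding cannot represent.
    public override bool Fallback(char charUnknown, int index)
    {
        return Start((int)charUnknown);
    }

    // Called for a surrogate pair the encoding cannot represent.
    public override bool Fallback(char charUnknownHigh, char charUnknownLow, int index)
    {
        return Start(char.ConvertToUtf32(charUnknownHigh, charUnknownLow));
    }

    private bool Start(int codePoint)
    {
        replacement = "&#" + codePoint.ToString(CultureInfo.InvariantCulture) + ";";
        position = 0;
        return true;
    }

    public override char GetNextChar()
    {
        // '\0' tells the encoder the replacement is exhausted.
        return position < replacement.Length ? replacement[position++] : '\0';
    }

    public override bool MovePrevious()
    {
        if (position > 0)
        {
            position--;
            return true;
        }
        return false;
    }

    public override void Reset()
    {
        replacement = string.Empty;
        position = 0;
    }
}

public static class Program
{
    public static void Main()
    {
        // ISO-8859-1 is used here only because it is available without extra setup;
        // on .NET Core / .NET 5+ Shift-JIS and ISO-8859-2 need
        // CodePagesEncodingProvider registered first.
        Encoding encoding = Encoding.GetEncoding(
            "iso-8859-1",
            new HtmlEntityFallback(),
            DecoderFallback.ReplacementFallback);

        byte[] bytes = encoding.GetBytes("モ and é");

        // Decode again just to inspect what was written.
        Console.WriteLine(encoding.GetString(bytes));
    }
}
```

If all you need is a yes/no answer up front, the built-in `EncoderFallback.ExceptionFallback` can be passed to `Encoding.GetEncoding` the same way; encoding then throws an `EncoderFallbackException` on the first character the encoding cannot represent.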