This is a newbie question.
I read the following question to learn how to download a web page whose content is encoded in UTF-8. There, the page is converted into a byte array, whereas I’m using a String to read the page’s content.
I need to turn UTF-8 into Latin1/ANSI since that’s what RichText and MessageBox seem to use (I’m getting funny characters).
Is there a more direct way to download a UTF-8 page and convert it into ANSI/Latin1?
Thank you.
Edit: When calling MessageBox, accented characters are not shown as expected:
Content = CStr(e.Result)
'Théâtre, Métro
MessageBox.Show(Content)
String in .NET uses Unicode throughout, so you should not have to convert it to anything. The important thing is that when you download the page, you specify that the data comes from a UTF-8 source. MSDN has a sample on loading UTF-8 encoded data into a string:
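The MSDN sample itself is not reproduced here. A minimal sketch of the same idea, assuming the page is fetched as raw bytes with WebClient.DownloadData and then decoded explicitly, might look like the following. The URL and the module name are placeholders, not taken from the question.

```vb
Imports System
Imports System.Net
Imports System.Text

Module Utf8BytesExample
    Sub Main()
        Using client As New WebClient()
            ' Placeholder URL (an assumption for illustration only).
            Dim raw As Byte() = client.DownloadData("http://www.example.com/")

            ' Decode the raw bytes as UTF-8 into a .NET (Unicode) String.
            Dim content As String = Encoding.UTF8.GetString(raw)

            Console.WriteLine(content)
        End Using
    End Sub
End Module
```

Once the bytes are decoded with the correct encoding, the resulting String can be passed to MessageBox.Show or a RichTextBox directly, with no conversion to Latin1.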
Update
When using WebClient.DownloadString, the conversion to a string takes place automatically and code similar to the one above is not needed. The automatic conversion uses the encoding specified by WebClient.Encoding, so the problem should be solved by setting the WebClient object’s Encoding property to UTF-8:
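As a rough sketch of that suggestion (the URL and module name are placeholders, and the synchronous DownloadString call is used here for brevity, although the question appears to use the asynchronous variant with e.Result):

```vb
Imports System
Imports System.Net
Imports System.Text

Module Utf8DownloadStringExample
    Sub Main()
        Using client As New WebClient()
            ' Tell WebClient to decode the downloaded bytes as UTF-8.
            ' This must be set before the download is started.
            client.Encoding = Encoding.UTF8

            ' Placeholder URL (an assumption for illustration only).
            Dim content As String = client.DownloadString("http://www.example.com/")

            Console.WriteLine(content)
        End Using
    End Sub
End Module
```

The same Encoding property should also apply when calling DownloadStringAsync, as long as it is set before the download begins.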