- What character encoding is used by StreamReader.ReadToEnd()?
- What would be the reason to use (b) instead of (a) below?
- Is there a risk of there being a character encoding problem if (a) is used instead of (b)?
- Is there another method that is better than (a) and (b)?
(a)
Dim strWebResponse As String
Dim Request As HttpWebRequest = WebRequest.Create(Url)
Using Response As WebResponse = Request.GetResponse()
Using reader As StreamReader = New StreamReader(Response.GetResponseStream())
strWebResponse = reader.ReadToEnd()
End Using
End Using
(b)
Dim encoding As New UTF8Encoding()
Dim strWebResponse As String
Dim Request As HttpWebRequest = WebRequest.Create(Url)
Using Response As WebResponse = Request.GetResponse()
Dim responseBuffer(Response.ContentLength - 1) As Byte
Response.GetResponseStream().Read(responseBuffer, 0, Response.ContentLength - 1)
strWebResponse = encoding.GetString(responseBuffer)
End Using
The default encoding used by StreamReader is Encoding.UTF8. It is not Encoding.Default, which varies from machine to machine depending on your version of Windows and the locale that you have set.
I have trouble remembering what the defaults are, so I prefer to use the StreamReader constructor that lets me specify the encoding, such as New StreamReader(stream, Encoding.UTF8). See the constructor documentation for more info.
If you use that constructor in your example a, the results will be the same as for your example b.
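As a rough sketch, example (a) with the encoding made explicit might look like the following. The module and function names (ExplicitEncodingExample, ReadBody) are made up for illustration, and Url is whatever address you are requesting.

```vb
Imports System.IO
Imports System.Net
Imports System.Text

Module ExplicitEncodingExample
    ' Hypothetical helper: downloads Url and decodes the body as UTF-8.
    Function ReadBody(ByVal Url As String) As String
        Dim Request As HttpWebRequest = DirectCast(WebRequest.Create(Url), HttpWebRequest)
        Using Response As WebResponse = Request.GetResponse()
            ' Pass Encoding.UTF8 explicitly instead of relying on the default.
            Using reader As New StreamReader(Response.GetResponseStream(), Encoding.UTF8)
                Return reader.ReadToEnd()
            End Using
        End Using
    End Function
End Module
```

Unlike (b), this does not depend on Response.ContentLength, so it should also work when the server does not send a Content-Length header.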
Should you use UTF-8? That depends on the page you're downloading. If the page you're downloading was encoded with UTF-8 then, yes, you should use UTF-8. UTF-8 is often assumed to be the default if no character set is defined in the HTTP headers, although the older HTTP/1.1 specification (RFC 2616) named ISO-8859-1 as the default for text types. Either way, you need to check the Content-Type header to determine if the page uses some other encoding, because that header can carry a charset parameter naming the encoding. You would have to examine the ContentType property of the HttpWebResponse, check to see if there is a charset field, and set the encoding properly based on that.
Or, just use UTF-8 and hope for the best.
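If you do want to honor the charset, one possible approach is sketched below. It assumes a header value along the lines of text/html; charset=ISO-8859-1 (an illustrative value, not one taken from any particular server), and it parses the value with the System.Net.Mime.ContentType class. The module and function names (CharsetExample, EncodingFromContentType) are hypothetical.

```vb
Imports System.Net.Mime
Imports System.Text

Module CharsetExample
    ' Hypothetical helper: picks an Encoding from a Content-Type header value,
    ' falling back to UTF-8 when no usable charset is present.
    Function EncodingFromContentType(ByVal headerValue As String) As Encoding
        If String.IsNullOrEmpty(headerValue) Then Return Encoding.UTF8
        Try
            Dim parsed As New ContentType(headerValue)
            If String.IsNullOrEmpty(parsed.CharSet) Then Return Encoding.UTF8
            Return Encoding.GetEncoding(parsed.CharSet)
        Catch ex As FormatException
            ' The header value could not be parsed.
            Return Encoding.UTF8
        Catch ex As ArgumentException
            ' The charset name is not a recognized encoding.
            Return Encoding.UTF8
        End Try
    End Function
End Module
```

In example (a), the result could then be passed to the reader: New StreamReader(Response.GetResponseStream(), EncodingFromContentType(Response.ContentType)). This only looks at the HTTP header; a page that declares its encoding solely in an HTML meta tag would still fall through to the UTF-8 fallback.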