I have a web app that creates some HTML as a String and uploads the String as a file to Amazon S3.
When I check the String in the debugger right before the upload, the HTML looks fine, but when I open the file in the bucket some characters have changed. It only happens to certain characters. For example:
It’s GO time on Android™!
becomes
It’s GO time on Android™!
This is the exact code I use:
    using (var client = AWSClientFactory.CreateAmazonS3Client(accessKey, secretKey, s3Config))
    {
        var request = new PutObjectRequest()
            .WithBucketName(bucketName)
            .WithKey(fileKey)
            .WithMetaData("title", title);
        request.ContentBody = body;
        S3Response response = client.PutObject(request);
        response.Dispose();
    }
I originally used .WithContentBody(body) and switched to request.ContentBody = body; to see if that would magically work, but it made no difference.
The body variable is the HTML String. When I view it in the debugger the characters look as they should, and the Visual Studio HTML viewer also shows the body value correctly.
Any idea what I could be doing wrong here? Am I missing some setting? I can’t seem to find anybody else with this problem in my web searches.
Finally figured this one out. I didn’t spot it sooner because the HTML string I had in the debugger rendered fine in browsers, but I guess it should have been obvious.
The special (non-ASCII) characters have to be encoded before the HTML is uploaded. I couldn’t run the standard HttpUtility encoding over the entire HTML string, because that would encode all the HTML tags as well.
I used this method:
which I saw here.
Seems to be doing the trick so far.
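The method itself is not reproduced above, so here is only a minimal sketch of the general idea, not necessarily the code I actually used. It assumes the goal is to replace every non-ASCII character with a numeric HTML entity while leaving tags and other ASCII text untouched. The names HtmlEntityEncoder and EncodeNonAscii are made up for this illustration.

```csharp
using System;
using System.Text;

// Hypothetical helper for illustration; not part of the AWS SDK or System.Web.
public static class HtmlEntityEncoder
{
    // Replaces each non-ASCII character with a numeric entity such as &#8217;.
    // ASCII characters, including '<', '>' and '&', are copied unchanged,
    // so existing HTML tags stay intact.
    public static string EncodeNonAscii(string html)
    {
        if (html == null) throw new ArgumentNullException("html");

        var sb = new StringBuilder(html.Length);
        for (int i = 0; i < html.Length; i++)
        {
            char c = html[i];
            if (c <= 127)
            {
                sb.Append(c);
            }
            else if (char.IsHighSurrogate(c) && i + 1 < html.Length
                     && char.IsLowSurrogate(html[i + 1]))
            {
                // Characters outside the BMP are stored as a surrogate pair;
                // emit one entity for the combined code point.
                sb.Append("&#").Append(char.ConvertToUtf32(c, html[i + 1])).Append(';');
                i++;
            }
            else
            {
                sb.Append("&#").Append((int)c).Append(';');
            }
        }
        return sb.ToString();
    }
}

public static class Program
{
    public static void Main()
    {
        // \u2019 is the curly apostrophe, \u2122 is the trademark sign.
        string body = "<p>It\u2019s GO time on Android\u2122!</p>";
        Console.WriteLine(HtmlEntityEncoder.EncodeNonAscii(body));
        // prints: <p>It&#8217;s GO time on Android&#8482;!</p>
    }
}
```

With something like this in place, the encoded string would be assigned to request.ContentBody instead of the raw body. Because the uploaded content is then pure ASCII, it should display the same way regardless of which character set the browser assumes.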