I am using PostgreSQL to power a C# desktop application. When I use the pgAdmin query tool to update a text column with a special character (such as the copyright or trademark symbol), it works perfectly:
update table1 set column1='value with special character ©' where column2=1
When I use this same query from my C# application, it throws an error:
invalid byte sequence for encoding
After researching this issue, I understand that .NET strings use the UTF-16 Unicode encoding.
Consider:
string sourcetext = "value with special character ©";
// Convert a string to utf-8 bytes.
byte[] utf8Bytes = System.Text.Encoding.UTF8.GetBytes(sourcetext);
// Convert utf-8 bytes to a string.
string desttext = System.Text.Encoding.UTF8.GetString(utf8Bytes);
The problem here is that both sourcetext and desttext are ordinary .NET strings, so both are stored as UTF-16 in memory; only the intermediate byte array is UTF-8. When I pass desttext, I still get the exception.
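To make the round trip concrete, here is a minimal sketch (the class and method names are my own, purely for illustration). It shows that decoding the UTF-8 bytes simply gives back an equal string, and that the UTF-8 form only exists in the byte array, where © takes the two bytes C2 A9:

```csharp
using System;
using System.Text;

public static class Utf8RoundTrip
{
    // Encode a .NET string (UTF-16 in memory) into UTF-8 bytes.
    public static byte[] ToUtf8(string text)
    {
        return Encoding.UTF8.GetBytes(text);
    }

    // Decode UTF-8 bytes back into an ordinary .NET string.
    public static string FromUtf8(byte[] bytes)
    {
        return Encoding.UTF8.GetString(bytes);
    }

    public static void Main()
    {
        string sourceText = "value with special character \u00A9"; // \u00A9 is the copyright sign

        byte[] utf8Bytes = ToUtf8(sourceText);
        string destText = FromUtf8(utf8Bytes);

        // The round trip changes nothing about the string itself.
        Console.WriteLine(sourceText == destText);   // True

        // 30 UTF-16 code units, but 31 UTF-8 bytes, because the sign needs two bytes.
        Console.WriteLine(sourceText.Length);        // 30
        Console.WriteLine(utf8Bytes.Length);         // 31

        // The last two bytes are the UTF-8 encoding of the copyright sign.
        Console.WriteLine(BitConverter.ToString(utf8Bytes, utf8Bytes.Length - 2)); // C2-A9
    }
}
```

So converting back and forth in application code cannot change what the driver receives; whatever encoding goes over the wire is decided when the string is serialized for the connection.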
I’ve also tried the following without success:
Encoder.GetString, BitConverter.GetString
Edit: I even tried this, and it doesn’t help either:
unsafe
{
    String utfeightstring = null;
    string sourcetext = "value with special character ©";
    Console.WriteLine(sourcetext);
    // Convert a string to utf-8 bytes.
    sbyte[] utf8Chars = (sbyte[]) (Array) System.Text.Encoding.UTF8.GetBytes(sourcetext);
    UTF8Encoding encoding = new UTF8Encoding(true, true);
    // Pin the array so the garbage collector does not move it while we hold a pointer to it
    fixed (sbyte* pUtf8Chars = utf8Chars)
    {
        // This decodes the UTF-8 bytes, so the result is again a regular UTF-16 string
        utfeightstring = new String(pUtf8Chars, 0, utf8Chars.Length, encoding);
    }
    Console.WriteLine("The UTF8 String is " + utfeightstring);
}
Is there a data type in .NET that can store a UTF-8 encoded string? Are there alternative ways to handle this situation?
According to the Mono project's PostgreSQL page, if you get errors with UTF-8 strings you can set the encoding to Unicode in the connection string (if you are using the Npgsql driver).
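As a rough sketch of what that might look like: the Encoding=UNICODE keyword below is my assumption of the setting the Mono page refers to for older Npgsql versions, and the server, database, user and password values are placeholders. Newer Npgsql releases may ignore or reject this keyword, so check it against the version you actually use.

```csharp
using System;

public static class ConnectionStringSketch
{
    // Placeholder credentials; "Encoding=UNICODE" is the assumed legacy Npgsql keyword.
    public const string ConnectionString =
        "Server=localhost;Port=5432;Database=mydb;User Id=myuser;Password=secret;Encoding=UNICODE;";

    public static void Main()
    {
        // With Npgsql referenced, this string would be handed to the
        // NpgsqlConnection constructor; here it is only printed.
        Console.WriteLine(ConnectionString);
    }
}
```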
I have looked through the official Npgsql docs, and this setting isn’t mentioned there.
NpgsqlConnection.ConnectionString