I am writing a piece of code for an SSIS 2005 script component to read data from an Informix database (where the database strings are stored as UTF-8).
The output of this string needs to be loaded to a text stream (DT_TEXT), encoded using code page 1252 (ANSI – Latin I).
Here’s a simple example of what I am trying to accomplish (AllColumnsBuffer is the script component output buffer, ColumnText is the name of the DT_TEXT field I am loading).
Dim s As String = "Testing,1,2,3" & System.Text.RegularExpressions.Regex.Unescape("\u4EB5")
AllColumnsBuffer.AddRow()
AllColumnsBuffer.ColumnText.AddBlobData(encoding.GetEncoding(1252).GetBytes(s))
I need to throw an error if the encoding finds characters that cannot be converted to 1252. Right now it seems to just substitute a ? for any source character that doesn’t exist in the target code page. Is there any way to validate that the character exists in the target code page?
You could create a new encoding and then set the EncoderFallback property, either to your own fallback or, if an exception is good enough for you, to EncoderExceptionFallback. Your own fallback might (say) fail gracefully without an exception, but set a flag to tell you afterwards that it failed.
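As a rough sketch of the exception route (not tested inside SSIS itself): the encoding returned by GetEncoding is read-only, so this clones it before setting EncoderFallback. The module name Cp1252Check and the helper ToCp1252Strict are made up for illustration, and it assumes the .NET Framework, where code page 1252 is available without registering an extra provider.

```vbnet
Imports System
Imports System.Text

Module Cp1252Check
    ' Hypothetical helper: returns the code page 1252 bytes for s, or throws
    ' EncoderFallbackException if any character has no 1252 equivalent.
    Function ToCp1252Strict(ByVal s As String) As Byte()
        ' Encodings handed out by GetEncoding are read-only, so work on a clone.
        Dim strict As Encoding = DirectCast(Encoding.GetEncoding(1252).Clone(), Encoding)
        strict.EncoderFallback = New EncoderExceptionFallback()
        Return strict.GetBytes(s)
    End Function

    Sub Main()
        ' Same test string as in the question: U+4EB5 is not in code page 1252.
        Dim s As String = "Testing,1,2,3" & ChrW(&H4EB5)
        Try
            Dim bytes As Byte() = ToCp1252Strict(s)
            Console.WriteLine("Encoded " & bytes.Length.ToString() & " bytes")
        Catch ex As EncoderFallbackException
            ' Index is the position of the offending character in the input.
            Console.WriteLine("Cannot encode character at index " & ex.Index.ToString())
        End Try
    End Sub
End Module
```

In the script component you would presumably pass the result of ToCp1252Strict(s) to AddBlobData in place of encoding.GetEncoding(1252).GetBytes(s), and let the exception fail the row or the component as appropriate.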