I have a DBF file that is encoded as Windows-ANSI (Windows code page 1252). I am using an ODBC driver to import this file as a table into a SQL Server database. When I do, some characters are not preserved.
First, to verify that the DBF file was encoded as expected, I opened it in a hex editor and searched for the character in question. It’s a “small-bullet” on code page 1252, and it was stored as the byte 0x95 in the file, so for at least that character the encoding seems to be as expected.
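The same check can be scripted instead of done by hand in a hex editor. This is only a minimal sketch: the helper name `find_byte_offsets` is made up for illustration, and it simply scans the raw file for the byte 0x95 without parsing the DBF structure.

```python
from pathlib import Path


def find_byte_offsets(path, byte_value=0x95):
    """Return every offset in the file at which byte_value occurs.

    Hypothetical helper: it scans the raw bytes and knows nothing
    about DBF headers or field boundaries.
    """
    data = Path(path).read_bytes()
    return [offset for offset, value in enumerate(data) if value == byte_value]


# In Windows code page 1252 the byte 0x95 decodes to the bullet, U+2022.
BULLET = bytes([0x95]).decode("cp1252")
```

Calling `find_byte_offsets("table.dbf")` (the file name is a placeholder) lists the positions to inspect, and `BULLET` confirms what the byte should decode to under code page 1252.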
I did a search and found someone saying that importing to an nvarchar column, as opposed to a varchar column, would make a difference, so when I did the import I re-mapped the column containing the problematic character to nvarchar.
The database it is imported into has a collation of “SQL_Latin1_General_CP1_CI_AS”, and according to a page I read on MSDN the “CP1” indicates that this should be equivalent to Windows code page 1252.
When I do the import, the character is imported as either 0xf2 or 0x5625. So far I haven’t found out why different imports produce different values.
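One way to narrow this down is to decode the original byte under different code pages and compare the results with the imported values. The sketch below is a diagnostic aid, not a confirmed explanation: it assumes the driver might be treating the data as an OEM (DOS) code page such as 437, and that 0x5625 is the raw UTF-16LE byte dump of an nvarchar value. Neither assumption is verified here, and `utf16le_hex` is a helper invented for illustration.

```python
RAW = bytes([0x95])  # the byte found in the DBF file

# Decoded as Windows code page 1252, the byte is the bullet (U+2022).
as_ansi = RAW.decode("cp1252")

# Decoded as OEM code page 437 (an assumption about the driver),
# the same byte is U+00F2, which is 0xF2 when stored as code page 1252.
as_oem = RAW.decode("cp437")


def utf16le_hex(char):
    """Hex of the UTF-16LE bytes, the byte order SQL Server shows for nvarchar."""
    return char.encode("utf-16-le").hex()


# A correctly imported bullet would appear as 0x2220 in an nvarchar column.
expected_nvarchar = utf16le_hex(as_ansi)

# 0x5625 read as UTF-16LE is U+2556, a box-drawing character that
# code page 437 stores at byte 0xB7.
box_char = bytes.fromhex("5625").decode("utf-16-le")
```

If the 0xf2 value comes from an ANSI/OEM mix-up like this, that would point at the driver's code page handling rather than at the column type or the database collation.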
Has anyone come across an issue such as this? What did you do to resolve it? Anything I should be looking into or trying that I haven’t as of yet?
This appears to be an issue with an old driver. Upgrading to a newer DBF driver fixed the character issue but introduced another one. The new drivers lack any “ordinal” information in the column schema, so they can’t be used with the DTS Wizard, or at least I couldn’t find a way to do so.
Installing the Microsoft Visual FoxPro OLEDB drivers worked flawlessly. Once they’re installed, they show up as a data source in the DTS Wizard, and can be used to import directly. This fixed my character issue and I was able to do my import.