It looks like PostgreSQL's upper/lower functions do not handle certain characters in the Turkish character set.
select upper('Aaı'), lower('Aaİ') from mytable;
returns:
AAı, aaİ
instead of:
AAI, aai
Note that ordinary English characters are converted correctly, but not the Turkish dotless ı (to upper case) or the dotted İ (to lower case).
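To make the expected result concrete, here is a minimal sketch of the Turkish casing rule outside PostgreSQL. The function names `turkish_upper` and `turkish_lower` are hypothetical, and the sketch only covers the i/İ and ı/I pairs before deferring to Python's default casing; it illustrates the rule, not what PostgreSQL does internally.

```python
def turkish_upper(text: str) -> str:
    """Upper-case text using the Turkish i/I pairing (i -> İ, ı -> I)."""
    # Map the two Turkish-specific letters first, then let the default
    # Unicode casing handle everything else.
    return text.replace("i", "\u0130").replace("\u0131", "I").upper()


def turkish_lower(text: str) -> str:
    """Lower-case text using the Turkish i/I pairing (İ -> i, I -> ı)."""
    return text.replace("\u0130", "i").replace("I", "\u0131").lower()


# The two inputs from the query above.
print(turkish_upper("Aaı"), turkish_lower("Aaİ"))  # prints: AAI aai
```

The replacements run before the generic `upper()`/`lower()` call because the default Unicode mapping is not locale-aware: on its own it would turn `i` into `I` rather than `İ`.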
PostgreSQL version: 9.2, 32-bit
Database encoding (same result with any of these): UTF-8, WIN1254, C
Client encoding: UTF-8, WIN1254, C
OS: Windows 7 Enterprise Edition, 64-bit
The SQL functions lower and upper return the following bytes, unchanged from the input, for ı and İ on a UTF-8 encoded database:
\xc4b1
\xc4b0
And the following on a WIN1254 (Turkish) encoded database:
\xfd
\xdd
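As a quick cross-check, the bytes listed above are just the encodings of the input characters themselves, which is how you can tell the functions left them untouched. The sketch below assumes Python's `cp1254` codec as the equivalent of PostgreSQL's WIN1254; `encoded_hex` is a hypothetical helper name.

```python
def encoded_hex(ch: str, encoding: str) -> str:
    """Return the bytes of ch in the given encoding, formatted like \\xc4b1."""
    return "\\x" + ch.encode(encoding).hex()


DOTLESS_I = "\u0131"  # ı
DOTTED_I = "\u0130"   # İ

for ch in (DOTLESS_I, DOTTED_I):
    print(encoded_hex(ch, "utf-8"), encoded_hex(ch, "cp1254"))
# prints:
# \xc4b1 \xfd
# \xc4b0 \xdd
```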
I hope my investigation is wrong, and there is something I missed.
Your problem is 100% Windows. (Or rather Microsoft Visual Studio, which PostgreSQL was built with, to be more precise.)
For the record, SQL UPPER ends up calling Windows’ LCMapStringW (via towupper via str_toupper) with almost all the right parameters (locale 1055 Turkish for a UTF-8-encoded, Turkish_Turkey database), but the Visual Studio Runtime (towupper) does not set the LCMAP_LINGUISTIC_CASING bit in LCMapStringW’s dwMapFlags. (I can confirm that setting it does the trick.) This is not considered a bug at Microsoft; it is by design, and will probably not ever be “fixed” (oh the joys of legacy).
You have three ways out of this:
MSVCR100.DLL in your PostgreSQL bin directory (but although UPPER and LOWER would work, other things such as collation may continue to fail, again at the Windows level. YMMV.)
For completeness (and nostalgic fun) ONLY, here is the procedure to patch a Windows system (but remember, unless you’ll be managing this PostgreSQL instance from cradle to grave you may cause a lot of grief to your successor(s); whenever deploying a new test or backup system from scratch you or your successor(s) would have to remember to apply the patch again, and if let’s say you one day upgrade to PostgreSQL 10, which say uses MSVCR120.DLL instead of MSVCR100.DLL, then you’ll have to try your luck with patching the new DLL, too.)
On a test system
C:\WINDOWS\SYSTEM32\MSVCR100.DLL
bin directory (do not attempt to copy the file using Explorer or the command line, they might copy the 64bit version)
4E 14 33 DB 3B CB 0F 84 41 12 00 00 B8 00 01 00 00
4E 14 33 DB 3B CB 0F 84 41 12 00 00 B8 00 01 00 01
FC 51 6A 01 8D 4D 08 51 68 00 02 00 00 50 E8 E2
FC 51 6A 01 8D 4D 08 51 68 00 02 00 01 50 E8 E2
bin directory, then restart PostgreSQL and re-run your query.
Turkish_Turkey for both LC_CTYPE and LC_COLLATE) open postgres.exe in 32-bit Dependency Walker and make sure it indicates it loads MSVCR100.DLL from the PostgreSQL bin directory.
bin directory and restart.
BUT REMEMBER, the moment you move the data off the Ubuntu system or off the patched Windows system to an unpatched Windows system you will have the problem again, and you may be unable to import this data back on Ubuntu if the Windows instance introduced duplicates in a citext field or in a UPPER/LOWER-based function index.
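For anyone wondering what the hex sequences in the patch procedure actually change, here is a small sketch that decodes them. It assumes the Windows SDK values for the LCMapStringW flags, and it reads the single differing byte in each pair as the top byte of a little-endian 32-bit flags value; that reading is my interpretation, not something stated above. The helper names `changed_dword` and `apply_patches` are hypothetical, and the sketch works on bytes in memory only, so it never touches a real DLL. The byte patterns are specific to one build of MSVCR100.DLL and may not occur in another.

```python
import struct
from typing import List, Tuple

# LCMapStringW flag values as defined in the Windows SDK headers.
LCMAP_LOWERCASE = 0x00000100
LCMAP_UPPERCASE = 0x00000200
LCMAP_LINGUISTIC_CASING = 0x01000000

# (original, patched) byte sequences quoted in the procedure above.
PATCHES: List[Tuple[bytes, bytes]] = [
    (bytes.fromhex("4E 14 33 DB 3B CB 0F 84 41 12 00 00 B8 00 01 00 00"),
     bytes.fromhex("4E 14 33 DB 3B CB 0F 84 41 12 00 00 B8 00 01 00 01")),
    (bytes.fromhex("FC 51 6A 01 8D 4D 08 51 68 00 02 00 00 50 E8 E2"),
     bytes.fromhex("FC 51 6A 01 8D 4D 08 51 68 00 02 00 01 50 E8 E2")),
]


def changed_dword(original: bytes, patched: bytes) -> Tuple[int, int]:
    """Decode the little-endian 32-bit value that ends at the one changed byte."""
    if len(original) != len(patched):
        raise ValueError("sequences must have the same length")
    diffs = [i for i, (a, b) in enumerate(zip(original, patched)) if a != b]
    if len(diffs) != 1:
        raise ValueError("expected exactly one differing byte")
    end = diffs[0] + 1
    (before,) = struct.unpack("<I", original[end - 4:end])
    (after,) = struct.unpack("<I", patched[end - 4:end])
    return before, after


def apply_patches(image: bytes, patches: List[Tuple[bytes, bytes]]) -> bytes:
    """Return a patched copy of image; every pattern must occur exactly once."""
    for original, patched in patches:
        if image.count(original) != 1:
            raise ValueError("pattern not found exactly once; different DLL build?")
        image = image.replace(original, patched)
    return image


for original, patched in PATCHES:
    before, after = changed_dword(original, patched)
    print(f"{before:#010x} -> {after:#010x}")
# prints:
# 0x00000100 -> 0x01000100
# 0x00000200 -> 0x01000200
```

Under that reading, the first pair turns LCMAP_LOWERCASE into LCMAP_LOWERCASE | LCMAP_LINGUISTIC_CASING and the second does the same for LCMAP_UPPERCASE, which matches the explanation that the runtime simply never sets that bit. Requiring each pattern to match exactly once is a cheap guard against patching the wrong file.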