I’m trying to delete some files with Unicode characters in their names using a batch script (it’s a requirement). So I run cmd and execute:
> chcp 65001
This effectively sets the codepage to UTF-8, and it works:
D:\temp\1>dir
Volume in drive D has no label.
Volume Serial Number is 8C33-61BF
Directory of D:\temp\1
02.02.2010 09:31 <DIR> .
02.02.2010 09:31 <DIR> ..
02.02.2010 09:32 508 1.txt
02.02.2010 09:28 12 delete.bat
02.02.2010 09:20 95 delete.cmd
02.02.2010 09:13 <DIR> Rún
02.02.2010 09:13 <DIR> Гуцул Каліпсо
3 File(s) 615 bytes
4 Dir(s) 11 576 438 784 bytes free
D:\temp\1>rmdir Rún
D:\temp\1>dir
Volume in drive D has no label.
Volume Serial Number is 8C33-61BF
Directory of D:\temp\1
02.02.2010 09:56 <DIR> .
02.02.2010 09:56 <DIR> ..
02.02.2010 09:32 508 1.txt
02.02.2010 09:28 12 delete.bat
02.02.2010 09:20 95 delete.cmd
02.02.2010 09:13 <DIR> Гуцул Каліпсо
3 File(s) 615 bytes
3 Dir(s) 11 576 438 784 bytes free
Then I put the same rmdir commands in a batch script and save it in UTF-8 encoding. But when I run it, nothing happens, literally nothing: not even echo works from the batch script in this case. Even saving the script in OEM encoding does not help.
So it seems that when I change the codepage to UTF-8 in the console, scripts just stop working. Does anybody know how to fix that?
If you want Unicode support in a batch file, be aware that CHCP on a line by itself just aborts the batch file. What I suggest is putting CHCP on each batch file line that needs Unicode, as follows:
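A minimal sketch of that pattern, assuming the code page switch and the real command are chained with && on one line. The `type` command and the `%~1` argument are only placeholders for illustration, not part of the original script:

```bat
@echo off
rem Sketch only: switch to UTF-8 and run the real command on the SAME line.
rem ">nul" hides the "Active code page: 65001" message.
rem "type" and "%~1" are placeholders; substitute the command you need.
chcp 65001 >nul && type "%~1"
```

The && means the second command runs only if chcp succeeded, so a failed code page switch does not silently run the command in the wrong code page.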
Example: In my case I wanted a nice TAIL of my log files while debugging, but even Latin-1 characters in the content were being garbled. So here is my batch file, which wraps the real tail implementation from the Windows Resource Kit:
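Such a wrapper might look roughly like the sketch below. The `TAIL_EXE` variable and the install path are assumptions (adjust them to wherever the Resource Kit tools live on your machine), and the log file is taken from the first argument:

```bat
@echo off
rem Hypothetical wrapper around the Resource Kit tail.exe.
rem TAIL_EXE is an assumed install location, not a guaranteed one.
set "TAIL_EXE=C:\Program Files\Windows Resource Kits\Tools\tail.exe"
rem Switch the code page and start tail on the same line; -f follows the file.
chcp 65001 >nul && "%TAIL_EXE%" -f "%~1"
```

Saved as, say, utail.bat (a made-up name), it would be called as `utail.bat mylog.txt`.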
In addition, for output to a console, you need to set a TrueType font, e.g. Lucida Console.
And apparently for output to a file, the command line needs to run as Unicode, so you would kick off your batch script as follows:
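A sketch of such an invocation, assuming the /U switch of cmd is what is meant here (it makes internal commands write Unicode to pipes and files). `delete.bat` is the script name from the question; `output.txt` is a hypothetical target file:

```bat
rem /U: output of internal commands to a pipe or file is written as Unicode.
rem /C: run the given command, then exit.
rem output.txt is a made-up file name for illustration.
cmd /u /c delete.bat > output.txt
```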
Disclaimer: Tested on Windows XP SP3 with the Windows Resource Kit.