Question

Asked: June 6, 2026 (2026-06-06T10:37:06+00:00)

I have a string in Python 2.7.2, say u"\u0638".
When I write it to a file:

f = open("J:\\111.txt", "w+")
f.write(u"\u0638".encode('utf-16'))
f.close()

In hex, the file contents look like: FF FE 38 06
When I print the encoded string to stdout, I see: '\xff\xfe8\x06'.

The question: where is \x38 in the string output to stdout? In other words, why is the output not '\xff\xfe\x38\x06'?

If I write the string to file twice:

f = open("J:\\111.txt", "w+")
f.write(u"\u0638".encode('utf-16'))
f.write(u"\u0638".encode('utf-16'))
f.close()

The hex representation of the file contains the byte order mark (BOM) \xff\xfe twice: FF FE 38 06 FF FE 38 06

What is the technique for avoiding writing the BOM in UTF-16 encoded strings?

Editorial Team · Answer 1 · 2026-06-06T10:37:08+00:00

The ASCII character 8 has hex representation 0x38. So your string:

\xff\xfe8\x06

is four bytes long. Separated by spaces, the bytes are:

\xff \xfe 8 \x06
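A quick way to check this is to look at the individual byte values. This is only a minimal sketch: the names data, byte_values and same_bytes are illustrative, and the BOM is prepended explicitly so that the result does not depend on the machine's native byte order.

```python
import codecs

# Build the same four bytes as in the question:
# a little-endian BOM followed by U+0638 encoded as little-endian UTF-16.
data = codecs.BOM_UTF16_LE + u"\u0638".encode("utf-16-le")

# Integer value of each byte (bytearray works on both Python 2 and 3).
byte_values = list(bytearray(data))

# '8' and '\x38' are two spellings of the same byte, so these compare equal.
same_bytes = (data == b"\xff\xfe\x38\x06")

print(len(data))                      # 4
print([hex(b) for b in byte_values])  # ['0xff', '0xfe', '0x38', '0x6']
print(same_bytes)                     # True
```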

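When displaying a byte string, Python uses the \x notation only for bytes that do not represent printable ASCII characters. The byte 0x38 is printable, so it is shown as the character 8 rather than as \x38.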
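As for the second part of the question (the repeated BOM): the generic 'utf-16' codec prepends a BOM on every encode() call, whereas the byte-order-specific codecs 'utf-16-le' and 'utf-16-be' never emit one. The sketch below assumes little-endian output is wanted and uses a temporary directory in place of the J:\\111.txt path from the question; the file names and variable names are made up for illustration.

```python
import codecs
import io
import os
import tempfile

text = u"\u0638"
tmp_dir = tempfile.mkdtemp()  # stand-in for the directory used in the question

# Option 1: encode with an explicit byte order and write the BOM once by hand.
# The file is opened in binary mode so the bytes are written unchanged.
path = os.path.join(tmp_dir, "explicit.txt")
with open(path, "wb") as f:
    f.write(codecs.BOM_UTF16_LE)       # optional: omit for no BOM at all
    f.write(text.encode("utf-16-le"))  # no BOM added here
    f.write(text.encode("utf-16-le"))  # nor here
with open(path, "rb") as f:
    raw = f.read()

# Option 2: let a text-mode file object do the encoding. It writes a single
# BOM at the start of the file, in the machine's native byte order.
path2 = os.path.join(tmp_dir, "textmode.txt")
with io.open(path2, "w", encoding="utf-16") as f:
    f.write(text)
    f.write(text)
with open(path2, "rb") as f:
    raw2 = f.read()

print(repr(raw))
print(repr(raw2))
```

With option 1 the file holds FF FE 38 06 38 06. Option 2 also produces one BOM followed by two code units, but the byte order follows the platform.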
