At some point our python script receives string like that: In [1]: ab =

Question

Asked: June 12, 2026 (2026-06-12T20:55:55+00:00)

At some point our Python script receives a string like this:

In [1]: ab = 'asd\xeffe\ctve \\ \\\\ \\\\\\k\\\\\\'

In [2]: print ab
asd�fe\ctve \ \\ \\\k\\\

The data is damaged. We need the \x escape to be displayed literally as \x, but \c has no special meaning in a string and so must stay intact.

So far, the closest solution I have found is something like this:

In [1]: ab = 'asd\xeffe\ctve \\ \\\\ \\\\\\k\\\\\\'

In [2]: print ab.encode('string-escape').replace('\\\\', '\\').replace("\\'", "'")

asd\xeffe\ctve \ \\ \\\k\\\

The output is taken from IPython. I assumed that ab is a str, not a unicode string. To handle the latter case as well, we would have to do something like this:

def escape_string(s):
    if isinstance(s, str):
        s = s.encode('string-escape').replace('\\\\', '\\').replace("\\'", "'")
    elif isinstance(s, unicode):
        s = s.encode('unicode-escape').replace('\\\\', '\\').replace("\\'", "'")
    return s

1 Answer

Editorial Team · Answer 1 · 2026-06-12T20:55:56+00:00

'\\' is the same as '\x5c'. It is just two different ways to write the backslash character as a Python string literal.

These string literals all produce the same str value in memory: r'\c', '\\c', '\x5cc', '\x5c\x63'.

'\xef' is a single byte (239 as an integer), but r'\xef' (same as '\\xef') is a 4-byte string: '\x5c\x78\x65\x66'.

If s[0] returns '\xef', then that is what the s object actually contains. If that is wrong, fix the source of the data.
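
The equivalences above can be checked directly. The following is a minimal sketch that assumes Python 3, where bytes literals stand in for the Python 2 str literals used in this answer; the variable names are illustrative only.

```python
# Python 3 sketch (assumption: bytes literals replace the Python 2 str literals).

# Four spellings of the two-byte sequence: backslash followed by 'c'.
backslash_c = [rb'\c', b'\\c', b'\x5cc', b'\x5c\x63']
all_same = all(b == backslash_c[0] for b in backslash_c)

single = b'\xef'    # one byte whose integer value is 239
escaped = rb'\xef'  # four bytes: backslash, 'x', 'e', 'f'

print(all_same)                                       # True
print(len(single), single[0])                         # 1 239
print(len(escaped), escaped == b'\x5c\x78\x65\x66')   # 4 True
```

If indexing the received data yields the single byte 239, the backslash was never in the data in the first place; it only appears when the value is displayed.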

Note: string-escape also escapes \n and the like:

>>> print u'''\xef\c\\\N{SNOWMAN}"'\
... ☃\u2603\"\'\n\xa0'''.encode('unicode-escape')
\xef\\c\\\u2603"'\u2603\u2603"'\n\xa0
>>> print b'''\xef\c\\\N{SNOWMAN}"'\
... ☃\u2603\"\'\n\xa0'''.encode('string-escape')
\xef\\c\\\\N{SNOWMAN}"\'\xe2\x98\x83\\u2603"\'\n\xa0
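
For readers on Python 3, where the string-escape codec is not available, the same effect can be seen with unicode_escape. This is a small sketch under that assumption, not part of the original session: both the backslash and the newline end up escaped.

```python
# Python 3 sketch: unicode_escape rewrites backslashes and control
# characters as well as non-ASCII characters.
text = '\xef\\c\n'  # four characters: U+00EF, backslash, 'c', newline
escaped = text.encode('unicode_escape')

print(escaped.decode('ascii'))  # \xef\\c\n
```

The doubled backslash in the output is the reason the question needed the extra .replace('\\\\', '\\') step.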

backslashreplace is used only on characters that cause UnicodeEncodeError:

>>> print u'''\xef\c\\\N{SNOWMAN}"'\
... ☃\u2603\"\'\n\xa0'''
ï\c\☃"'☃☃"'

>>> print b'''\xef\c\\\N{SNOWMAN}"'\
... ☃\u2603\"\'\n\xa0'''
�\c\\N{SNOWMAN}"'☃\u2603"'
�
>>> print u'''\xef\c\\\N{SNOWMAN}"'\
... ☃\u2603\"\'\n\xa0'''.encode('ascii', 'backslashreplace')
\xef\c\\u2603"'\u2603\u2603"'
\xa0
>>> print b'''\xef\c\\\N{SNOWMAN}"'\
... ☃\u2603\"\'\n\xa0'''.decode('latin1').encode('ascii', 'backslashreplace')
\xef\c\\N{SNOWMAN}"'\xe2\x98\x83\u2603"'
\xa0
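
The last transcript suggests a way to display non-ASCII bytes as \x escapes while leaving existing backslashes untouched. Below is a minimal sketch of that idea, assuming Python 3 and bytes input; show_bytes is a hypothetical helper name, not something from the original post.

```python
def show_bytes(data: bytes) -> str:
    """Hypothetical helper: render non-ASCII bytes as \\xNN escapes.

    latin-1 maps every byte to the code point with the same value, so
    backslashreplace only fires for bytes >= 0x80. Backslashes and ASCII
    control characters such as newline pass through unchanged.
    """
    return data.decode('latin-1').encode('ascii', 'backslashreplace').decode('ascii')


# The byte 0xef followed by literal backslashes, as in the question.
sample = b'asd\xeffe\\ctve \\ \\\\'
print(show_bytes(sample))  # asd\xeffe\ctve \ \\
```

Note that the result is for display only: a real 0xef byte and the four literal characters \xef look the same afterwards, so the output cannot be reliably converted back.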
