Given a UTF-8 string like this:
mystring = "işğüı"
is it possible to get its (in-memory) size in bytes with Python 2.5?
Assuming you mean the number of UTF-8 bytes (and not the extra bytes that Python requires to store the object), len() gives it to you, the same as for any other string. A plain string literal in Python 2.x is a string of encoded bytes, not Unicode characters, so if the source file is saved as UTF-8, its length is already its UTF-8 byte count.
Byte strings:
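For a byte string, len() counts bytes. Here is a minimal sketch, assuming the string from the question is stored as UTF-8; the bytes are written as escapes so the result does not depend on how the source file is saved. The b prefix is accepted from Python 2.6 onward; in Python 2.5 the same literal is written without it. The name byte_count is only for illustration.

```python
# UTF-8 bytes of the question's string: i, s-cedilla, g-breve, u-umlaut, dotless i.
# "i" takes one byte; each of the other four characters takes two.
mystring = b"i\xc5\x9f\xc4\x9f\xc3\xbc\xc4\xb1"

byte_count = len(mystring)  # len() of a byte string is its size in bytes
print(byte_count)  # 9
```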
Unicode strings:
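For a Unicode string, len() counts characters (code units), not encoded bytes. Here is a small sketch using the same text as the question, written with \u escapes so it does not depend on the source file's encoding; char_count is an illustrative name.

```python
# The question's string as a Unicode object (the u prefix is valid in
# Python 2.x and again in Python 3.3+).
myunicode = u"i\u015f\u011f\u00fc\u0131"

char_count = len(myunicode)  # counts characters, not bytes
print(char_count)  # 5
```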
It’s good practice to maintain all of your strings in Unicode, and only encode when communicating with the outside world. In this case, you could use
len(myunicode.encode('utf-8')) to find the size it would be after encoding.
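Putting that together, here is a short sketch. The helper name encoded_size is hypothetical, and the UTF-16 call is included only to show that the byte count depends on the encoding you choose.

```python
def encoded_size(text, encoding="utf-8"):
    """Return how many bytes `text` occupies once encoded (hypothetical helper)."""
    return len(text.encode(encoding))

myunicode = u"i\u015f\u011f\u00fc\u0131"  # the question's string

utf8_size = encoded_size(myunicode)                # 1 + 2 + 2 + 2 + 2 bytes
utf16_size = encoded_size(myunicode, "utf-16-le")  # 2 bytes per character, no BOM
print(utf8_size)   # 9
print(utf16_size)  # 10
```

This measures the encoded data only; the Python object that holds the string carries some extra overhead on top of it.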