I’d like to dump a simple Unicode string into an array of bytes, so I can refer to each as an int. Is this possible?
I’m looking to take the string u"Hello World", encode it as UTF-8, and get something that looks like this:
[0x01, 0x02, ..., 0x02]
How can I do this efficiently?
Your question could mean two things: either encoding the Unicode string using, say, UTF-8 and getting a list of the resulting bytes, or getting a list of Unicode code points.
In the former case:
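A minimal sketch, assuming Python (the `u"..."` literal suggests it); the names `text` and `utf8_bytes` are only illustrative. Wrapping the encoded result in `bytearray` gives integers on both Python 2.6+ and Python 3:

```python
text = u"Hello World"

# Encode to UTF-8, then expose the result as a list of ints (0-255).
# bytearray iterates as ints on both Python 2.6+ and Python 3.
utf8_bytes = list(bytearray(text.encode("utf-8")))

# Hex view, similar to the format shown in the question.
print([hex(b) for b in utf8_bytes])
```

On Python 3 alone, `list(text.encode("utf-8"))` is enough, since iterating a `bytes` object already yields ints.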
In the latter case:
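A minimal sketch, again assuming Python, with illustrative variable names. `ord()` returns the code point of a single character, so no encoding step is involved:

```python
text = u"Hello World"

# ord() maps each character to its Unicode code point.
code_points = [ord(ch) for ch in text]

# For pure ASCII text, code points and UTF-8 bytes coincide.
# They differ for non-ASCII characters, e.g. U+00E9 (e with acute accent):
accented = u"\u00e9"
accented_code_points = [ord(ch) for ch in accented]              # one code point
accented_utf8 = list(bytearray(accented.encode("utf-8")))        # two bytes
```

Here `accented_code_points` is `[0xE9]` while `accented_utf8` is `[0xC3, 0xA9]`, which is why it matters which of the two lists you actually want.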