What is the fastest method for converting a binary data string to a numeric value in Python?
I am using struct.unpack_from(), but am hitting a performance limit.
Context: an incoming stream is mixed binary and ASCII data. The ASCII data conversion is done in C through ctypes. Implementing the unpacking in C through ctypes yielded similar performance to unpack. My guess is that the call overhead was too much of a factor. I was hoping to find a native C-like coercion method (however un-Pythonic). Most likely all of this code will need to move to C.
The stream is in network byte order (big-endian) and the machine is little-endian. An example conversion would be:
import struct
network_stream = struct.pack('>I', 0x12345678)
(converted_int,) = struct.unpack_from('>I', network_stream, 0)
I am less concerned about handling the stream format than about the general case of binary conversion, and whether there is any alternative to unpack at all. For example, socket.ntohl() requires an int, and int() won’t convert a binary data string.
Thanks for your suggestions!
The speed problem probably lies not in the implementation of struct.unpack_from() itself, but in everything else Python needs to do: look up names in dictionaries, create objects, call functions, and so on. You can speed things up ever so slightly by eliminating one of these dictionary lookups: import unpack_from directly (from struct import unpack_from) rather than getting it from the struct module each time.
However, if there needs to be a lot of parsing logic that requires unpacking one number at a time and keeps you from unpacking a whole array of data in bulk, it doesn’t matter what you call to do it for you. You are probably going to need to do this whole inner loop in a language with less overhead, such as C.
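To make the options concrete, here is a minimal sketch, assuming Python 3 and a buffer of big-endian unsigned 32-bit integers like the one in the question. The helper names (read_one, read_one_precompiled, read_one_from_bytes, read_many) are made up for illustration. It shows the direct import, a precompiled struct.Struct, int.from_bytes() as an alternative to unpack, and a bulk unpack for the case where the layout allows it. I have not benchmarked these against each other, so measure with timeit on your own data before picking one.

```python
import struct
from struct import unpack_from  # bind the name once: no attribute lookup on `struct` per call

# struct.Struct parses the format string once, up front.
BE_UINT32 = struct.Struct('>I')


def read_one(buf, offset=0):
    """Unpack one big-endian uint32 using the directly imported function."""
    (value,) = unpack_from('>I', buf, offset)
    return value


def read_one_precompiled(buf, offset=0):
    """Unpack one big-endian uint32 using the precompiled Struct object."""
    (value,) = BE_UINT32.unpack_from(buf, offset)
    return value


def read_one_from_bytes(buf, offset=0):
    """Alternative to unpack (Python 3.2+ only): int.from_bytes on a 4-byte slice."""
    return int.from_bytes(buf[offset:offset + 4], 'big')


def read_many(buf, count, offset=0):
    """Unpack `count` consecutive big-endian uint32 values in a single call."""
    return struct.unpack_from('>%dI' % count, buf, offset)


# Hypothetical sample data: four big-endian uint32 values back to back.
network_stream = struct.pack('>4I', 0x12345678, 1, 2, 0xFFFFFFFF)

first = read_one(network_stream)
second = read_one_precompiled(network_stream, 4)
third = read_one_from_bytes(network_stream, 8)
all_values = read_many(network_stream, 4)
```

The first three helpers still pay Python's per-call cost for every number, so the bulk version is the only one that avoids the overhead described above, and it only applies when you know the count ahead of time.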