I’m trying to speed up the following code. Given a list of strings str_list, I convert each string into a number (with unpack) and assign that number to the correct position in the nested list data. The dimensions of data are roughly data[4][20][1024]. Unfortunately, this function runs very slowly. Here’s the code:
from struct import unpack
for abs_idx in range(nbr_elements):
    # get string
    this_element = str_list[abs_idx]
    # convert into number
    this_element = unpack('d', this_element)[0]
    # calculate the buffer number (integer division)
    buffer_nbr = abs_idx // NBR_DATA_POINTS_PER_BUFFER_INT
    # calculate the position inside the buffer
    index_in_buffer = abs_idx % NBR_DATA_POINTS_PER_BUFFER_INT
    # write data into correct position
    data[file_idx][buffer_nbr][index_in_buffer] = this_element
I also tried the following alternative solution, which is even slower:
# convert each string into a number
unpacked_values = [unpack('d', str_list[j])[0] for j in range(nbr_elements)]
for abs_idx in range(nbr_elements):
    # calculate the buffer number (integer division)
    buffer_nbr = abs_idx // NBR_DATA_POINTS_PER_BUFFER_INT
    # calculate the position inside the buffer
    index_in_buffer = abs_idx % NBR_DATA_POINTS_PER_BUFFER_INT
    # write data into correct position
    data[file_idx][buffer_nbr][index_in_buffer] = unpacked_values[abs_idx]
To my surprise, the next implementation is the slowest (I expected it to be the fastest):
import numpy as np
# convert each string into a number
unpacked_values = [unpack('d', str_list[j])[0] for j in range(nbr_elements)]
# calculate all buffer numbers at once (integer division)
buffer_ids = np.arange(nbr_elements) // NBR_DATA_POINTS_PER_BUFFER_INT
# calculate all positions inside the buffer at once
index_in_buffer_id = np.arange(nbr_elements) % NBR_DATA_POINTS_PER_BUFFER_INT
for abs_idx in range(nbr_elements):
    # write data into correct position
    data[file_idx][buffer_ids[abs_idx]][index_in_buffer_id[abs_idx]] = unpacked_values[abs_idx]
Why are the successive implementations performing worse? Where are the individual bottlenecks? And how can I speed up my initial code?
EDIT: from my profiling tests, the following two steps are the bottleneck: running unpack and assigning the value to data. I don’t know how to speed up these steps, though.
EDIT2: I need to use unpack because my strings are in hex.
EDIT3: values = unpack("d" * n, "".join(str_list)) solves the problem with unpack being slow. Still, the assignment to data with the triple (original) or double (modified) nested loop eats up 50% of the time. Is there a way to reduce this time?
Some optimizations:
Try it:
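A minimal sketch of loop-level optimizations, assuming Python 3, that every entry of str_list is an 8-byte bytes object, and that each buffer in data[file_idx] is a pre-allocated list. The function name fill_buffers and the buffer length of 1024 are assumptions for illustration. It unpacks everything in one call (as in EDIT3), looks up data[file_idx] once, and replaces the per-element division and modulo with one slice assignment per buffer:

```python
from struct import Struct

NBR_DATA_POINTS_PER_BUFFER_INT = 1024  # assumed buffer length, per the question


def fill_buffers(data, file_idx, str_list):
    """Unpack all 8-byte strings and store them buffer by buffer (hypothetical helper)."""
    n = len(str_list)
    # One unpack call for all values instead of one call per element.
    values = Struct("%dd" % n).unpack(b"".join(str_list))
    # Look up the outer list once instead of on every iteration.
    file_data = data[file_idx]
    size = NBR_DATA_POINTS_PER_BUFFER_INT
    # Slice assignment copies a whole buffer in one operation, so no
    # per-element division or modulo is needed.
    for buffer_nbr, start in enumerate(range(0, n, size)):
        chunk = values[start:start + size]
        file_data[buffer_nbr][:len(chunk)] = chunk
    return data
```

With data[4][20][1024] this runs the Python-level loop at most 20 times per file rather than once per element.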
Btw, did you profile what part of the code takes the most time?
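If not, here is one way to check, sketched with the standard-library cProfile and pstats modules. The helper profile_call and the sample function unpack_one_by_one are hypothetical names for illustration; replace the sample function with the real loop:

```python
import cProfile
import io
import pstats
from struct import pack, unpack


def unpack_one_by_one(str_list):
    """Stand-in for the code under test: one unpack call per element."""
    return [unpack("d", s)[0] for s in str_list]


def profile_call(func, *args):
    """Run func under cProfile and return (result, report text)."""
    profiler = cProfile.Profile()
    result = profiler.runcall(func, *args)
    stream = io.StringIO()
    stats = pstats.Stats(profiler, stream=stream)
    # Show the five entries with the largest cumulative time.
    stats.sort_stats("cumulative").print_stats(5)
    return result, stream.getvalue()


sample = [pack("d", float(i)) for i in range(10000)]
values, report = profile_call(unpack_one_by_one, sample)
print(report)
```

The report lists call counts and time per function, which shows whether unpack or the assignment dominates.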
UPDATE:
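One way to remove the assignment loop entirely, sketched under the assumption that data can be a NumPy array of shape (4, 20, 1024) instead of a nested list, and that every string holds one native-order 8-byte double. The name load_file and the constants NBR_FILES and NBR_BUFFERS are assumptions for illustration:

```python
import numpy as np

NBR_FILES = 4                           # assumed, from data[4][20][1024]
NBR_BUFFERS = 20                        # assumed
NBR_DATA_POINTS_PER_BUFFER_INT = 1024   # assumed


def load_file(data, file_idx, str_list):
    """Decode all 8-byte strings at once and copy them into data[file_idx]."""
    # Interpret the concatenated bytes as native-order float64 values,
    # which matches struct's 'd' format.
    values = np.frombuffer(b"".join(str_list), dtype=np.float64)
    # For a C-contiguous array, reshape(-1) returns a flat view, so the
    # copy below writes straight into data[file_idx].
    flat = data[file_idx].reshape(-1)
    flat[:values.size] = values
    return data


data = np.zeros((NBR_FILES, NBR_BUFFERS, NBR_DATA_POINTS_PER_BUFFER_INT))
```

Because consecutive elements fill consecutive positions, the buffer number and the index inside the buffer follow from the array layout and never have to be computed explicitly.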