I’m trying to speed up the following code, where given a list of strings


Asked: June 4, 2026 (2026-06-04T10:47:41+00:00)


I’m trying to speed up the following code. Given a list of strings str_list, I want to convert each string into a number (with unpack) and store that number at the correct position in the nested list data. The dimensions of data are roughly data[4][20][1024]. Unfortunately, this function runs very slowly. Here’s the code:

for abs_idx in range(nbr_elements):

    # get string
    this_element = str_list[abs_idx]

    # convert into number
    this_element = unpack('d', this_element)[0]

    # calculate the buffer number
    buffer_nbr = abs_idx // NBR_DATA_POINTS_PER_BUFFER_INT

    # calculate the position inside the buffer
    index_in_buffer = abs_idx % NBR_DATA_POINTS_PER_BUFFER_INT

    # write data into correct position
    data[file_idx][buffer_nbr][index_in_buffer] = this_element

I also tried the following alternative solution, which is even slower:

# convert each string into a number
unpacked_values = [unpack('d', str_list[j])[0] for j in range(nbr_elements)]
for abs_idx in range(nbr_elements):

    # calculate the buffer number
    buffer_nbr = abs_idx // NBR_DATA_POINTS_PER_BUFFER_INT

    # calculate the position inside the buffer
    index_in_buffer = abs_idx % NBR_DATA_POINTS_PER_BUFFER_INT

    # write data into correct position
    data[file_idx][buffer_nbr][index_in_buffer] = unpacked_values[abs_idx]

To my surprise, the next implementation is the slowest (I expected it to be the fastest):

# convert each string into a number
unpacked_values = [unpack('d', str_list[j])[0] for j in range(nbr_elements)]

# calculate all buffer numbers at once
buffer_ids = np.arange(nbr_elements) // NBR_DATA_POINTS_PER_BUFFER_INT

# calculate all positions inside the buffer at once
index_in_buffer_id = np.arange(nbr_elements) % NBR_DATA_POINTS_PER_BUFFER_INT

for abs_idx in range(nbr_elements):
    data[file_idx][buffer_ids[abs_idx]][index_in_buffer_id[abs_idx]] = unpacked_values[abs_idx]

Why are the successive implementations performing worse? Where are the individual bottlenecks? And how can I speed up my initial code?

EDIT: According to my profiling tests, two steps are the bottleneck: running unpack and assigning the value to data. However, I don’t know how to speed up these steps.

EDIT2: I need to use unpack because my strings are in hex.

EDIT3: values = unpack("d" * n, "".join(str_list)) solves the problem with unpack being slow. Still, the assignment to data with the triple (original) or double (modified) nested loop eats up 50% of the time. Is there a way to reduce this time?
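The single-call unpack from EDIT3 can be checked against the per-element version with a small sketch. It assumes Python 3 and that every element of str_list is an 8-byte bytes object holding one native double; the sample values are made up for illustration. The repeat-count format "4d" is equivalent to "d" * 4.

```python
from struct import pack, unpack

# Assumption: each element is an 8-byte bytes object holding one native double.
# The sample values below are made up for illustration.
str_list = [pack("d", x) for x in (0.5, 1.25, -3.0, 42.0)]
n = len(str_list)

# One unpack call per element, as in the original loop.
one_by_one = [unpack("d", s)[0] for s in str_list]

# One unpack call for the whole list: join the bytes, then read n doubles.
all_at_once = unpack("%dd" % n, b"".join(str_list))

# Both approaches should yield the same numbers.
print(list(all_at_once) == one_by_one)
```

The joined version replaces n function calls and n one-element tuples with a single call, which is the likely source of the speedup reported in EDIT3.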


1 Answer

Editorial Team · Answer 1 · 2026-06-04T10:47:44+00:00

Some optimizations:

Unpack all strings at once
Look up data[file_idx] once, before the loop

Try it:

n = len(str_list)
values = unpack("d" * n, "".join(str_list))

a = data[file_idx]

# Just to shorten this code sample
q = NBR_DATA_POINTS_PER_BUFFER_INT

for i in xrange(n):
    a[i // q][i % q] = values[i]

By the way, did you profile which part of the code takes the most time?
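One way to answer that question is to time the two stages separately. The following is a minimal sketch with timeit, assuming Python 3; Q stands in for NBR_DATA_POINTS_PER_BUFFER_INT, and NBR_BUFFERS, the test data, and the stage functions are hypothetical names introduced here.

```python
import timeit
from struct import pack, unpack

Q = 1024           # stands in for NBR_DATA_POINTS_PER_BUFFER_INT
NBR_BUFFERS = 20   # hypothetical, matches the data[4][20][1024] shape
n = Q * NBR_BUFFERS

# Hypothetical test data: n packed native doubles, joined once up front.
str_list = [pack("d", float(i)) for i in range(n)]
joined = b"".join(str_list)
values = unpack("%dd" % n, joined)

# Stand-in for data[file_idx]: a list of NBR_BUFFERS lists of length Q.
buffers = [[0.0] * Q for _ in range(NBR_BUFFERS)]


def unpack_stage():
    """Convert the joined bytes into a tuple of n floats."""
    return unpack("%dd" % n, joined)


def assign_stage():
    """Write each value to its (buffer, position) slot, one at a time."""
    for i in range(n):
        buffers[i // Q][i % Q] = values[i]


# Time each stage on its own to see where the time goes.
for stage in (unpack_stage, assign_stage):
    seconds = timeit.timeit(stage, number=20)
    print("%-13s %.4f s for 20 runs" % (stage.__name__, seconds))
```

The absolute numbers depend on the machine and interpreter, so only the ratio between the two stages is meaningful.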

UPDATE:

n = len(str_list)
values = unpack("d" * n, "".join(str_list))

# Just to shorten this code sample
q = NBR_DATA_POINTS_PER_BUFFER_INT

data[file_idx] = [values[i:i+q] for i in xrange(0, n, q)]
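Note that values is a tuple, so each slice values[i:i+q] is a tuple as well; wrap it in list(...) if the buffers must stay mutable. Since the question already uses NumPy, another option (not part of the answer above) is to skip unpack entirely and let np.frombuffer interpret the joined bytes. This is a sketch assuming Python 3, native-endian 8-byte doubles, and an element count that is a multiple of the buffer length; Q is a small stand-in for NBR_DATA_POINTS_PER_BUFFER_INT.

```python
import numpy as np
from struct import pack

Q = 4  # small stand-in for NBR_DATA_POINTS_PER_BUFFER_INT

# Hypothetical input: 3 * Q packed native doubles with values 0.0 .. 11.0.
str_list = [pack("d", float(i)) for i in range(3 * Q)]

# Interpret the joined bytes directly as float64 values, without a Python loop.
flat = np.frombuffer(b"".join(str_list), dtype=np.float64)

# View the flat array as rows of Q values, one row per buffer.
# This requires len(str_list) to be a multiple of Q.
buffers = flat.reshape(-1, Q)

print(buffers.shape)
print(buffers[2, 1])
```

An array created by np.frombuffer from a bytes object is read-only, so call buffers.copy() if the values need to be modified later.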
