I currently process sections of a string like this: for (i, j) in huge_list_of_indices:

Asked: May 27, 2026

I currently process sections of a string like this:

for (i, j) in huge_list_of_indices:
    process(huge_text_block[i:j])

I want to avoid the overhead of generating these temporary substrings. Any ideas? Perhaps a wrapper that somehow uses index offsets? This is currently my bottleneck.

Note that process() comes from another Python module and expects a string as input.

Edit:

A few people doubt there is a problem. Here are some sample results:

import time
import string
text = string.letters * 1000

def timeit(fn):
    t1 = time.time()
    for i in range(len(text)):
        fn(i)
    t2 = time.time()
    print '%s took %0.3f ms' % (fn.func_name, (t2-t1) * 1000)

def test_1(i):
    return text[i:]

def test_2(i):
    return text[:]

def test_3(i):
    return text

timeit(test_1)
timeit(test_2)
timeit(test_3)

Output:

test_1 took 972.046 ms
test_2 took 47.620 ms
test_3 took 43.457 ms

1 Answer

Editorial Team · Answer 1 · 2026-05-27T05:21:16+00:00

I think what you are looking for are buffers.

A buffer "slices" an object that supports the buffer interface without copying its content; it essentially opens a "window" onto the sliced object's content. A more technical explanation is available here. An excerpt:

Python objects implemented in C can export a group of functions called the “buffer interface.” These functions can be used by an object to expose its data in a raw, byte-oriented format. Clients of the object can use the buffer interface to access the object data directly, without needing to copy it first.

In your case the code should look more or less like this:

>>> s = 'Hugely_long_string_not_to_be_copied'
>>> ij = [(0, 3), (6, 9), (12, 18)]
>>> for i, j in ij:
...     print buffer(s, i, j-i)  # Should become process(...)
Hug
_lo
string
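
One caveat: buffer() exists only in Python 2. If you are on Python 3, the closest equivalent I know of is memoryview, which also slices without copying, but it only works on bytes-like objects, not on str. The sketch below therefore assumes your text is held as bytes and that the consumer accepts bytes-like input; process() here is a hypothetical stand-in, not your real module.

```python
# Python 3 sketch: memoryview gives zero-copy slices of bytes-like objects.
# Assumptions: the text is available as bytes, and the consumer accepts
# bytes-like input. process() is a hypothetical stand-in for the real one.

def process(chunk):
    # Hypothetical consumer: only inspects the chunk, never copies it.
    return len(chunk)

data = b'Hugely_long_string_not_to_be_copied'
ij = [(0, 3), (6, 9), (12, 18)]

view = memoryview(data)               # wraps the bytes, no copy
windows = [view[i:j] for i, j in ij]  # each slice is another view, still no copy

lengths = [process(w) for w in windows]

# Converting a window back to bytes (or decoding it to str) does copy,
# so do that only where the consumer really needs it.
first = bytes(windows[0])
```

If process() strictly requires a str, each window would have to be decoded first, and that brings the copy back, so in that case this approach probably would not help.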

HTH!
