What is an efficient way to initialize and access elements of a large array in Python?
I want to create an array in Python with 100 million entries, unsigned 4-byte integers, initialized to zero. I want fast array access, preferably with contiguous memory.
Strangely, NumPy arrays seem to be performing very slow. Are there alternatives I can try?
There is the array.array module, but I don’t see a method to efficiently allocate a block of 100 million entries.
Responses to comments:
- I cannot use a sparse array. It will be too slow for this algorithm because the array becomes dense very quickly.
- I know Python is interpreted, but surely there is a way to do fast array operations?
- I did some profiling, and I get about 160K array accesses (looking up or updating an element by index) per second with NumPy. This seems very slow.
I have done some profiling, and the results are completely counterintuitive.
For simple array access operations, numpy and array.array are 10x slower than native Python arrays.
Note that for array access, I am doing operations of the form:
Profiles:
[0] * 20000000
numpy.zeros(shape=(20000000,), dtype=numpy.int32)
array.array(‘L’, [0] * 20000000)
array.array(‘L’, (0 for i in range(20000000)))