If I have a list (or array, dictionary, ...) in Python that could exceed the available memory address space (32-bit Python), what are the options and their relative speeds (other than not making a list that large)?
The list could exceed memory, but I have no way of knowing beforehand. Once it starts exceeding 75%, I would like to stop keeping the list (or at least the new items) in memory. Is there a way to convert to a file-based approach mid-stream?
What are the best (speed in and out) file storage options?
I just need to store a simple list of numbers. There is no need for random access to the Nth element, just append/pop-type operations.
If your “numbers” are simple enough (signed or unsigned integers of up to 4 bytes each, or floats of 4 or 8 bytes each), I recommend the standard library array module as the best way to keep a few million of them in memory (the “tip” of your “virtual array”), with a binary file (open for binary R/W) backing the rest of the structure on disk.
array.array has very fast fromfile and tofile methods to facilitate the moving of data back and forth. I.e., basically, assuming for example unsigned-long numbers, something like:
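A minimal sketch of that idea, written for Python 3. The class name SpillStack, the max_in_memory parameter, the use of tempfile.TemporaryFile as the backing file, and the choice to reload only half a tip at a time (to avoid bouncing data to and from disk when appends and pops alternate at the boundary) are all assumptions made for illustration, not a definitive implementation:

```python
import array
import os
import tempfile


class SpillStack:
    """Append/pop container: newest items in memory, older ones in a binary file."""

    def __init__(self, max_in_memory=1_000_000, typecode="L"):
        # max_in_memory is a hypothetical tuning knob: items held in RAM at once.
        self.max_in_memory = max_in_memory
        self.tip = array.array(typecode)          # in-memory "tip" of the structure
        self.backing = tempfile.TemporaryFile()   # opened in binary read/write mode

    def append(self, value):
        self.tip.append(value)
        if len(self.tip) >= self.max_in_memory:
            # Spill the whole tip to the end of the file, then empty it.
            self.backing.seek(0, os.SEEK_END)
            self.tip.tofile(self.backing)
            del self.tip[:]

    def pop(self):
        if not self.tip:
            self._reload()
        # Raises the usual IndexError if both the tip and the file are empty.
        return self.tip.pop()

    def _reload(self):
        # Read the newest items back from the end of the file and cut them off.
        size = self.backing.seek(0, os.SEEK_END)
        wanted = max(1, self.max_in_memory // 2)
        count = min(wanted, size // self.tip.itemsize)
        if count == 0:
            return
        start = size - count * self.tip.itemsize
        self.backing.seek(start)
        self.tip.fromfile(self.backing, count)
        self.backing.truncate(start)
```

For example, with max_in_memory=4, appending the integers 0 through 9 and then popping ten times returns them in reverse order, with the older values passing through the file along the way.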
Of course you can add other methods as necessary (e.g. keep track of the overall length, add extend, whatever), but if pop and append are indeed all you need this should serve.