I’m trying to set something up where one thread writes a list of work and another thread reads the list and works from it. The list can be very large, so to avoid holding it all in memory I want to write it to a file (or use any other way of saving memory, perhaps generators?).
I put together a little runnable example with a sleep in the writer so that the reader can catch up. I’m wondering how I can get the reader to not stop when it “overtakes” the writer. I looked at using .seek and .tell but I got weird behaviour and I’m not sure that’s the right route.
As another question, is this at all a sensible idea? Maybe there’s a much more elegant way I can queue up a list of strings without using loads of memory.
import threading
import time


class Writer(threading.Thread):
    lock = threading.Lock()

    def __init__(self, file_path, size):
        threading.Thread.__init__(self)
        self.file_path = file_path
        self.size = size
        self.i = 0

    def how_many(self):
        with self.lock:
            print "Reader starting, writer is on", self.i

    def run(self):
        f = open(self.file_path, "w")
        for i in xrange(self.size):
            with self.lock:
                self.i = i
            if i % 1000 == 0:
                time.sleep(0.1)
            f.write("%s\n" % i)
        f.close()


class Reader(threading.Thread):
    def __init__(self, file_path):
        threading.Thread.__init__(self)
        self.file_path = file_path

    def run(self):
        f = open(self.file_path, "r")
        line = 0
        for line in f:
            pass
        print "Reader got to: %s" % line.strip()


if __name__ == "__main__":
    a = Writer("testfile", 2000000)
    b = Reader("testfile")
    a.start()
    time.sleep(1)
    a.how_many()
    b.start()
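
For reference, one way to keep the reader from stopping when it overtakes the writer is to treat end-of-file as "nothing new yet" rather than "finished", and to stop only once the writer has signalled that it is done. Below is a minimal sketch of that idea, written for Python 3. The names `follow`, `write_numbers`, `demo`, the `done` event and the `poll_interval` parameter are all made up for this illustration, and it assumes the writer flushes its output and sets `done` only after closing the file.

```python
import os
import tempfile
import threading
import time


def follow(path, done, poll_interval=0.05):
    """Yield complete lines from path, waiting for more until done is set."""
    with open(path, "r") as f:
        pending = ""  # holds a partial line if the writer is mid-write
        while True:
            # Sample the flag BEFORE reading, so lines written just before
            # the writer finished are not missed.
            finished = done.is_set()
            chunk = f.readline()
            if chunk:
                pending += chunk
                if pending.endswith("\n"):
                    yield pending.rstrip("\n")
                    pending = ""
            elif finished:
                if pending:
                    yield pending  # last line had no trailing newline
                return
            else:
                time.sleep(poll_interval)  # reader overtook the writer: wait


def write_numbers(path, count, done):
    """Hypothetical writer: one number per line, pausing now and then."""
    with open(path, "w") as f:
        for i in range(count):
            f.write("%d\n" % i)
            if i % 1000 == 0:
                f.flush()
                time.sleep(0.001)
    done.set()  # set only after the file has been closed


def demo(count=5000):
    """Run writer and reader together; return (lines read, last line)."""
    fd, path = tempfile.mkstemp()
    os.close(fd)
    done = threading.Event()
    writer = threading.Thread(target=write_numbers, args=(path, count, done))
    writer.start()
    seen, last = 0, None
    for line in follow(path, done):
        seen, last = seen + 1, line
    writer.join()
    os.remove(path)
    return seen, last


if __name__ == "__main__":
    print(demo())
```

The reader only ever holds one line in memory, but the file itself keeps growing for as long as the writer runs, so this trades memory for disk space rather than bounding both.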
I did solve this, using a buffered file queue where the queue is spread out between memory and a file. Items are put into a Queue, but if the number of items exceeds the specified queue size, any overflow is stored on file to preserve memory and is retrieved from the queue just the same. If anyone is looking to do something similar, I put it on GitHub here.
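
To give a rough idea of the approach, here is a simplified sketch in Python 3. It is not the code from the repository: the class name `SpillQueue`, the `max_in_memory` parameter and the internal layout are invented for this illustration, and it assumes the items are strings that contain no newline characters.

```python
import collections
import tempfile
import threading


class SpillQueue:
    """FIFO queue of strings keeping at most max_in_memory items in RAM.

    Anything beyond that is appended to a temporary file and pulled back
    into memory, in order, as items are consumed.
    """

    def __init__(self, max_in_memory):
        if max_in_memory < 1:
            raise ValueError("max_in_memory must be at least 1")
        self._max = max_in_memory
        self._memory = collections.deque()
        self._file = tempfile.TemporaryFile()  # binary read/write mode
        self._read_pos = 0   # offset of the next unread line in the file
        self._on_disk = 0    # number of items currently stored in the file
        self._cond = threading.Condition()

    def __len__(self):
        with self._cond:
            return len(self._memory) + self._on_disk

    def put(self, item):
        with self._cond:
            # Once anything is on disk, new items must follow it there,
            # otherwise FIFO order would be broken.
            if self._on_disk == 0 and len(self._memory) < self._max:
                self._memory.append(item)
            else:
                self._file.seek(0, 2)  # jump to end of file
                self._file.write(item.encode("utf-8") + b"\n")
                self._on_disk += 1
            self._cond.notify()

    def get(self):
        """Remove and return the oldest item, blocking while empty."""
        with self._cond:
            while not self._memory:
                self._cond.wait()
            item = self._memory.popleft()
            self._refill()
            return item

    def _refill(self):
        # Move the oldest on-disk items back into the freed memory slots.
        while self._on_disk and len(self._memory) < self._max:
            self._file.seek(self._read_pos)
            line = self._file.readline()
            self._read_pos = self._file.tell()
            self._memory.append(line[:-1].decode("utf-8"))
            self._on_disk -= 1
        if self._on_disk == 0:
            # Backlog drained: reset the file so it does not grow forever.
            self._file.seek(0)
            self._file.truncate()
            self._read_pos = 0


if __name__ == "__main__":
    q = SpillQueue(max_in_memory=3)
    for i in range(10):
        q.put("job-%d" % i)
    print([q.get() for _ in range(10)])
```

In this sketch the file is only reset once the backlog on disk has been fully drained, so a consumer that never quite catches up would still see the file grow; a real implementation would probably want to rotate or compact it.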