I'm creating large files with my Python script (more than 1 GB each; there are actually 8 of them). Right after I create them, I have to start a process that will use those files.
The script looks like this:
import os
import subprocess
import threading
import time

# This is a more complex function, but it basically does this:
def use_file():
    subprocess.call(['C:\\use_file', 'C:\\foo.txt'])

f = open('C:\\foo.txt', 'wb')
for i in range(10000):
    f.write(one_MB_chunk)  # one_MB_chunk: 1 MB of bytes prepared elsewhere
f.flush()
os.fsync(f.fileno())
f.close()

time.sleep(5)  # With this line added it just works fine

t = threading.Thread(target=use_file)
t.start()
But the application use_file acts as if foo.txt is empty. There are some weird things going on:
- if I execute C:\use_file C:\foo.txt in a console (after the script has finished), I get correct results
- if I manually execute use_file() in another Python console, I get correct results
- C:\foo.txt is visible on disk right after open() is called, but its size remains 0 B until the end of the script
- if I add time.sleep(5), it just starts working as expected (or rather, as required)
I've already found:
- os.fsync(), but it doesn't seem to work (the result from use_file is as if C:\foo.txt were empty)
- Using buffering=(1<<20) (when opening the file) doesn't seem to work either
I’m more and more curious about this behaviour.
Questions:
- Does Python fork the close() operation into the background? Where is this documented?
- How can I work around this?
- Am I missing something?
- After adding sleep: is that a Windows/Python bug?
Notes (in case there's something wrong on the other side): the application use_file uses:
handle = CreateFile("foo.txt", GENERIC_READ, FILE_SHARE_READ, NULL,
                    OPEN_EXISTING, 0, NULL);
size = GetFileSize(handle, NULL);
And then processes size bytes from foo.txt.
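The consumer's logic above (open the file, read its size, then process that many bytes) can be mimicked from Python to check what a separate reader sees right after the writer has closed the file. This is only a sketch under assumptions: `write_file` and `size_seen_by_reader` are hypothetical helper names, `CHUNK` stands in for `one_MB_chunk`, and a temporary file is used in place of C:\foo.txt.

```python
import os
import tempfile

CHUNK = b"\0" * (1 << 20)  # 1 MiB of data; stands in for one_MB_chunk


def write_file(path, chunks):
    """Write `chunks` blocks, then flush, fsync and close before returning."""
    with open(path, "wb") as f:
        for _ in range(chunks):
            f.write(CHUNK)
        f.flush()             # push Python's own buffer to the OS
        os.fsync(f.fileno())  # ask the OS to commit its buffers to disk
    # Leaving the with-block has closed the file at this point.


def size_seen_by_reader(path):
    """Open the file through a fresh handle and report its size,
    roughly what the consumer does with CreateFile + GetFileSize."""
    with open(path, "rb") as f:
        return os.fstat(f.fileno()).st_size


if __name__ == "__main__":
    path = os.path.join(tempfile.mkdtemp(), "foo.txt")
    write_file(path, 8)
    print(size_seen_by_reader(path))
```

If this reports the full size immediately after `write_file` returns, the writer side is probably fine and the problem is more likely in how or when the consumer is launched.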
f.close() calls f.flush(), which sends the data to the OS. That doesn't necessarily write the data to disk, because the OS buffers it. As you rightly worked out, if you want to force the OS to write it to disk, you need to call os.fsync().
Have you considered just piping the data directly into use_file?
EDIT: you say that os.fsync() 'doesn't work'. To clarify, if you call it as in your script and then look at the file on disk, does it have data?
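The piping suggestion could look roughly like the sketch below. It assumes the consumer can read its input from stdin, which the use_file described in the question does not do (it calls GetFileSize on a named file), so that program would need changing. Here a child Python process that counts the bytes it receives stands in for use_file; `CONSUMER` and `pipe_to_consumer` are hypothetical names.

```python
import subprocess
import sys

CHUNK = b"\0" * (1 << 20)  # 1 MiB of data; stands in for one_MB_chunk

# Stand-in consumer: a child Python process that counts the bytes on stdin.
# In the real setup this would be use_file, adapted to read from stdin.
CONSUMER = [sys.executable, "-c",
            "import sys; print(len(sys.stdin.buffer.read()))"]


def pipe_to_consumer(chunks, command=CONSUMER):
    """Stream `chunks` blocks into the consumer's stdin; return its output."""
    proc = subprocess.Popen(command,
                            stdin=subprocess.PIPE,
                            stdout=subprocess.PIPE)
    for _ in range(chunks):
        proc.stdin.write(CHUNK)
    proc.stdin.close()           # signals end of input to the consumer
    output = proc.stdout.read()  # safe here: the consumer prints one short line
    proc.wait()
    return output.decode().strip()


if __name__ == "__main__":
    print(pipe_to_consumer(8))
```

With this approach no file exists at all, so there is nothing for the consumer to see as empty. A consumer that writes a lot of output while still reading its input would need its stdout drained concurrently to avoid a deadlock.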