I have a master file that is constantly updated, and a new file that is created every minute. I want to compare each new file against the existing master file. So far I’ve got:
with open("jobs") as a:
    new = a.readlines()
count = 0
for item in new:
    new[count] = new[count].split(",")
    count += 1
This will allow me to compare the first index ([0]) of each line in my master file. Now at this point I start to confuse myself. I’m guessing it would be something along the lines of:
counter = 0
for item in new:
    if new[counter][0] not in master:
        end = open("end", "a")
        end.write(str(new[counter]) + "\n")
        counter += 1
        end.close()
    else:
        # REPLACE LINES THAT ALREADY EXIST IN MASTER FILE WITH NEW LINE
The IDs won’t necessarily be in the same order every time the new file comes in, and the new file may contain more entries than the master file at some point.
If I haven’t made sense or missed some information out then please let me know and I’ll try and clarify. Thanks.
Sounds like a csv problem to me. Unfortunately, it is not clear from your question whether you want to modify the master file itself, an out-file, or both.
This does the second: it takes a master file and an update file, both in CSV format, and prints the merged result, unsorted, to an out-file. If this is not what you want, or if your data is comma-separated but has no field names on top, change it as you need; that should be easy enough.
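A minimal sketch of such a merge, using the standard `csv` module. It assumes both files start with the same header row and that the ID column is called `id`; that column name and the function name `merge_csv` are made up for illustration, so adjust them to your data:

```python
import csv

def merge_csv(master_path, update_path, out_path, key="id"):
    """Merge update rows into master rows; rows sharing a key are replaced."""
    # Load the master file into a dict keyed on the ID column.
    with open(master_path, newline="") as f:
        reader = csv.DictReader(f)
        fieldnames = reader.fieldnames
        rows = {row[key]: row for row in reader}

    # Apply the update file: unknown IDs are added, known IDs are overwritten.
    with open(update_path, newline="") as f:
        for row in csv.DictReader(f):
            rows[row[key]] = row

    # Write the merged rows, with the master's header, to the out-file.
    with open(out_path, "w", newline="") as f:
        writer = csv.DictWriter(f, fieldnames=fieldnames)
        writer.writeheader()
        writer.writerows(rows.values())
```

It would be called as `merge_csv("master.csv", "update.csv", "out.csv")`.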
with master.csv as such:
and update.csv as such:
it outputs to out.csv:
Note that the order is not preserved (it is not clear from the question whether that is necessary). But it is fast and clean.
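If the order does matter, or the files have no field names on top, something along these lines might do. It is only a sketch: it assumes the ID is the first column of every line and relies on dicts keeping insertion order (Python 3.7+). The function name `merge_rows` is made up for illustration.

```python
import csv

def merge_rows(master_path, update_path, out_path):
    """Merge headerless CSV files keyed on the first column, keeping master order."""
    merged = {}  # insertion-ordered in Python 3.7+
    # Read the master first, then the update, so update rows win on equal IDs.
    for path in (master_path, update_path):
        with open(path, newline="") as f:
            for row in csv.reader(f):
                if row:  # skip blank lines
                    # An existing ID keeps its position; a new ID goes to the end.
                    merged[row[0]] = row
    with open(out_path, "w", newline="") as f:
        csv.writer(f).writerows(merged.values())
```

Lines already in the master keep their position and are replaced in place, while IDs that only appear in the update are appended at the end.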