I have a for loop that runs through a directory and processes the files there, but I’d like to process only a certain number of the files at one time. For example, I have a directory with 1000 files, but I can only process 250 of them a day, so the first time I run the script, it processes the first 250, then the next 250 on the following run, and so on.
First, I’m checking the file names against an XML file that records the names of files that have already been synced, so that I don’t process them a second time. Then I would like to process the next n files, where I have a variable synclimit = n.
I thought about adding an in range clause to the for loop like this:
tree = ET.parse("sync_list.xml")
root = tree.getroot()
synced = [elt.text for elt in root.findall('synced/sfile')]
for filename in os.listdir(filepath) and in range (0, synclimit) :
    if fnmatch.fnmatch(filename, '*.txt') and filename not in synced:
        filename = os.path.join(filepath, filename)
        result = plistlib.readPlist(filename)
But I’m pretty sure this will only check the first n files in the directory each time. Should I add the range clause to the if statement instead, like this:
tree = ET.parse("sync_list.xml")
root = tree.getroot()
synced = [elt.text for elt in root.findall('synced/sfile')]
for filename in os.listdir(filepath):
    if fnmatch.fnmatch(filename, '*.txt') and filename not in synced and in range (0, synclimit):
        filename = os.path.join(filepath, filename)
        result = plistlib.readPlist(filename)
Or is there an easier way to do this? Thank you.
Just keep a separate counter, increment it for every file you actually process, and stop once it has reached synclimit. Simple as that. There is no need to get too clever here.
Alternatively, since os.listdir() returns a list, you could filter it against your already-synced filenames (ideally held in a set, so the membership test is fast), then slice the result down to your maximum size.
Note that I just test for .endswith('.txt') instead of using fnmatch; for a simple pattern like '*.txt' the test comes down to the same thing.
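Both suggestions, the explicit counter and the filter-then-slice, can be sketched roughly as follows. This is a minimal illustration, not a drop-in replacement for your script: the function names next_batch_with_counter and next_batch_with_slice are made up for this example, filepath, synced and synclimit are taken from your question, and the sorted() call is my own assumption, added because os.listdir() returns names in arbitrary order and a stable order makes the batches predictable.

```python
import os


def next_batch_with_counter(filepath, synced, synclimit):
    """Collect up to synclimit unsynced .txt paths using an explicit counter."""
    batch = []
    count = 0
    # sorted() is an assumption of this sketch; os.listdir() order is arbitrary.
    for filename in sorted(os.listdir(filepath)):
        if count >= synclimit:
            break  # limit reached, stop looking at further files
        if filename.endswith('.txt') and filename not in synced:
            batch.append(os.path.join(filepath, filename))
            count += 1  # only files that are actually selected count
    return batch


def next_batch_with_slice(filepath, synced, synclimit):
    """Same selection: filter the listing first, then slice to the limit."""
    synced = set(synced)  # set membership tests are fast
    candidates = [
        filename
        for filename in sorted(os.listdir(filepath))
        if filename.endswith('.txt') and filename not in synced
    ]
    # Slicing never raises, even if fewer than synclimit files remain.
    return [os.path.join(filepath, f) for f in candidates[:synclimit]]
```

Either function returns the full paths of the next files to handle; you would then process each one and add its name to the synced list in your XML file, so the following run picks up where this one stopped.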