I have a list of more than 200,000 objects, each one representing a file (holding only the full path name and date, not the file’s contents).
The program I am writing copies a subset of these files, depending on a user-provided date range. I first list all of the files in the source directory with the glob module, create an instance of my file-representation class for each one, and append that instance to a list, like so:
for f in glob.glob(srcdir + "/*.txt"):
    LOG_FILES.append(LogFile(f))
Now, to keep the copying of files quick and the block of code clean, I remove the LogFile objects that do not fit inside of the date range.
for i in xrange(0, len(LOG_FILES)):
    if LOG_FILES[i].DATE < from_date or LOG_FILES[i].DATE > to_date:
        del(LOG_FILES[i])
Afterwards, I can just copy the files that are left in the list:
for logfile in LOG_FILES:
    shutil.copy(logfile.PATH, destdir)
The problem is in the for i in xrange... loop: it raises an IndexError when i reaches 63792.
IndexError: list index out of range.
Any ideas?
EDIT Thank you very much for the quick responses! Now that I think about it, it was a silly oversight on my part. Again, thank you, everyone. 🙂
[EDIT] Oops, I forgot to invert the “<” and “>” and add an ‘equals’ sign.
This can replace the whole initialization of LOG_FILES. It’s a list comprehension; if you wish, you can make it a generator (which isn’t evaluated until it’s enumerated) by replacing the [ ] with ( ). That might be more efficient, depending on what you do with it.
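As a rough sketch of what such a comprehension could look like: the LogFile class below is a hypothetical stand-in (the question does not show how the real one gets its date, so parsing it from the file name is purely an assumption), and select_in_range is an illustrative helper name, not something from the original code.

```python
import datetime
import os


class LogFile(object):
    """Hypothetical stand-in for the question's class: a path plus a date."""

    def __init__(self, path):
        self.PATH = path
        # Assumption: the date is encoded in the file name, e.g. "2010-03-05.txt".
        stem = os.path.splitext(os.path.basename(path))[0]
        self.DATE = datetime.datetime.strptime(stem, "%Y-%m-%d").date()


def select_in_range(paths, from_date, to_date):
    """Build the filtered list in one pass; nothing is deleted afterwards."""
    candidates = (LogFile(p) for p in paths)  # lazy generator, one LogFile per path
    # Inclusive bounds: keep a file when from_date <= DATE <= to_date.
    return [lf for lf in candidates if from_date <= lf.DATE <= to_date]


# In the question, `paths` would come from glob.glob(srcdir + "/*.txt").
paths = ["logs/2010-01-01.txt", "logs/2010-02-15.txt", "logs/2010-03-30.txt"]
LOG_FILES = select_in_range(paths, datetime.date(2010, 2, 1), datetime.date(2010, 3, 30))
print([lf.PATH for lf in LOG_FILES])
```

Swapping the square brackets in select_in_range for parentheses would return a generator instead of a list, which avoids holding every LogFile in memory if the result is only looped over once.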
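You need to do this because editing a collection while enumerating it doesn’t work: xrange(0, len(LOG_FILES)) is computed once from the original length, but each del shortens the list, so i eventually runs past the end (see the far more eloquent answers above).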
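To see the failure on a small scale, here is a minimal sketch with a six-element list of integers standing in for LOG_FILES. The function names delete_by_index, filter_in_place and is_even are made up for this illustration, and range is used so the snippet also runs on Python 3 (it plays the role of xrange here).

```python
def delete_by_index(items, keep):
    """Mirrors the question's loop: the index range is fixed before any deletion."""
    for i in range(0, len(items)):   # xrange(...) in Python 2
        if not keep(items[i]):       # IndexError once i is past the shrunken list
            del items[i]             # the list shrinks and later items shift left


def filter_in_place(items, keep):
    """Safe variant: collect the survivors first, then overwrite the same list."""
    items[:] = [x for x in items if keep(x)]


def is_even(n):
    return n % 2 == 0


data = [1, 2, 3, 4, 5, 6]
try:
    delete_by_index(data, is_even)
except IndexError as exc:
    print("delete_by_index failed: %s" % exc)

data = [1, 2, 3, 4, 5, 6]
filter_in_place(data, is_even)
print(data)
```

Besides the IndexError, the index-based loop also skips the element that slides into position i after each deletion, so it can silently keep items it should have removed even when it does not crash.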
You can read the expression above like this:
“create a list (or enumerable) of the result of LogFile, when it’s handed ‘f’ for each f in ‘glob.glob(…)’ but only if the ‘if’ statement is true.”
See: The List Comprehension section of that link.