Background I have two lists, the first is items which contains around 250 tuples,

Editorial Team

Asked: June 6, 2026 (2026-06-06T08:10:21+00:00)

Background

I have two lists. The first, items, contains around 250 tuples, each with 3 elements:

(path_to_a_file, size_in_bytes, modified_time)

The second list, result, contains up to 250 elements. It is the result of a database query that looks up rows based on the paths in the items list, so the number of elements in result depends on whether those files are already in the database.

Each element in result is a row object returned from an SQLAlchemy query, with attributes for the row values (path, mtime and hash are the ones I’m interested in here).

What I’m trying to do is filter out all the elements in items that are in result with the same mtime (keeping track of how many were filtered and their total size), and build a new list of the items that either have a different mtime or don’t exist in result. Items with a different mtime need to be stored as (path, size, mtime_from_result, hash_from_result), and items that weren’t in the database as (path, size, mtime, None).

I hope I’m not making this too localised but I felt I needed to explain what I’m trying to accomplish to ask the question.

Problem

I want to try and make this loop as fast as possible but the most important part is making it work as expected.

Is it safe to remove items from the lists as I iterate over them? I noticed iterating forwards has a weird outcome but iterating backwards seems to be ok. Is there a better approach?

I’m removing items that I’ve matched up (i.path == j[0]) because I know the relationship is 1 to 1 and it’s not going to match again. By shrinking the lists I can iterate faster on the next pass and, more importantly, I’m left with all the unmatched items.

I can’t help feeling there’s a much nicer solution that I’m overlooking, perhaps with a list comprehension or generators.

send_items=[]
for i in result[::-1]:
    for j in items[::-1]:
        if i.path==j[0]:
            result.remove(i) #I think this remove is possibly pointless?
            items.remove(j)
            if i.mtime==j[2]:
                self.num_skipped+=1
                self.size_skipped+=j[1]
            else:
                send_items.append((j[0],j[1],i.mtime,i.hash))
            break
send_items.extend(((j[0],j[1],j[2],None) for j in items))


1 Answer

Editorial Team · Answer 1 · 2026-06-06T08:10:23+00:00

I’d do this as:

def get_send_items(items, results):
    send_items = []
    results_dict = {i.path:i for i in results}
    for p, s, m in items:
        result = results_dict.get(p)
        if result is None:
            send_items.append((p, s, m, None))
        elif result.mtime != m:
            send_items.append((p, s, result.mtime, result.hash))
    return send_items
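The question also asks for the number and total size of the skipped items, which the function above leaves out. A minimal sketch of one way to add that, assuming the rows only need path, mtime and hash attributes. The Row namedtuple is a hypothetical stand-in for the SQLAlchemy row objects, and get_send_items_with_stats is an illustrative name, not something from the original code:

```python
from collections import namedtuple

# Hypothetical stand-in for the SQLAlchemy row objects.
Row = namedtuple("Row", ["path", "mtime", "hash"])


def get_send_items_with_stats(items, results):
    """Return (send_items, num_skipped, size_skipped)."""
    # One pass over results builds a path -> row lookup table.
    results_by_path = {row.path: row for row in results}
    send_items = []
    num_skipped = 0
    size_skipped = 0
    for path, size, mtime in items:
        row = results_by_path.get(path)
        if row is None:
            # Not in the database yet: keep the local mtime, no hash.
            send_items.append((path, size, mtime, None))
        elif row.mtime != mtime:
            # In the database with a different mtime: use the stored values.
            send_items.append((path, size, row.mtime, row.hash))
        else:
            # Same mtime: skip it and record the count and size.
            num_skipped += 1
            size_skipped += size
    return send_items, num_skipped, size_skipped


# Made-up sample data for illustration.
items = [("a.txt", 10, 100), ("b.txt", 20, 200), ("c.txt", 30, 300)]
results = [Row("a.txt", 100, "h1"), Row("b.txt", 250, "h2")]
send_items, num_skipped, size_skipped = get_send_items_with_stats(items, results)
```

Note that the output follows the order of items, whereas the original loop appends the matched items first and the unmatched ones at the end. If that order matters, the two kinds could be collected in separate lists and concatenated.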

Here is my analysis of your solution (assuming both result and items are of length N):

result[::-1] creates a reversed copy of result, so calling result.remove(i) doesn’t affect the iteration, nor would it have anyway. You only loop over result once, so removing elements from it is pointless and only creates extra work.
If all you wanted was a copy without reversing, result[:] would do.
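Calling items.remove(j) doesn’t buy you much either. remove() takes O(N) time, since it searches for the element and then shifts everything after it. It runs at most once per matched pair (the break follows immediately), so the nested loops stay O(N^2) overall; the removals shorten later scans a little but don’t improve the complexity.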
By using O(N) extra memory (as in my solution) you can reduce the run time to O(N), provided you use a dictionary or a set, which have O(1) lookups on average.
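As for the "weird outcome" when iterating forwards, a small sketch with made-up data may help show what is probably happening: removing the current element shifts the rest of the list left, so the loop's internal index steps over the next element. Iterating over a copy (as the [::-1] slice does) avoids this.

```python
# Removing from the list being iterated over: every other element is skipped.
numbers = [1, 2, 3, 4]
visited_in_place = []
for n in numbers:
    visited_in_place.append(n)
    numbers.remove(n)  # shifts later elements left under the running index

leftover_in_place = numbers  # elements the loop never visited

# Iterating over a reversed copy: the copy is unaffected by the removals.
numbers = [1, 2, 3, 4]
visited_copy = []
for n in numbers[::-1]:
    visited_copy.append(n)
    numbers.remove(n)

leftover_copy = numbers
```

In the first loop only 1 and 3 are visited and 2 and 4 are left behind; in the second every element is visited and the list ends up empty.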
