My program is not doing a great job. In a loop, data from each processor (list of tuple) are gathered into the master processor that needs to clean it by removing similar element.
I found a lot of interesting clue on internet and especially in this site about union of list. However, i have not managed to apply it to my problem.
My aim is to get rid of tuple whose its two last element are similar to another tuple in the list . for example:
list1=[[a,b,c],[d,e,f],[g,h,i]]
list2=[[b,b,c],[d,e,a],[k,h,i]]
the result should be:
final=[[a,b,c],[d,e,f],[g,h,i],[d,e,a]]
Right now I’m using loops and break but I’m hoping to make this process faster.
here is what my code looks like (result and temp are the lists I want to get union from)
on python2.6.
for k in xrange(len(temp)):
u=0
#index= next((j for j in xrange(lenres) if temp[k][1:3] == result[j][1:3]),None)
for j in xrange(len(result)):
if temp[k][1:3] == result[j][1:3]:
u=1
break
if u==0:
#if index is None:
result.append([temp[k][0],temp[k][1],temp[k][2]])
Thanks for your help
Herve
Below is our uniques function. It takes arguments l (list) and f (function), returns list with duplicates removed (in the same order). Duplicates are defined by: b is duplicate of a iff f(b) == f(a).
We define lastTwo as follows:
For your problem we use it as follows:
If the usecase you describe comes up a lot you may want to define
Identity is our default f but it can be anything (so long as it is defined for all elements of the list, since they can be of any type).
f can be sum of a list, odd elements of a list, prime factors of an integer, anything. (Just remember that if its injective theres no point! Add by constant, linear functions, etc will work no differently than identity bc its f(x) == f(y) w/ x != y that makes the difference)
You seemed concerned about speed, but my answer really focused on shortness/flexibility without significant (or any?) speed improvment. For bigger lists where speed is z concern, you want to take advantage of hashable checks and the fact that list1 and list2 are known to have no duplicates
OR allowing disordering
–Edit–
You should always be able to avoid un-pythonic looping like you post in your question. Here is your exact code with the loops changed:
result and temp are iterable, and for anything iterable you can put it directly in the for loop without eanges. If for some reason you explicitly need the index (this is not such a case, but I have one above) you can use enumerate.