I found dozens of examples of how to vectorize for loops in Python/NumPy. Unfortunately, I don’t see how to reduce the computation time of my simple for loop by vectorizing it. Is that even possible in this case?
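import numpy as np

time = np.zeros(185000)
lat1 = np.array(([48.78,47.45],[38.56,39.53],...)) # ~ 200000 rows
lat2 = np.array(([7.78,5.45],[7.56,5.53],...)) # same number of rows as time
# dtime (not shown) is indexed by row of lat1
for ii in range(len(time)):
    # rows of lat1 whose two columns both equal row ii of lat2
    # (element-wise & is required here; `and` raises a ValueError on arrays)
    pos = np.argwhere((lat1[:,0] == lat2[ii,0]) &
                      (lat1[:,1] == lat2[ii,1]))
    if pos.size:
        pos = int(pos)  # works only when exactly one row matches
        time[ii] = dtime[pos]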
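Probably the fastest way to find all matches is to sort both arrays and walk through them together. Sorting costs O(n log n) and the joint walk is a single linear pass, whereas the loop in the question rescans all of lat1 for every row of lat2.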
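As a minimal sketch of the sort-and-walk idea (the function name `match_sorted` is hypothetical, and `dtime` is assumed to hold one value per row of `lat1`), something along these lines should work. It assumes coordinates match exactly as floats, as in the question, and leaves unmatched rows at zero:

```python
import numpy as np

def match_sorted(lat1, lat2, dtime):
    """For each row of lat2, copy dtime from an equal row of lat1 (0 if none)."""
    # np.lexsort uses the LAST key as the primary key, so this orders rows
    # by column 0 first and column 1 second, matching tuple comparison.
    order1 = np.lexsort((lat1[:, 1], lat1[:, 0]))
    order2 = np.lexsort((lat2[:, 1], lat2[:, 0]))

    out = np.zeros(len(lat2))
    i = j = 0
    # Walk both sorted orders together, always advancing the smaller side.
    while i < len(order1) and j < len(order2):
        a = tuple(lat1[order1[i]])
        b = tuple(lat2[order2[j]])
        if a < b:
            i += 1
        elif a > b:
            j += 1
        else:
            out[order2[j]] = dtime[order1[i]]
            j += 1  # keep i in place so repeated rows in lat2 also match
    return out
```

With the question's arrays this would be called as `time = match_sorted(lat1, lat2, dtime)`. The walk itself is still a Python-level loop, but it visits each row at most once instead of scanning `lat1` once per row of `lat2`.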
This may not be perfectly pythonic (perhaps someone can think of a nicer implementation using generators or itertools?) but it is hard to imagine any method that relies on searching one point at a time beating this in speed.