I'm using Python and NumPy to compare two arrays of equal shape holding (x, y, z) coordinates, in order to match their rows. They look like this:
coordsCFS
array([[ 0.02 , 0.02 , 0. ],
[ 0.03 , 0.02 , 0. ],
[ 0.02 , 0.025 , 0. ],
...,
[ 0.02958333, 0.029375 , 0. ],
[ 0.02958333, 0.0290625 , 0. ],
[ 0.02958333, 0.0296875 , 0. ]])
and
coordsRMED
array([[ 0.02 , 0.02 , 0. ],
[ 0.02083333, 0.02 , 0. ],
[ 0.02083333, 0.020625 , 0. ],
...,
[ 0.03 , 0.0296875 , 0. ],
[ 0.02958333, 0.03 , 0. ],
[ 0.02958333, 0.0296875 , 0. ]])
The data are read from two hdf5 files with h5py.
For the comparison I use numpy.allclose, which tests for "almost equality", because the coordinates do not match exactly at Python's regular floating-point precision. That is why I used for loops; with exact matches, numpy.where would have worked. I usually try to avoid for loops, but in this context I couldn't figure out how. So I came up with this surprisingly slow snippet:
mapList = []
for cfsXYZ in coordsCFS:
    # print(cfsXYZ)
    indexMatch = 0
    match = []
    for asterXYZ in coordsRMED:
        if numpy.allclose(asterXYZ, cfsXYZ):
            match.append(indexMatch)
            # print("Found match at index " + str(indexMatch))
            # print(asterXYZ)
        indexMatch += 1
    # check: must only find one match.
    if len(match) != 1:
        print("ERROR matching")
        print(match)
        print(cfsXYZ)
        return 1
    # save to list
    mapList.append(match[0])
if len(mapList) != coordsRMED.shape[0]:
    print("ERROR: matching consistency check")
    print(mapList)
    return 1
This is very slow for my test sample size (800 rows). I plan to compare much larger sets. I could remove the consistency check and use break in the inner for loop for some speed benefit. Is there still a better way?
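For reference, here is a minimal sketch of the early-exit variant mentioned above, assuming it is enough to take the first row that numpy.allclose accepts at its default tolerances. The function name match_rows and the small sample arrays are hypothetical, made up for illustration; they are not my real data.

```python
import numpy


def match_rows(coords_a, coords_b):
    """For each row of coords_a, return the index of the first close row in coords_b.

    Hypothetical helper for illustration; uses numpy.allclose default tolerances.
    """
    mapping = []
    for row_a in coords_a:
        for index_b, row_b in enumerate(coords_b):
            if numpy.allclose(row_b, row_a):
                mapping.append(index_b)
                break  # stop scanning coords_b once a match is found
        else:
            # the inner loop ended without break: nothing matched this row
            raise ValueError("no match for row %s" % row_a)
    return mapping


# Made-up sample: the same three points in a different order,
# with tiny floating-point noise so that exact equality would fail.
coords_a = numpy.array([[0.02, 0.02, 0.0],
                        [0.03, 0.02, 0.0],
                        [0.02, 0.025, 0.0]])
coords_b = coords_a[[2, 0, 1]] + 1e-12

mapping = match_rows(coords_a, coords_b)
print(mapping)
```

This still compares rows pairwise in Python, so it only roughly halves the work on average and drops the check that each row has exactly one match.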
You could get rid of the inner loop with something like this: